Conversion of Table (Blink/batch) to DataStream

Maciek Próchniak
Hello,

I'm playing around with the Table/SQL API (Flink 1.9/1.10) and I was
wondering how I can do the following:

1. read batch data (e.g. from files)

2. sort it using the Table/SQL SortOperator

3. perform further operations using the "normal" DataStream API (treating
my batch as a finite stream) - to reuse the code I have developed for
stream processing cases.
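
To make the goal of the three steps concrete, here is a minimal sketch in plain Java. It deliberately uses only the JDK and no Flink classes, so it says nothing about how Flink does this; `Event`, `GapDetector` and `run` are hypothetical names invented for illustration. The point it tries to show is why the sort has to happen before the per-element "streaming" logic is reused on a bounded input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

public class SortedBatchAsStream {

    /** Hypothetical record type standing in for one row read from a file. */
    public static final class Event {
        final String key;
        final long timestamp;

        public Event(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }
    }

    /**
     * Stands in for existing streaming logic (an assumption, not Flink code):
     * it sees one element at a time and is only meaningful if elements
     * arrive in timestamp order.
     */
    public static final class GapDetector implements Consumer<Event> {
        private final long maxGap;
        private boolean seenFirst = false;
        private long lastTimestamp;
        final List<String> alerts = new ArrayList<>();

        public GapDetector(long maxGap) {
            this.maxGap = maxGap;
        }

        @Override
        public void accept(Event e) {
            // Compare with the previous element only; no access to the full batch.
            if (seenFirst && e.timestamp - lastTimestamp > maxGap) {
                alerts.add("gap of " + (e.timestamp - lastTimestamp) + " before " + e.key);
            }
            seenFirst = true;
            lastTimestamp = e.timestamp;
        }
    }

    /** Steps 2 and 3: sort the bounded input, then feed it element by element. */
    public static List<String> run(List<Event> batch, long maxGap) {
        List<Event> sorted = new ArrayList<>(batch);          // step 1 result, copied
        sorted.sort(Comparator.comparingLong((Event e) -> e.timestamp)); // step 2
        GapDetector detector = new GapDetector(maxGap);
        sorted.forEach(detector);                              // step 3
        return detector.alerts;
    }

    public static void main(String[] args) {
        List<Event> batch = Arrays.asList(
                new Event("a", 10), new Event("c", 30), new Event("b", 12));
        System.out.println(run(batch, 5));
    }
}
```

In the real setup the sort would be done by the Blink batch planner and the per-element logic would be DataStream operators; the sketch only mirrors the shape of the pipeline.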


Now, to perform step 2, I understand I should use the Blink planner in
batch mode, but then - although there is a StreamExecutionEnvironment
underneath - there seems to be no easy (or at least documented ;)) way
of going from Table to DataStream.

toAppendStream/toRetractStream are restricted to stream mode, and if
I use stream mode I cannot easily use SortOperator.

Of course, I can write results to some external output like files, but
I'd like to avoid that...

Is there any nice way to do this? And if not - are there plans to make
it possible in the future?


thanks,

maciek


ps. the new Table/SQL stuff is really, really cool!

Re: Conversion of Table (Blink/batch) to DataStream

Jark Wu-3
Hi Maciek,

This will be supported in the future. 
Currently, you can create a `StreamTableEnvironmentImpl` yourself by calling its constructor directly (the constructor doesn't restrict batch mode). 
The SQL CLI does it the same way [1] (even though it's a hack).

Best,
Jark



Re: Conversion of Table (Blink/batch) to DataStream

Maciek Próchniak

Hi Jark,

thanks for the quick answer - I strongly suspected there was a hack like that somewhere - but couldn't find it easily in the maze of old and new Scala and Java APIs :D

For my current experiments it's OK, and I'm sure everything will be cleaned up in the next releases :)


best,

maciek


