[Flink] How to convert DataStream<Row> to DataSet or Table?

Richard Xin
I have a DataStream<Row>. Is there a way to convert it to a DataSet or Table so that I can sort it and persist it to a file?

Thanks a lot!

Re: [Flink] How to convert DataStream<Row> to DataSet or Table?

Stefan Richter
Hi,


Best,
Stefan

On 15.11.2017 at 20:33, Richard Xin <[hidden email]> wrote:

I have a DataStream<Row>. Is there a way to convert it to a DataSet or Table so that I can sort it and persist it to a file?

Thanks a lot!

Re: [Flink] How to convert DataStream<Row> to DataSet or Table?

Timo Walther
Hi Richard,

In general, it is difficult to sort a DataStream, because it is potentially never-ending. However, if you use Flink's event-time semantics with watermarks, which indicate that your stream is complete up to a certain point, you can sort it. The Table API will offer a sort option in 1.4 (https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#orderby--limit) based on that. The easiest way to implement a sort yourself is to buffer the records in state and sort them whenever you think it is reasonable to do so; you can use a ProcessFunction for that.
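
To make the buffer-and-sort idea more concrete, here is a minimal sketch in plain Java, deliberately without any Flink dependencies. The class `WatermarkSortBuffer` and its methods `add` and `advanceWatermark` are hypothetical names chosen for illustration, not Flink API. In a real job, `add` would roughly correspond to buffering a record in keyed state inside `ProcessFunction.processElement`, and `advanceWatermark` to the work done in `onTimer` once an event-time timer fires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Flink-free model of "buffer records, emit them sorted once the watermark passes". */
final class WatermarkSortBuffer<T> {
    // Buffered records grouped by event timestamp; TreeMap keeps the keys ordered.
    private final TreeMap<Long, List<T>> buffer = new TreeMap<>();
    // Highest watermark seen so far; everything at or below it is considered complete.
    private long currentWatermark = Long.MIN_VALUE;

    /** Buffers a record. Returns false (and drops it) if it arrives behind the watermark. */
    public boolean add(long timestamp, T record) {
        if (timestamp <= currentWatermark) {
            return false; // late record: its slice of the stream was already emitted
        }
        buffer.computeIfAbsent(timestamp, k -> new ArrayList<>()).add(record);
        return true;
    }

    /** Advances the watermark and returns all records with timestamp <= watermark, in order. */
    public List<T> advanceWatermark(long watermark) {
        currentWatermark = Math.max(currentWatermark, watermark);
        List<T> sorted = new ArrayList<>();
        // View of all buffered entries up to and including the watermark.
        Map<Long, List<T>> ready = buffer.headMap(currentWatermark, true);
        for (List<T> sameTimestamp : ready.values()) {
            sorted.addAll(sameTimestamp);
        }
        ready.clear(); // clearing the view also removes the entries from the buffer
        return sorted;
    }

    public static void main(String[] args) {
        WatermarkSortBuffer<String> sorter = new WatermarkSortBuffer<>();
        sorter.add(30L, "c");
        sorter.add(10L, "a");
        sorter.add(20L, "b");
        System.out.println(sorter.advanceWatermark(20L)); // prints [a, b]
        System.out.println(sorter.advanceWatermark(40L)); // prints [c]
    }
}
```

The sorted batches returned by `advanceWatermark` are what you would forward to a file sink. How to treat late records (dropped here) is a design choice that depends on your use case.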

I hope that helps.

Regards,
Timo


On 11/16/17 at 2:37 PM, Stefan Richter wrote:
Hi,


Best,
Stefan