sorting data into sink


sorting data into sink

robert
Does anyone know if this is a correct assumption?


DataStream<KeyedAvroRecord> sorted = stream.keyBy("partition");

Will this automatically send records with the same key to the same sink thread?


The behavior I am seeing is that a sink set up with multiple threads is seeing data from the same hour.

Are there any good examples of how to sort data so that each sink thread only gets the same type of data?

Thanks



Re: sorting data into sink

Fabian Hueske-2
Hi,

To be honest, I did not understand your requirements and what you are looking for.

stream.keyBy("partition").addSink(...) will partition the output on the "partition" attribute before handing it to the sink.
Hence, all records with the same "partition" value will be handled by the same parallel sink instance.
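
To illustrate what that guarantee does and does not cover, here is a minimal plain-Java sketch of hash-based routing. It is a simplified stand-in and not Flink's actual key-group assignment; the class name, the `sinkIndexFor` helper, and the hour-style key values are hypothetical and chosen only for illustration. It shows that a given key always maps to the same sink index, while one sink index may still receive several different keys whenever there are more distinct keys than parallel instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeyedRoutingSketch {

    // Simplified hash partitioning (assumption: NOT Flink's real algorithm).
    // The same key always yields the same index in [0, parallelism).
    static int sinkIndexFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Groups the incoming keys by the sink index they would be routed to.
    static Map<Integer, List<String>> route(List<String> keys, int parallelism) {
        Map<Integer, List<String>> bySink = new TreeMap<>();
        for (String key : keys) {
            bySink.computeIfAbsent(sinkIndexFor(key, parallelism), i -> new ArrayList<>())
                  .add(key);
        }
        return bySink;
    }

    public static void main(String[] args) {
        // Hypothetical "partition" values, one per hour, with some repeats.
        List<String> keys = Arrays.asList(
            "2018-03-13-18", "2018-03-13-19", "2018-03-13-20",
            "2018-03-13-18", "2018-03-13-21", "2018-03-13-19");

        // With 2 parallel sinks and 4 distinct keys, at least one sink
        // necessarily handles more than one distinct key.
        route(keys, 2).forEach((sink, routed) ->
            System.out.println("sink " + sink + " <- " + routed));
    }
}
```

If each sink instance must only ever write one kind of data at a time, the sink itself would have to separate records by key (for example, one output bucket per "partition" value), since the routing alone only guarantees that a key is not split across instances.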

Best, Fabian

2018-03-13 20:20 GMT+01:00 Telco Phone <[hidden email]>:
Does anyone know if this is a correct assumption?


DataStream<KeyedAvroRecord> sorted = stream.keyBy("partition");

Will this automatically send records with the same key to the same sink thread?


The behavior I am seeing is that a sink set up with multiple threads is seeing data from the same hour.

Are there any good examples of how to sort data so that each sink thread only gets the same type of data?

Thanks