Different watermarks on keyed stream

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Different watermarks on keyed stream

Björn Hedström
Hi,

I'm building a small application which reads CSV-files from Kafka and passes them to Flink. In Flink i parse these values which contains an embedded timestamp and assign watermarks with a BoundedOutOfOrdernessTimestampExtractor. Then i "transform" the stream into a KeyedStream by an ID embedded in the datavalue. These are later processed by CEP. However since the timestamps may differ depending on extracted ID I would like to have a seperate watermark for each ID in the keyed stream before they are processed in CEP. Is this possible and in that case how do i procceed? Any help would be appreciated.

Best,
Björn
Reply | Threaded
Open this post in threaded view
|

Re: Different watermarks on keyed stream

Till Rohrmann
Hi Björn,

unfortunately Flink does not support per key watermarks. Watermarks are always global.

One way to solve this problem would be to split your input data up into disjunct pieces where each piece only contains data for one key. You could do this either by creating new Kafka topics or by splitting the input stream via `split`. Then you can assign the watermarks based on these splits and then it should work.

Cheers,
Till

On Wed, Aug 9, 2017 at 9:57 AM, Björn Hedström <[hidden email]> wrote:
Hi,

I'm building a small application which reads CSV-files from Kafka and passes them to Flink. In Flink i parse these values which contains an embedded timestamp and assign watermarks with a BoundedOutOfOrdernessTimestampExtractor. Then i "transform" the stream into a KeyedStream by an ID embedded in the datavalue. These are later processed by CEP. However since the timestamps may differ depending on extracted ID I would like to have a seperate watermark for each ID in the keyed stream before they are processed in CEP. Is this possible and in that case how do i procceed? Any help would be appreciated.

Best,
Björn