Re: Watermarks per key

Posted by Fabian Hueske-2 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Watermarks-per-key-tp11628p11632.html

Hi Jordan,

It is not possible to generate watermarks per key. This feature has been requested a couple of times, but I think there are no plans to implement it.
As far as I understand, managing per-key watermarks would be quite expensive (maintaining one watermark for every key, purging the watermarks of expired keys, etc.), but Aljoscha (in CC) can share details about that.
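
To make the bookkeeping cost concrete, here is a minimal plain-Java sketch of what tracking watermarks per key might involve. It is not Flink code and not a planned design: the class PerKeyWatermarkTracker, its methods, and the maxOutOfOrderness and keyTtl parameters are all hypothetical names chosen for illustration. It only shows that every key needs its own state entry and that idle keys have to be purged explicitly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink): keeps one watermark per key. */
class PerKeyWatermarkTracker {

    /** State that has to be stored for every single key. */
    private static final class Entry {
        long watermark = Long.MIN_VALUE;
        long lastSeenProcessingTime;
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxOutOfOrderness; // assumed bounded lateness per key
    private final long keyTtl;            // assumed idle time before a key expires

    PerKeyWatermarkTracker(long maxOutOfOrderness, long keyTtl) {
        this.maxOutOfOrderness = maxOutOfOrderness;
        this.keyTtl = keyTtl;
    }

    /**
     * Records an element and advances the watermark of its key only.
     * Returns true if the element is behind that key's current watermark.
     */
    boolean observe(String key, long eventTime, long processingTime) {
        Entry e = entries.computeIfAbsent(key, k -> new Entry());
        e.lastSeenProcessingTime = processingTime;
        boolean late = eventTime < e.watermark;
        e.watermark = Math.max(e.watermark, eventTime - maxOutOfOrderness);
        return late;
    }

    /** Drops keys that have been idle longer than keyTtl; returns how many were removed. */
    int purgeExpired(long now) {
        int before = entries.size();
        entries.values().removeIf(e -> now - e.lastSeenProcessingTime > keyTtl);
        return before - entries.size();
    }

    int trackedKeys() {
        return entries.size();
    }
}
```

With a large key space, the map above grows with the number of active keys, and the purge has to run periodically, which is the kind of overhead mentioned above.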

Best,
Fabian

2017-02-15 2:02 GMT+01:00 Jordan Ganoff <[hidden email]>:
Hi,

I’m designing a streaming job whose elements need to be windowed by event time across a large set of keys. All elements are read from the same source. Event time progresses independently across keys. Is it possible to assign timestamps, and thus generate independent watermarks, per keyed stream, so that late-arriving elements can be handled per keyed stream?

And in general, what’s the best approach to designing a job that needs to process different keyed streams whose event times do not relate to each other? My current approach generates timestamps at the source but never generates watermarks, so no record is ever considered late. This has the unfortunate side effect that windows never close. As a result, each event time window relies on a custom trigger which fires and purges the window once a given amount of processing time has elapsed without any new records arriving.
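
The inactivity-based trigger described above can be sketched in plain Java as follows. This is a simplified simulation of the decision logic only, not the actual job and not Flink's Trigger API: the class InactivityFirePolicy, the window identifier strings, and the gap parameter are hypothetical. In a real Flink trigger the deadline would be kept in trigger state and driven by processing-time timers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical simulation: fire and purge a window after a processing-time gap with no new records. */
class InactivityFirePolicy {

    enum Result { CONTINUE, FIRE_AND_PURGE }

    private final long gap; // assumed inactivity gap, in processing-time units
    private final Map<String, Long> deadlines = new HashMap<>();

    InactivityFirePolicy(long gap) {
        this.gap = gap;
    }

    /** Every new record pushes the window's deadline further into the future. */
    Result onElement(String windowId, long now) {
        deadlines.put(windowId, now + gap);
        return Result.CONTINUE;
    }

    /** Called when a timer fires; stale timers are ignored because the deadline has moved. */
    Result onProcessingTime(String windowId, long now) {
        Long deadline = deadlines.get(windowId);
        if (deadline == null || now < deadline) {
            return Result.CONTINUE;
        }
        deadlines.remove(windowId); // purge the window's bookkeeping
        return Result.FIRE_AND_PURGE;
    }
}
```

For example, with a gap of 30, a record at time 0 followed by another at time 20 moves the deadline from 30 to 50, so a timer at 30 does nothing and the window fires at 50.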

Thanks,
Jordan