Re: Cannot see all events in window apply() for big input
Posted by Till Rohrmann on Nov 08, 2016; 2:46pm
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Cannot-see-all-events-in-window-apply-for-big-input-tp9945p9986.html
Hi Sendoh,
Flink should never lose data unless an element arrives so late that it falls outside the allowed lateness. This holds regardless of the total data size.
The watermarks are indeed global and not bound to a specific input element or a group. So, for example, if you create the watermarks from the timestamp information of your events and you have the input event sequence (eventA, 01-01), (eventB, 02-01), (eventC, 01-02), then you would generate the watermark W(02-01) after the second event. The third event would then be a late element, and if its lateness exceeds the allowed lateness, it will be discarded.
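To make the example concrete, here is a small plain-Python simulation of that sequence. It is only a sketch and does not use the Flink API: the function name `classify`, the year 2016, and the reading of the timestamps as MM-DD are assumptions made for illustration. It also simplifies Flink's behaviour by comparing each element against a single global watermark instead of deciding per window.

```python
from datetime import date, timedelta


def classify(events, allowed_lateness=timedelta(0)):
    """Label each (name, timestamp) pair against a watermark that
    follows the largest timestamp seen so far (simplified model)."""
    watermark = None
    results = []
    for name, ts in events:
        if watermark is None or ts >= watermark:
            # Element is not behind the watermark: advance the watermark.
            watermark = ts
            results.append((name, "on time"))
        elif watermark - ts <= allowed_lateness:
            # Behind the watermark, but within the allowed lateness.
            results.append((name, "late, still accepted"))
        else:
            # Behind the watermark by more than the allowed lateness.
            results.append((name, "late, dropped"))
    return results


# The sequence from the example, timestamps read as MM-DD (assumed year 2016).
events = [
    ("eventA", date(2016, 1, 1)),
    ("eventB", date(2016, 2, 1)),
    ("eventC", date(2016, 1, 2)),
]

print(classify(events))
print(classify(events, allowed_lateness=timedelta(days=31)))
```

With no allowed lateness, eventC is dropped because the watermark has already advanced to 02-01. With an allowed lateness of 31 days (an arbitrary value chosen for the sketch), eventC is 30 days behind the watermark and is still accepted.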
You have to make sure that the events in your queue have monotonically increasing timestamps if you generate the watermarks from a timestamp field of the events.
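One way to check this before blaming Flink is to scan a sample of the queue for timestamps that go backwards. The following is a minimal plain-Python sketch, not Flink code; the helper name `find_out_of_order` is hypothetical, and it assumes the timestamps are comparable values (here zero-padded MM-DD strings, which sort correctly as text).

```python
def find_out_of_order(timestamps):
    """Return (index, timestamp) for every element that is older
    than the largest timestamp seen before it."""
    max_seen = None
    offenders = []
    for i, ts in enumerate(timestamps):
        if max_seen is not None and ts < max_seen:
            # This element would arrive behind a watermark derived from max_seen.
            offenders.append((i, ts))
        else:
            max_seen = ts
    return offenders


# Assumed sample: the MM-DD timestamps of three events in queue order.
sample = ["01-01", "02-01", "01-02"]
print(find_out_of_order(sample))
```

An empty result means the sample is in ascending order. Any entry in the result is an element that would be treated as late if the watermark simply follows the event timestamps.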
Cheers,
Till