Hi, I am having few issues with event time windowing. Here is my scenario, data is ingested from a kafka consumer and then keyed by user followed by a Tumbling event window for 10 seconds. The max lag tolerance limit is 1 second. I have the BoundedOutOfOrdernessGenerator that extends AssignerWithPeriodicWatermarks to assign watermarks. When the data is ingested even after receiving multiple messages per user the window never gets evicted. What am I missing here? .window(TumblingEventTimeWindows.of(Time.seconds(10))) The other issue I am having is there will be scenarios where there is just one message per user for more than a minute. In that case I want the window content to be evicted after the defined window interval of 10 seconds. Is there a way to evict the window data even when there is no more incoming data for that key? I have tried setAutoWatermarkInterval(10000) but still no luck. How do I get the current watermark to be displayed in the flink dashboard UI under watermarks sections? Currently it shows no watermarks. Also is there a way to count the number of messages that missed the time window due to late arrival? Thanks and appreciate all the help. |
Hi,
you can write your own trigger and window, and implement whatever logic there. There are some examples https://github.com/apache/flink/blob/1875cac03042dad4a4c47b0de8364f02fbe457c6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/ If you don't see any event, it means window is not triggered. It would mean Watermark is not increasing. The issue can be the timestamp is not extracted correctly. Or, if you miss the trigger if use the window function doesn't have it. Cheers, Sendoh -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Thanks Sendoh. Is there a way to advance watermark even when there are no incoming events. What exactly does setAutoWatermarkInterval do?
Also I don't see the watermark displayed in flink dashboard. Will the watermark advance only when there is data from all consuming kafka topic and partitions? I have 3 topics with 3 partitions in each topic. Thanks. Regards, Navneeth On Tue, Jan 23, 2018 at 9:32 AM, Sendoh <[hidden email]> wrote: Hi, |
setAutoWatermarkInterval configures how often the watermark is produced.
so if watermark is not proceeding, if you set shorter interval, you would see t1, t1, t1, t1, t1 more often. But what you would like to see is t1, t2, t3, t4.... If you want to see count 0 when there is no incoming events,0 sounds for me it's your use case, you can check sliding window. I think seeing watermark in UI is possible now, or you can use debug mode to see it. The watermark you use won't wait for all topics(partitions). It's possible if you implement your own watermark. Cheers, Sendoh -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Free forum by Nabble | Edit this page |