Event time window questions

Event time window questions

Navneeth Krishnan
Hi,

I am having a few issues with event-time windowing. Here is my scenario: data is ingested from a Kafka consumer, keyed by user, and then passed to a tumbling event-time window of 10 seconds. The maximum lag tolerance is 1 second.

I use a BoundedOutOfOrdernessGenerator that extends AssignerWithPeriodicWatermarks to assign watermarks. Even after multiple messages per user have been ingested, the window never gets evicted (it never fires). What am I missing here?

.window(TumblingEventTimeWindows.of(Time.seconds(10))) 
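For reference, the arithmetic behind this setup can be sketched in plain Java. This is not Flink code: the class and method names are made up for illustration, timestamps are assumed to be non-negative epoch milliseconds, and the window offset is assumed to be zero. It shows which 10-second window a timestamp falls into and how far the watermark must advance before that window fires.

```java
// Illustrative sketch only; plain Java, hypothetical names, no Flink dependency.
final class TumblingWindowSketch {

    // Inclusive start of the tumbling window that contains timestampMs.
    // Assumes timestampMs >= 0 and a window offset of zero.
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Exclusive end of the same window.
    static long windowEnd(long timestampMs, long sizeMs) {
        return windowStart(timestampMs, sizeMs) + sizeMs;
    }

    // An event-time window fires once the watermark reaches the last
    // timestamp that still belongs to the window, which is end - 1.
    static boolean fires(long watermarkMs, long windowEndMs) {
        return watermarkMs >= windowEndMs - 1;
    }
}
```

With a 1-second lag tolerance, and assuming the generator emits the largest timestamp seen so far minus that lag, the window [10000, 20000) would fire only after some event with a timestamp of at least 20999 has arrived.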

The other issue is that there will be scenarios where a user sends just one message in more than a minute. In that case I want the window contents to be evicted after the defined window interval of 10 seconds. Is there a way to evict the window data even when there is no more incoming data for that key? I have tried setAutoWatermarkInterval(10000), but still no luck.

How do I get the current watermark displayed in the Flink dashboard UI under the Watermarks section? Currently it shows no watermarks.

Also, is there a way to count the number of messages that missed the time window due to late arrival?

Thanks, and I appreciate all the help.
Re: Event time window questions

Hung
Hi,

You can write your own trigger and window, and implement whatever logic
you need there.
There are some examples here:
https://github.com/apache/flink/blob/1875cac03042dad4a4c47b0de8364f02fbe457c6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/

If you don't see any output, it means the window is not being triggered.

That would mean the watermark is not advancing. The cause may be that the
timestamp is not extracted correctly.
Or the trigger may be missing, if the window function you use doesn't have one.
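As a rough illustration of how a wrongly extracted timestamp can stall the watermark, here is a minimal plain-Java sketch of a bounded-lag watermark: the largest timestamp seen so far minus the allowed lag. The class name is hypothetical and this is not the poster's generator. The unit mix-up in the usage note (seconds passed where milliseconds are expected) is only one assumed example of a bad extraction.

```java
// Illustrative sketch only; plain Java, hypothetical names, no Flink dependency.
final class BoundedLagWatermarkSketch {
    private final long maxLagMs;
    // Long.MIN_VALUE means "no event seen yet".
    private long maxTimestampMs = Long.MIN_VALUE;

    BoundedLagWatermarkSketch(long maxLagMs) {
        this.maxLagMs = maxLagMs;
    }

    // Track the largest timestamp seen; out-of-order events never lower it.
    void onEvent(long timestampMs) {
        maxTimestampMs = Math.max(maxTimestampMs, timestampMs);
    }

    // Watermark = largest timestamp seen minus the allowed lag.
    long currentWatermark() {
        if (maxTimestampMs == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // nothing seen, so nothing can fire
        }
        return maxTimestampMs - maxLagMs;
    }
}
```

If two events 12 seconds apart are fed in as milliseconds (1516700000000 and 1516700012000), the watermark moves forward by 12000 and a 10-second window can close. If the same events are fed in as seconds (1516700000 and 1516700012), the watermark moves forward by only 12, so a 10-second window would need roughly 10000 seconds of data before it fires.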

Cheers,

Sendoh



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Event time window questions

Navneeth Krishnan
Thanks Sendoh. Is there a way to advance the watermark even when there are no incoming events? What exactly does setAutoWatermarkInterval do?

Also, I don't see the watermark displayed in the Flink dashboard.

Will the watermark advance only when there is data from all consumed Kafka topics and partitions? I have 3 topics with 3 partitions in each topic.

Thanks.

Regards,
Navneeth

On Tue, Jan 23, 2018 at 9:32 AM, Sendoh wrote:
[quoted text omitted; it repeats the previous message verbatim]

Re: Event time window questions

Hung
setAutoWatermarkInterval configures how often the watermark is emitted.
So if the watermark is not advancing and you set a shorter interval, you would
just see t1, t1, t1, t1, t1 more often.
But what you would like to see is t1, t2, t3, t4, ...
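The point about the interval can be sketched in plain Java. The simulation below is hypothetical and not Flink code: each "tick" stands for one periodic emission, and the watermark is assumed to be the largest timestamp seen so far minus a fixed lag. Ticks with no new events simply repeat the previous value.

```java
// Illustrative sketch only; plain Java, hypothetical names, no Flink dependency.
final class PeriodicEmissionSketch {

    // eventsBeforeTick[i] holds the timestamps that arrived before tick i.
    // Returns the watermark emitted at each tick.
    static long[] emittedWatermarks(long[][] eventsBeforeTick, long maxLagMs) {
        long[] emitted = new long[eventsBeforeTick.length];
        long maxTimestampMs = Long.MIN_VALUE; // "no event seen yet"
        for (int tick = 0; tick < eventsBeforeTick.length; tick++) {
            for (long timestampMs : eventsBeforeTick[tick]) {
                maxTimestampMs = Math.max(maxTimestampMs, timestampMs);
            }
            // Emitting more often does not change the value being emitted.
            emitted[tick] = (maxTimestampMs == Long.MIN_VALUE)
                    ? Long.MIN_VALUE
                    : maxTimestampMs - maxLagMs;
        }
        return emitted;
    }
}
```

For example, with a lag of 1000, one event at 5000 before the first tick, nothing before the next two ticks, and one event at 16000 before the fourth, the emitted values are 4000, 4000, 4000, 15000. Only the new event moves the watermark; the extra ticks do not.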

If you want to see a count of 0 when there are no incoming events, which sounds
to me like your use case, you can look at sliding windows.

I think seeing the watermark in the UI is possible now, or you can use debug
mode to see it.

The watermark you use won't wait for all topics (partitions). That is possible
if you implement your own watermark.
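One way such a custom watermark could wait for all partitions is to track a watermark per partition and emit the minimum. The plain-Java sketch below is only an illustration under that assumption. The class and method names are hypothetical, and how the per-partition values are obtained is left open.

```java
// Illustrative sketch only; plain Java, hypothetical names, no Flink dependency.
final class MinWatermarkSketch {

    // Combined watermark = the smallest per-partition watermark, so the
    // slowest partition decides how far event time may advance.
    // A partition that has seen no events is represented by Long.MIN_VALUE.
    // For an empty array this returns Long.MAX_VALUE (no constraint).
    static long combinedWatermark(long[] partitionWatermarksMs) {
        long min = Long.MAX_VALUE;
        for (long watermarkMs : partitionWatermarksMs) {
            min = Math.min(min, watermarkMs);
        }
        return min;
    }
}
```

With this approach a single idle partition holds the combined watermark back. That is the trade-off of waiting for every partition: windows stay open until each partition has delivered data past the window end.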

Cheers,

Sendoh


