Flink session window not progressing

Henrik Feldt

Hi guys,

I'm doing a PoC with Flink and I was wondering if you could help me.

I've asked a question here https://stackoverflow.com/questions/55907954/flink-session-window-sink-timestamp-not-progressing with some images. In summary, my question is this: why doesn't my session window progress?

It works great when I run it against historical data, but when I run it against a streaming data source (pub/sub) it sometimes gets stuck. In this case, it got stuck at exactly 12:00 UTC.

My window is a session window, but one where I bump the last 'edge' of the window by different amounts depending on the event type, because some events never have other events after them.
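To make the windowing described above concrete, here is a minimal sketch in plain Python of a session window whose gap depends on the event type. It is a toy model and not Flink code: the event types, the gap values, and the names `GAP_BY_TYPE`, `Session` and `sessionize` are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical gaps in seconds per event type; the values are illustrative only.
GAP_BY_TYPE = {"click": 30 * 60, "purchase": 5 * 60, "logout": 1}
DEFAULT_GAP = 30 * 60


@dataclass
class Session:
    start: int      # timestamp of the first event in the session
    end: int        # upper edge: max(event timestamp + gap of that event)
    count: int = 0  # number of events merged into the session


def sessionize(events):
    """Group (timestamp, event_type) pairs for one key into sessions.

    Each event proposes the window [ts, ts + gap(type)). A proposal that
    overlaps or touches the current session is merged into it.
    """
    sessions = []
    for ts, event_type in sorted(events):
        gap = GAP_BY_TYPE.get(event_type, DEFAULT_GAP)
        if sessions and ts <= sessions[-1].end:
            current = sessions[-1]
            # Merging keeps the larger edge, so a short gap never shrinks a session.
            current.end = max(current.end, ts + gap)
            current.count += 1
        else:
            sessions.append(Session(start=ts, end=ts + gap, count=1))
    return sessions


events = [(0, "click"), (600, "click"), (700, "logout"), (5000, "click")]
for session in sessionize(events):
    print(session)
```

Note that in this model the "logout" event does not close the session early: merging takes the larger of the two edges, so the session still ends where the preceding "click" put it.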

You can see the problem most easily in these graphs, specifically the one with green bars that stops at 14:00 CEST (12:00 UTC).

This graph shows the low-watermark progressing throughout the node in the middle (which is also a sink); and this holds for all the nodes in the graph. However, the session windowing doesn't progress, despite the low-watermark progressing.
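One thing that can be checked here is how far the watermark still is from the end of the open session. The sketch below is a toy model in plain Python, not Flink code. It assumes the usual event-time rules: an operator's watermark is the minimum over its input partitions, and a window fires once the watermark reaches the window's last timestamp (end minus one). The function names and the numbers (minutes since midnight UTC) are hypothetical.

```python
def low_watermark(partition_watermarks):
    """An operator's watermark is the minimum over its input partitions."""
    return min(partition_watermarks)


def ready_to_fire(window_end, watermark):
    """An event-time window fires once the watermark reaches end - 1."""
    return watermark >= window_end - 1


# Hypothetical per-partition watermarks, in minutes since midnight UTC.
partitions = [11 * 60 + 58, 12 * 60 + 5, 11 * 60 + 40]
watermark = low_watermark(partitions)   # held back by the slowest partition
session_end = 12 * 60                   # a session whose edge is at 12:00 UTC

print(watermark, ready_to_fire(session_end, watermark))
```

In this model the watermark can keep advancing on every update while the session stays open, as long as it has not yet reached the session's edge.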

Regards,
Henrik

Re: Flink session window not progressing

Henrik Feldt

Thinking more about this: it might just be me reacting to the sink showing a zero rate of output. In fact, I have about two gigs of messages left in the queue before the job is up to date, so I may just be running a slow calculation (I ran a batch job to backfill first, and the stream picks up after that). Perhaps something is broken about the sink output counts?

On 29 Apr 2019, at 19:26, Henrik Feldt wrote:

[...]

Re: Flink session window not progressing

Konstantin Knauf-2

Hi Henrik,

Yes, the output count of a sink (and the input count of a source) is always zero, because these metrics only reflect Flink-internal traffic, that is, records passed from one operator to another. There is a Jira issue to change this [1].
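As a rough illustration of why these counters read zero at the edges of the job, here is a toy model in plain Python. It is not Flink's metrics implementation: `run_pipeline` and the operator names are hypothetical, and it assumes a linear chain in which every record passes through every operator.

```python
from collections import Counter


def run_pipeline(operators, records):
    """Push records through a linear chain of operators.

    Only operator-to-operator hops are counted, so reads from the external
    source system and writes to the external sink system never show up.
    """
    records_in = Counter()
    records_out = Counter()
    for _ in records:
        for upstream, downstream in zip(operators, operators[1:]):
            records_out[upstream] += 1
            records_in[downstream] += 1
    return {op: (records_in[op], records_out[op]) for op in operators}


# Each value is (records in, records out) for that operator.
print(run_pipeline(["source", "window", "sink"], range(5)))
```

In this model the sink has received every record but still reports zero records out, which matches what the graphs show.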

Cheers,

Konstantin




On Mon, Apr 29, 2019 at 7:29 PM Henrik Feldt wrote:

[...]



--

Konstantin Knauf | Solutions Architect
