Watermark advancement in late side output

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Watermark advancement in late side output

orips
Let's say I have an event-time stream with a window and a side output for late data, and in the side output of the late data, I further assign timestamps and do windowing - what is the watermark situation here?

The main stream has its own watermark advancement but the side output has its own. Do they maintain separate watermarks? Or they intermingle?

Thanks
Reply | Threaded
Open this post in threaded view
|

Re: Watermark advancement in late side output

Timo Walther
Hi Ori,

first of all, watermarks are sent to all side outputs (this is tested
here [1]). Thus, operators in the side output branch of the pipeline
will work similar to operators in the main branch.

When calling `assignTimestampsAndWatermarks`, the inserted operator will
erase incoming watermarks and only emit self-generated ones. The logic
can be found here [2]. Thus, downstream operators in the side output
will only consider the newly assigned one (+ the end watermark Long.MAX).

I hope this helps.

Regards,
Timo

[1]
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SideOutputITCase.java
[2]
https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/TimestampsAndWatermarksOperator.java#L114

On 21.09.20 12:21, Ori Popowski wrote:
> Let's say I have an event-time stream with a window and a side output
> for late data, and in the side output of the late data, I further assign
> timestamps and do windowing - what is the watermark situation here?
>
> The main stream has its own watermark advancement but the side output
> has its own. Do they maintain separate watermarks? Or they intermingle?
>
> Thanks

Reply | Threaded
Open this post in threaded view
|

Re: Watermark advancement in late side output

orips