Strange behaviour of windows

Dawid Wysakowicz
Hi,

I have recently been experimenting with windowing and the event-time mechanism in Flink, and either I do not understand how it should work or there is some kind of bug.

I have prepared two source functions: one that emits watermarks itself and one that does not. For the latter I have prepared a TimestampExtractor that, at least from my point of view, should produce the same results as the former.

Afterwards I set up a simple sum over an event-time-triggered sliding window.

What I expected is a result of 3*(t_sum) for the Event property being summed, regardless of the sleep time in the source function. That is how it works with the EventTimeSourceFunction, but with the SourceFunction the result depends on the sleep time and does not equal 3*(t_sum).
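
One way to read the 3*(t_sum) expectation is that every event falls into three overlapping sliding windows, so the window sums together add up to three times the plain sum. The following is only a sketch of that arithmetic in plain Scala, without Flink. The window size of 3, the slide of 1, the integer timestamps and all names are assumptions for illustration and are not taken from the attached file.

```scala
// Assumed parameters (not given in the post): window size 3, slide 1.
object SlidingSumSketch {
  val size = 3L
  val slide = 1L

  // Start timestamps of every sliding window [start, start + size) that contains `ts`.
  def windowStarts(ts: Long): Seq[Long] = {
    val lastStart = ts - Math.floorMod(ts, slide)
    lastStart until (lastStart - size) by -slide
  }

  // Hypothetical (timestamp, value) pairs standing in for the events.
  val events: Seq[(Long, Int)] = Seq((10L, 1), (11L, 2), (12L, 3), (13L, 4))

  // Sum of values per window, keyed by window start.
  val windowSums: Map[Long, Int] =
    events
      .flatMap { case (ts, v) => windowStarts(ts).map(start => start -> v) }
      .groupBy(_._1)
      .map { case (start, pairs) => start -> pairs.map(_._2).sum }

  // Adding up all window sums counts each event once per window it belongs to.
  val totalOverWindows: Int = windowSums.values.sum
  val plainSum: Int = events.map(_._2).sum
}
```

With these assumed values, totalOverWindows is 30 and plainSum is 10, independent of how fast the events arrive, because only the timestamps decide window membership.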

I have done some debugging, and for the SourceFunction the output of the ExtractTimestampsOperator is not chained to the aggregator operator (the property output.allOutputs is empty).

Do I understand the mechanism correctly, and should my code work as I described? If not, could you please explain a little? I have attached the code to this email.

I would be grateful.

Regards
Dawid Wysakowicz



Attachment: FlinkWindows.scala (4K)

Re: Strange behaviour of windows

Dawid Wysakowicz
I forgot to mention: I've checked this on both 0.10 and the current master.

2015-12-07 20:32 GMT+01:00 Dawid Wysakowicz <[hidden email]>:
[...]

Re: Strange behaviour of windows

Aljoscha Krettek
Hi,
an important concept of the Flink API is that transformations do not modify the original stream (or dataset) but return a new stream with the modifications in place. In your example the result of the extractTimestamps() call should be used for further processing. I attached your source code with the required modifications.
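
A minimal sketch of that point, using hypothetical stand-in classes rather than the real Flink API (the attached file is not reproduced here). `SketchStream`, `Event` and `eventTimeSum` are invented for illustration; only the idea that a transformation returns a new stream comes from the explanation above.

```scala
// Hypothetical stand-ins, not Flink classes: a tiny immutable "stream" that
// mimics how a transformation returns a new stream instead of changing the old one.
final case class Event(timestamp: Long, value: Int)

final case class SketchStream(events: Vector[Event], hasTimestamps: Boolean = false) {
  // Returns a NEW stream; `this` is left untouched.
  def extractTimestamps(): SketchStream = copy(hasTimestamps = true)

  // Sums values only when timestamps were assigned upstream; otherwise yields None.
  def eventTimeSum: Option[Int] =
    if (hasTimestamps) Some(events.map(_.value).sum) else None
}

object TransformationSketch {
  val source = SketchStream(Vector(Event(1L, 10), Event(2L, 20), Event(3L, 30)))

  // Mistake: the returned stream is discarded, so `source` still has no timestamps.
  source.extractTimestamps()
  val wrong: Option[Int] = source.eventTimeSum

  // Fix: keep the returned stream and build the rest of the pipeline on it.
  val withTimestamps: SketchStream = source.extractTimestamps()
  val right: Option[Int] = withTimestamps.eventTimeSum
}
```

In this sketch `wrong` is None and `right` is Some(60): the downstream step only sees the timestamps when it is built on the value returned by extractTimestamps().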

Other than that, I think you understood the watermarks quite well. :D

Let us know if you need more information.

Cheers,
Aljoscha


> On 07 Dec 2015, at 20:34, Dawid Wysakowicz <[hidden email]> wrote:
>
> Forgot to mention. I've checked it both on 0.10 and current master.

Attachment: FlinkWindows.scala (3K)

Re: Strange behaviour of windows

Dawid Wysakowicz
Thanks for the explanation. That was a really stupid mistake on my part. By the way, I really like the whole idea and the API. Really good job!

Regards
Dawid

2015-12-08 12:30 GMT+01:00 Aljoscha Krettek <[hidden email]>:
[...]