If I chain two windows, what event-time would the second window have?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

If I chain two windows, what event-time would the second window have?

Yassin Marzouki
Hi all,

Say I assign timestamps to a stream and then apply a transformation like this:

stream.keyBy(0).timeWindow(Time.hours(5)).reduce(count).timeWindowAll(Time.days(1)).apply(transformation)

Now, when the first window is applied, events are aggregated based on their timestamps, but I don't understand what timestamp will be assigned to the aggregated result of the reduce operation for the second window to process it. Could you please explain it? Thank you.

Best,
Yassine
Reply | Threaded
Open this post in threaded view
|

Re: If I chain two windows, what event-time would the second window have?

Kostas Kloudas
Hi Yassine,

When the WindowFunction is applied to the content of a window, the timestamp of the resulting record
is the window.maxTimestamp, which is the endOfWindow-1.

You can imaging if you have a Tumbling window from 0 to 2000, the result will have a timestamp of 1999.
Window boundaries are closed in the start and open at the end timestamp, or [start, end).

If you want to play around, I would suggest checking out the tests in the WindowOperatorTest class.

There you can do experiments and figure out how Flink’s windowOperator works internally and what is the 
interplay between windowAssingers, triggers, and the windowOperator.

Hope this helps,
Kostas

On Jul 27, 2016, at 8:41 AM, Yassin Marzouki <[hidden email]> wrote:

Hi all,

Say I assign timestamps to a stream and then apply a transformation like this:

stream.keyBy(0).timeWindow(Time.hours(5)).reduce(count).timeWindowAll(Time.days(1)).apply(transformation)

Now, when the first window is applied, events are aggregated based on their timestamps, but I don't understand what timestamp will be assigned to the aggregated result of the reduce operation for the second window to process it. Could you please explain it? Thank you.

Best,
Yassine

Reply | Threaded
Open this post in threaded view
|

Re: If I chain two windows, what event-time would the second window have?

Yassin Marzouki
Hi Kostas,

Thank you very much for the explanation.

Best,
Yassine

On Wed, Jul 27, 2016 at 1:09 PM, Kostas Kloudas <[hidden email]> wrote:
Hi Yassine,

When the WindowFunction is applied to the content of a window, the timestamp of the resulting record
is the window.maxTimestamp, which is the endOfWindow-1.

You can imaging if you have a Tumbling window from 0 to 2000, the result will have a timestamp of 1999.
Window boundaries are closed in the start and open at the end timestamp, or [start, end).

If you want to play around, I would suggest checking out the tests in the WindowOperatorTest class.

There you can do experiments and figure out how Flink’s windowOperator works internally and what is the 
interplay between windowAssingers, triggers, and the windowOperator.

Hope this helps,
Kostas

On Jul 27, 2016, at 8:41 AM, Yassin Marzouki <[hidden email]> wrote:

Hi all,

Say I assign timestamps to a stream and then apply a transformation like this:

stream.keyBy(0).timeWindow(Time.hours(5)).reduce(count).timeWindowAll(Time.days(1)).apply(transformation)

Now, when the first window is applied, events are aggregated based on their timestamps, but I don't understand what timestamp will be assigned to the aggregated result of the reduce operation for the second window to process it. Could you please explain it? Thank you.

Best,
Yassine