Understanding Sliding Windows

Understanding Sliding Windows

Piyush Shrivastava
Hi all,
I wanted to know how exactly sliding windows produce results in Flink.
Suppose I create a sliding window of 5 minutes that slides every 10 seconds:

.timeWindow(Time.minutes(5), Time.seconds(10))

So every 10 seconds we are looking at data from the past 5 minutes. But what happens before the initial 5 minutes have passed?
Suppose we start the computation at 10:00. At 10:05 we will get the result for 10:00-10:05. But what are the results we get in between, i.e. at 10:00:10, 10:00:20 and so on?
Basically, why does Flink start producing results before the initial threshold has passed? What do these results signify?
 
Re: Understanding Sliding Windows

Dominik Choma
Piyush,

You created a sliding window which is triggered every 10 seconds.
Flink fires this window every 10 seconds, without waiting for the 5-minute buffer to fill up.
It seems to me that the first argument is "maximum data buffer retention" rather than "the initial threshold".

Dominik
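
To make the point above concrete, here is a minimal plain-Java sketch of how an element could be assigned to sliding windows. It does not use Flink itself. The class name SlidingWindowSketch, the method assignWindows, and the choice to express times as milliseconds since midnight are assumptions made for illustration; the arithmetic is meant to approximate what a sliding window assigner does, not to reproduce Flink's code.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowSketch {

    /**
     * Returns every [start, end) window, in milliseconds, that contains the
     * given timestamp. Assumes timestamp >= 0 (hypothetical helper, not a Flink API).
     */
    public static List<long[]> assignWindows(long timestamp, long size, long slide) {
        List<long[]> windows = new ArrayList<>();
        // Start of the most recent window that contains the timestamp.
        long lastStart = timestamp - (timestamp % slide);
        // Walk backwards one slide at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            windows.add(new long[] {start, start + size});
        }
        return windows;
    }

    public static void main(String[] args) {
        long size = 5 * 60 * 1000L;            // 5 minutes
        long slide = 10 * 1000L;               // 10 seconds
        long tenOClock = 10 * 60 * 60 * 1000L; // 10:00:00 as milliseconds since midnight
        long timestamp = tenOClock + 3_000L;   // an element arriving at 10:00:03

        List<long[]> windows = assignWindows(timestamp, size, slide);
        System.out.println("windows containing the element: " + windows.size());
        long[] earliest = windows.get(windows.size() - 1);
        System.out.println("earliest window: [" + earliest[0] + ", " + earliest[1] + ")");
    }
}
```

With these values the element at 10:00:03 lands in 30 overlapping windows. The earliest of them spans 09:55:10 to 10:00:10, so it is due only 7 seconds after the element arrived, even though almost all of its 5-minute range lies before the computation started.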



In reply to Piyush Shrivastava's message of 2016-04-26 12:16 GMT+02:00.


Re: Understanding Sliding Windows

Piyush Shrivastava
Hello Dominik,

Thanks for the information. Since my window is triggered every 10 seconds, the results I get before the first 5 minutes have passed would be irrelevant, as I need to consider data coming in every 5 minutes. Is there a way I can skip the results that are output before the first 5 minutes?


In reply to Dominik Choma's message of Tuesday, 26 April 2016, 8:54 PM.

Re: Understanding Sliding Windows

Aljoscha Krettek
Hi,
there is no way to skip the first 5 minutes since Flink doesn't know where your "time" begins. Elements are just put into window "buckets" that are emitted at the appropriate time.

Cheers,
Aljoscha
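
The "buckets" idea can be sketched in plain Java without Flink. The example below is only an illustration under stated assumptions: the class WindowBucketsSketch, the method countPerWindowEnd, and the input of one element per second starting at 10:00:00 are all hypothetical. It counts how many elements fall into each sliding window, keyed by the time at which that window ends.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class WindowBucketsSketch {

    /**
     * Counts elements per sliding window, keyed by window end in milliseconds.
     * Assumes all timestamps are >= 0 (hypothetical helper, not a Flink API).
     */
    public static SortedMap<Long, Integer> countPerWindowEnd(long[] timestamps, long size, long slide) {
        SortedMap<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestamps) {
            // Most recent window start at or before this timestamp.
            long lastStart = ts - (ts % slide);
            // Add the element to every bucket whose window still covers it.
            for (long start = lastStart; start > ts - size; start -= slide) {
                counts.merge(start + size, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long size = 5 * 60 * 1000L;            // 5 minutes
        long slide = 10 * 1000L;               // 10 seconds
        long tenOClock = 10 * 60 * 60 * 1000L; // 10:00:00 as milliseconds since midnight

        // Hypothetical input: one element per second for ten minutes, starting at 10:00:00.
        long[] timestamps = new long[600];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = tenOClock + i * 1000L;
        }

        SortedMap<Long, Integer> counts = countPerWindowEnd(timestamps, size, slide);
        System.out.println("window ending 10:00:10 -> " + counts.get(tenOClock + 10_000L));
        System.out.println("window ending 10:05:00 -> " + counts.get(tenOClock + 300_000L));
    }
}
```

For this input, the bucket ending at 10:00:10 holds 10 elements and the bucket ending at 10:05:00 holds 300. The early results are therefore ordinary windows that happen to be only partly filled, because no data exists for the part of their range before 10:00:00.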

In reply to Piyush Shrivastava's message of Wed, 27 Apr 2016, 07:01.