TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1



LINZ, Arnaud

Hello,

 

I’ve upgraded Flink from 0.10.1 to 1.0 and now see a regression in some of my unit tests.

To narrow down the problem, here is what I’ve figured out:

- I use a simple streaming application with a source defined as fromElements("Element 1", "Element 2", "Element 3").
- I use a simple time window function with a 3-second window: timeWindowAll(Time.seconds(3)).
- I use an apply() function that counts the total number of elements it receives in a global counter.

 

With the previous version, I got all three elements, not because the window fired within 3 seconds, but because the source ended.

With version 1.0, I don’t get any elements. That’s annoying because the application ends as soon as the source ends, so sleeping 5 seconds after the execute() call does not help.

(If I replace fromElements with fromCollection on a 10,000-element list and Time.seconds(3) with Time.milliseconds(1), I get a random number of elements.)

Is this behavior intended? If so, how do I get my last elements now?

 

Best regards,

Arnaud

 

 

 





The integrity of this message cannot be guaranteed on the Internet. The company that sent this message cannot therefore be held liable for its content nor attachments. Any unauthorized use or dissemination is prohibited. If you are not the intended recipient of this message, then please delete it and notify the sender.

Re: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

Till Rohrmann

Hi Arnaud,

With version 1.0, the window-triggering behaviour for finite streams changed slightly. If you use event time, all unfinished windows are triggered when your stream ends. The rationale is that the end of a stream is equivalent to saying that no more elements will arrive until the maximum time (infinity) has been reached. This knowledge allows a Long.MaxValue watermark to be emitted when an event-time stream finishes, which triggers all lingering windows.

The same cannot be said of a finished processing-time stream. There we have no logical time; we reason about windows using the actual processing time. When a stream finishes, we cannot fast-forward the processing time to the point where the windows would fire. They could only fire if we kept the operators alive until the wall clock says it is time to fire them. However, no such feature is implemented in Flink yet.
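To make the difference concrete, here is a small toy model in plain Java. It is not Flink code: the class WindowFlushSketch, its TumblingWindows helper, and the method names are all made up for illustration, and the firing rule (a window fires once time reaches its last millisecond) is a simplifying assumption. The only thing it is meant to show is that a final Long.MAX_VALUE watermark flushes every open window, whereas a wall clock that stops early leaves the elements buffered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of tumbling windows (hypothetical names, not Flink's implementation). */
final class WindowFlushSketch {

    /** Buffers elements per window start and fires a window once time passes its end. */
    static final class TumblingWindows {
        private final long sizeMs;
        private final TreeMap<Long, List<String>> open = new TreeMap<>();
        final List<List<String>> fired = new ArrayList<>();

        TumblingWindows(long sizeMs) {
            this.sizeMs = sizeMs;
        }

        /** Assigns the element to the window [start, start + sizeMs) containing its timestamp. */
        void add(String element, long timestampMs) {
            long start = timestampMs - Math.floorMod(timestampMs, sizeMs);
            open.computeIfAbsent(start, k -> new ArrayList<>()).add(element);
        }

        /** Advances time (a watermark or the wall clock) and fires every window that has ended. */
        void advanceTo(long timeMs) {
            while (!open.isEmpty() && open.firstKey() + sizeMs - 1 <= timeMs) {
                fired.add(open.pollFirstEntry().getValue());
            }
        }
    }

    /** Feeds three elements into a 3-second window and returns how many were emitted. */
    static int countFired(long finalTimeMs) {
        TumblingWindows windows = new TumblingWindows(3000);
        windows.add("Element 1", 0);
        windows.add("Element 2", 1);
        windows.add("Element 3", 2);
        windows.advanceTo(finalTimeMs); // last time value seen before the job shuts down
        return windows.fired.stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        System.out.println(countFired(Long.MAX_VALUE)); // 3: final watermark flushes the window
        System.out.println(countFired(5));              // 0: job ended before the window closed
        System.out.println(countFired(3000));           // 3: clock was allowed to reach the window end
    }
}
```

The first call corresponds to the event-time case, the second to a processing-time job whose source finishes after a few milliseconds, and the third to a job that is kept alive until the wall clock passes the end of the window.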

I hope this helps you to understand the failing test cases.

Cheers,
Till


On Mon, Mar 14, 2016 at 1:14 PM, LINZ, Arnaud <[hidden email]> wrote:



RE: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

LINZ, Arnaud

Hi,

 

All right… I find this new behavior dangerous, since with processing-time windows you will always miss the last elements of a source that does not last forever.

I’ve created a source wrapper that sleeps after emitting the last element so that unit tests that use processing time work.
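The wrapper itself was not posted, so the following is only a guess at its shape, written against plain Java rather than Flink. The Source interface, fromList, and lingering are hypothetical stand-ins; in a real job the same idea would presumably live in the run() method of a Flink SourceFunction that delegates to the wrapped source and then sleeps before returning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of a "lingering" source wrapper (hypothetical names, not a Flink API). */
final class LingeringSourceSketch {

    /** Minimal stand-in for a stream source: pushes its elements to a consumer. */
    interface Source<T> {
        void run(Consumer<T> out);
    }

    /** A finite source that emits the given elements and then returns immediately. */
    static <T> Source<T> fromList(List<T> elements) {
        return out -> elements.forEach(out);
    }

    /** Wraps a source so that it stays alive for lingerMs after its last element. */
    static <T> Source<T> lingering(Source<T> inner, long lingerMs) {
        return out -> {
            inner.run(out); // emit everything the wrapped source has
            try {
                Thread.sleep(lingerMs); // give pending processing-time windows a chance to fire
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
            }
        };
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        long start = System.nanoTime();
        lingering(fromList(Arrays.asList("Element 1", "Element 2", "Element 3")), 200)
                .run(seen::add);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(seen.size() + " elements, source alive for about " + elapsedMs + " ms");
    }
}
```

For the 3-second window in the test above, the sleep would need to be longer than the window size so that the wall clock is certain to pass the end of the last window before the source returns.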

 

Cheers,

Arnaud

 

 

From: Till Rohrmann [mailto:[hidden email]]
Sent: Monday, March 14, 2016 15:11
To: [hidden email]
Subject: Re: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

 
