Question Streaming API - window(win_length,slide_step)

Camelia-Elena Ciolac
Hello,

This time I have a question about the new Streaming API available in Flink 0.7.0-incubating, which automatically slides a window over the data while aggregating values of interest, for example in a reduce / reduceGroup. Let's consider:

someDatastream.window(win_length, slide_step).reduceGroup(.....)

With this scenario in mind, my question is:

From inside the reduceGroup function, how can I obtain the java.util.Date corresponding to the first moment of the window?

I understand that the window sliding is done transparently to the developer, so how can we get this information about the current interval of dates contained in the window?

My intention is to perform some computation only if the first date of the window, as it is positioned in each sliding movement, satisfies a condition.

Best regards,
Camelia

Re: Question Streaming API - window(win_length,slide_step)

Gyula Fóra-2
Hello,

Nice question. Honestly, we haven't thought about this functionality yet, but you make a very good point. (I am also forwarding this to the dev list, where it fits better.)

In the current API there is no straightforward way of getting this information. What you could do at the moment is to attach a timestamp to the tuples at some previous operator and get the timestamp of the first tuple in the group.

Unfortunately, this would not always give the correct time, because windowing (when based on system time) currently uses the time at which the tuple enters the window reducer (this is also something to think about).
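
The timestamp workaround could look roughly like the sketch below. It is plain Java and deliberately does not use the Flink API: `Stamped`, `stamp`, `windowStart` and `reduceIfStartsAtOrAfter` are hypothetical names invented for illustration, and the list passed in stands for whatever group of tuples reduceGroup would receive. As noted, the first tuple's timestamp only approximates the real window start.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class WindowStartSketch {

    /** A value paired with the time (epoch millis) attached by an earlier operator. */
    static final class Stamped {
        final long timestampMillis;
        final int value;

        Stamped(long timestampMillis, int value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** What an upstream map step would do: attach a timestamp to each value. */
    static Stamped stamp(int value, long nowMillis) {
        return new Stamped(nowMillis, value);
    }

    /** Approximates the window start by the timestamp of the first tuple in the group. */
    static Date windowStart(List<Stamped> window) {
        return new Date(window.get(0).timestampMillis);
    }

    /**
     * Body of a hypothetical group reducer: sums the values only if the
     * approximate window start is not before the threshold, else returns null.
     */
    static Integer reduceIfStartsAtOrAfter(List<Stamped> window, Date threshold) {
        if (window.isEmpty() || windowStart(window).before(threshold)) {
            return null;
        }
        int sum = 0;
        for (Stamped s : window) {
            sum += s.value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Stamped> window = Arrays.asList(stamp(1, 1000L), stamp(2, 1500L), stamp(3, 2000L));
        System.out.println(windowStart(window).getTime());                    // 1000
        System.out.println(reduceIfStartsAtOrAfter(window, new Date(500L)));  // 6
        System.out.println(reduceIfStartsAtOrAfter(window, new Date(1200L))); // null
    }
}
```

In a real job the stamping would happen in an operator placed before the window, and the condition on the date would replace the simple threshold check used here.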

To add this functionality, we could introduce another function interface, AbstractWindowFunction, which would extend AbstractRichFunction and provide methods for getting specific time information about the current window.
We will definitely do something like this for the next release.
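
To make the idea more concrete, here is a rough sketch of what such an interface might look like. Nothing in it exists yet: the method names (`setWindowBounds`, `getWindowStart`, `getWindowEnd`, `reduceWindow`) are assumptions, and `RichFunctionStandIn` is only a placeholder for AbstractRichFunction so that the example compiles on its own without Flink.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class WindowFunctionSketch {

    /** Placeholder for Flink's AbstractRichFunction, so the sketch is self-contained. */
    static abstract class RichFunctionStandIn { }

    /** Hypothetical shape of the proposed AbstractWindowFunction. */
    static abstract class AbstractWindowFunction<IN, OUT> extends RichFunctionStandIn {
        private long windowStartMillis;
        private long windowEndMillis;

        /** Assumed hook: the runtime would call this before evaluating each window. */
        final void setWindowBounds(long startMillis, long endMillis) {
            this.windowStartMillis = startMillis;
            this.windowEndMillis = endMillis;
        }

        protected Date getWindowStart() {
            return new Date(windowStartMillis);
        }

        protected Date getWindowEnd() {
            return new Date(windowEndMillis);
        }

        /** User logic, evaluated once per window position. */
        abstract OUT reduceWindow(List<IN> values);
    }

    /** Example user function: sums the window only if it starts at or after a cutoff. */
    static final class SumIfStartsAfter extends AbstractWindowFunction<Integer, Integer> {
        private final Date cutoff;

        SumIfStartsAfter(Date cutoff) {
            this.cutoff = cutoff;
        }

        @Override
        Integer reduceWindow(List<Integer> values) {
            if (getWindowStart().before(cutoff)) {
                return null; // condition on the window start not met, skip the computation
            }
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        SumIfStartsAfter f = new SumIfStartsAfter(new Date(5000L));
        f.setWindowBounds(0L, 10000L);
        System.out.println(f.reduceWindow(Arrays.asList(1, 2, 3))); // null
        f.setWindowBounds(5000L, 15000L);
        System.out.println(f.reduceWindow(Arrays.asList(1, 2, 3))); // 6
    }
}
```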

By the way there will be a complete rework of the windowing semantics for the next release with very advanced features, it will be out for testing in a few weeks :)

Regards,
Gyula

In reply to Camelia-Elena Ciolac's message of 19 Nov 2014, 10:55 (the first post above).