Sorting in a WindowedDataStream

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Sorting in a WindowedDataStream

Niklas Semmler
Hello there,

What functions should be used to aggregate (unordered) tuples for every
window in a WindowedDataStream to a (ordered) list?

Neither foldWindow nor reduceWindow seems to be applicable, and
aggregate does not, to my understanding, take user-defined functions.

To get started with flink I am experimenting with the
WindowedDataStream. My goal is to naively compute the top-k from a
stream of key-value tuples.

From: Input over SocketTextStream
previous window, (a, 1), (b, 3), (c, 2), next window

To:
[previous window], [(b, 3), (c, 2), (a, 1)], [next window]

Is this behavior not possible?

Best,
Niklas

--
PhD Student / Research Assistant
INET, TU Berlin
Room 4.029
Marchstr 23
10587 Berlin
Tel: +49 30 314 78752
Reply | Threaded
Open this post in threaded view
|

Re: Sorting in a WindowedDataStream

Márton Balassi
Dear Niklas,

To do that you can use WindowedDataStream.mapWindow(). This gives you an iterator to all the records in the window and you can do whatever you wish with them.

One thing to note if sorting windows of the stream might add considerable latency to your job. 

Best,

Marton

On Tue, Apr 14, 2015 at 12:12 PM, Niklas Semmler <[hidden email]> wrote:
Hello there,

What functions should be used to aggregate (unordered) tuples for every window in a WindowedDataStream to a (ordered) list?

Neither foldWindow nor reduceWindow seems to be applicable, and aggregate does not, to my understanding, take user-defined functions.

To get started with flink I am experimenting with the WindowedDataStream. My goal is to naively compute the top-k from a stream of key-value tuples.

From: Input over SocketTextStream
previous window, (a, 1), (b, 3), (c, 2), next window

To:
[previous window], [(b, 3), (c, 2), (a, 1)], [next window]

Is this behavior not possible?

Best,
Niklas

--
PhD Student / Research Assistant
INET, TU Berlin
Room 4.029
Marchstr 23
10587 Berlin
Tel: <a href="tel:%2B49%2030%20314%2078752" value="+493031478752" target="_blank">+49 30 314 78752