WindowedStream aggregation methods pre-aggregate?

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

WindowedStream aggregation methods pre-aggregate?

Elias Levy
Can someone confirm whether the org.apache.flink.streaming.api.scala.WindowedStream methods other than "apply" (e.g. "sum") perform pre-aggregation?  The API docs are silent on this.


Reply | Threaded
Open this post in threaded view
|

Re: WindowedStream aggregation methods pre-aggregate?

Fabian Hueske-2
Hi Elias,

yes, reduce, fold, and the aggregation functions (sum, min, max, minBy, maxBy) on WindowedStream preform eager aggregation, i.e., the functions are apply for each value that enters the window and the state of the window will consist of a single value. In case you need access to the Window object (e.g., to include the start or end time), there are overloaded versions of apply that take a ReduceFunction or FoldFunction and an additional WindowFunction. These versions eagerly apply the Reduce or FoldFunction and finally the WindowFunction when the window is closed on the aggregated value (the iterator will serve a single value).

Cheers, Fabian

2016-05-28 0:48 GMT+02:00 Elias Levy <[hidden email]>:
Can someone confirm whether the org.apache.flink.streaming.api.scala.WindowedStream methods other than "apply" (e.g. "sum") perform pre-aggregation?  The API docs are silent on this.