Hi: The Apache Flink documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/index.html) describes the reduce function on a KeyedStream as follows: 'A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and emits the new value.' It then gives the example of a reduce function that creates a stream of partial sums.
The KeyedStream is not windowed, so when does the reduce function kick in to produce the DataStream? That is, since we have not defined any window on it, is there a default timeout or collection size that triggers it? Thanks, Mans
Hi,
I would interpret this as: the reduce produces an output for every new reduce call, emitting the updated value. There is no need for a window because it kicks in on every single invocation. Best, Stefan
In reply to Stefan Richter <[hidden email]> (Wednesday, January 3, 2018 2:46 AM):
Hi Stefan: Thanks for your response. A follow-up question: in a streaming environment, we invoke the reduce operation and then output the results to the sink. Does this mean reduce will be executed once on every trigger per partition, with all the items in each partition? Thanks
In reply to M Singh <[hidden email]> (2018-01-03 18:46 GMT+01:00):
Hi, the ReduceFunction holds the last emitted record as state. When a new record arrives, it reduces the new record and the last emitted record, updates its state, and emits the new result.
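To make the behavior described above concrete, here is a minimal plain-Java sketch that simulates a rolling reduce. It does not use Flink at all: `RollingReduceSketch`, `RollingReducer`, `process`, and `run` are hypothetical names made up for this illustration, and the `HashMap` only stands in for the per-key state that Flink would manage. The point is that every incoming record produces exactly one output, so no window or trigger is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class RollingReduceSketch {

    // Hypothetical helper (not a Flink class): keeps the last reduced value per key.
    static final class RollingReducer<K, V> {
        private final Map<K, V> state = new HashMap<>();
        private final BinaryOperator<V> reduceFn;

        RollingReducer(BinaryOperator<V> reduceFn) {
            this.reduceFn = reduceFn;
        }

        // Called once per incoming record; returns the value emitted for that record.
        V process(K key, V value) {
            V previous = state.get(key);
            // The first record for a key has nothing to combine with, so it is emitted as-is.
            V result = (previous == null) ? value : reduceFn.apply(previous, value);
            state.put(key, result); // update the state with the new reduced value
            return result;          // emit the new reduced value
        }
    }

    // Feeds (key, value) pairs through a summing reducer and collects every emitted value.
    static List<Integer> run(String[] keys, int[] values) {
        RollingReducer<String, Integer> reducer = new RollingReducer<>(Integer::sum);
        List<Integer> emitted = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            emitted.add(reducer.process(keys[i], values[i]));
        }
        return emitted;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "a", "b"};
        int[] values = {1, 10, 2, 3, 20};
        // One output per input record: partial sums per key.
        System.out.println(run(keys, values)); // prints [1, 10, 3, 6, 30]
    }
}
```

In this sketch the outputs for key "a" are 1, 3, 6 and for key "b" are 10, 30, interleaved in arrival order, which matches the "stream of partial sums" wording quoted from the documentation.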
In reply to Fabian Hueske <[hidden email]> (Thursday, January 4, 2018 12:58 AM):
Hi Fabian: Thanks for your answer. It is starting to make sense to me now.