I have a cluster environment and need to aggregate a DataStream on it.
I am wondering whether I can aggregate on each local server first and then
aggregate globally.
When I aggregate the DataStream globally, the network I/O increases quickly.
I want to reduce the network I/O, so I need to pre-aggregate on the local
server first.
How can I do this?
My current pipeline looks like this:

DataStream<String> dataIn....
dataIn.map().filter().assignTimestampsAndWatermarks().keyBy().window().fold()
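
To make the idea concrete, here is a minimal plain-Java sketch of the two-phase aggregation I have in mind. It is not Flink API code: the class and method names (TwoPhaseAggregation, localAggregate, globalAggregate) are hypothetical, and it assumes the aggregation is a simple count per key, which can be merged in any order. Each server first collapses its own records into one partial result per key, and only those partial results would cross the network.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; none of these names come from Flink.
class TwoPhaseAggregation {

    // Phase 1 (runs on each server): collapse the local records
    // into a single partial count per key.
    static Map<String, Long> localAggregate(List<String> localKeys) {
        Map<String, Long> partial = new HashMap<>();
        for (String key : localKeys) {
            partial.merge(key, 1L, Long::sum);
        }
        return partial;
    }

    // Phase 2 (runs after the shuffle): merge the partial counts
    // received from all servers into the final result.
    static Map<String, Long> globalAggregate(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, count) -> total.merge(key, count, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed sample input: the keys of the records seen by two servers.
        List<String> serverA = Arrays.asList("a", "b", "a", "a");
        List<String> serverB = Arrays.asList("b", "b", "a");

        List<Map<String, Long>> partials = new ArrayList<>();
        partials.add(localAggregate(serverA));
        partials.add(localAggregate(serverB));

        // Only the partial maps are "sent" to the global step.
        System.out.println(globalAggregate(partials));
    }
}
```

With this sample input, the 7 raw records become 4 partial entries (2 per server) before the global step, which is the kind of reduction in network traffic I am hoping for.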