On 29 June 2017 at 5:11:58 PM, Ahmad Hassan ([hidden email]) wrote:
Any thoughts on this problem please?
Hi All,

I am collecting millions of events per 24 hours for N products, where N can be 50k. I use the following fold mechanism with a sliding window:

    final DataStream<WindowStats> eventStream = inputStream
        .keyBy(TENANT, CATEGORY)
        .window(SlidingProcessingTimeWindows.of(Time.hours(24), Time.minutes(5)))
        .fold(new WindowStats(), new ProductAggregationMapper(), new ProductAggregationWindowFunction());

In the WindowStats class, I keep a HashMap<String, ProductMetric> keyed by product ID, which holds each product's event count and various other metrics. So for 50k products I will have 50k entries in the map within a WindowStats instance, instead of millions of events, because the fold function processes each event as it arrives.

My question is: if I set env.enableCheckpointing(1000), will the WindowStats instance for each existing window automatically be checkpointed and restored on recovery? If not, how can I better implement the above use case to store the state of the WindowStats object within the fold operation?

Thanks for all the help.

Best Regards,
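
For illustration, the incremental aggregation described above can be sketched in plain Java with no Flink dependency. This is only a sketch under assumptions: the Event class, the fields of ProductMetric, and the fold helper are hypothetical, since the original post does not show them. It shows only that the accumulator grows with the number of distinct products, not with the number of events. It does not address checkpointing.

```java
import java.util.HashMap;
import java.util.Map;

public class WindowStatsSketch {

    // Hypothetical event type; the real event fields are not shown in the post.
    static class Event {
        final String productId;
        final double value;

        Event(String productId, double value) {
            this.productId = productId;
            this.value = value;
        }
    }

    // Hypothetical per-product metrics: an event count and a running sum.
    static class ProductMetric {
        long count;
        double sum;
    }

    // Accumulator holding one entry per product ID.
    static class WindowStats {
        final Map<String, ProductMetric> metrics = new HashMap<>();
    }

    // Same shape as a fold step: (accumulator, event) -> accumulator.
    // Each event updates one map entry and is then discarded.
    static WindowStats fold(WindowStats acc, Event event) {
        ProductMetric metric =
            acc.metrics.computeIfAbsent(event.productId, id -> new ProductMetric());
        metric.count++;
        metric.sum += event.value;
        return acc;
    }

    public static void main(String[] args) {
        WindowStats stats = new WindowStats();
        String[] products = {"p1", "p2", "p3"};

        // One million events spread over three products.
        for (int i = 0; i < 1_000_000; i++) {
            fold(stats, new Event(products[i % products.length], 1.0));
        }

        System.out.println(stats.metrics.size());          // prints 3
        System.out.println(stats.metrics.get("p1").count); // prints 333334
    }
}
```

After one million events the map still has three entries, one per product, which is the property the fold-based design relies on.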