Hi All,
I have a single stream source that generates 20k events/sec, and I need to
count the elements using a sliding window of 1 hour.
The problem is that the window may buffer too many elements (which might
cause a lot of block I/O during checkpointing?). In fact, it should not be
necessary to store them for an hour, because the elements could be folded
incrementally. But unlike a tumbling window, a sliding window has to keep
elements around for the following windows, right?
So I am considering a workaround: should I chain two windows, like below?
.timeWindow(Time.minutes(1))
...
.timeWindow(Time.hours(1), Time.minutes(1))
Here the first window generates 1-minute aggregates, and the
second window produces the sliding output from those aggregates.
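To make the idea concrete, here is a minimal plain-Java sketch of what I expect the two chained windows to compute. It does not use Flink at all; the class name `TwoStageSlidingCount` and the helpers `perMinuteCounts` and `slidingSums` are made up for illustration, and event times are assumed to be milliseconds starting at 0.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TwoStageSlidingCount {

    static final long MINUTE_MS = 60_000L;

    // Stage 1 (like the 1-minute tumbling window): count events per minute.
    // Assumes every event time falls inside [0, numMinutes * MINUTE_MS).
    static long[] perMinuteCounts(long[] eventTimesMillis, int numMinutes) {
        long[] counts = new long[numMinutes];
        for (long t : eventTimesMillis) {
            counts[(int) (t / MINUTE_MS)]++;
        }
        return counts;
    }

    // Stage 2 (like the sliding window): after each minute, emit the sum of
    // the most recent `windowBuckets` per-minute counts. Only the per-minute
    // counts are buffered, never the raw events.
    static List<Long> slidingSums(long[] minuteCounts, int windowBuckets) {
        Deque<Long> buffer = new ArrayDeque<>();
        List<Long> out = new ArrayList<>();
        long running = 0;
        for (long c : minuteCounts) {
            buffer.addLast(c);
            running += c;
            if (buffer.size() > windowBuckets) {
                running -= buffer.removeFirst(); // drop the expired minute
            }
            out.add(running);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] events = {0L, 59_999L, 60_000L, 125_000L, 130_000L, 179_999L};
        long[] counts = perMinuteCounts(events, 3);  // per-minute counts: 2, 1, 3
        System.out.println(slidingSums(counts, 2)); // prints [2, 3, 4]
    }
}
```

With a 1-hour window and a 1-minute slide, the second stage in this sketch would hold at most 60 numbers, regardless of how many events arrive per second. Whether Flink's chained windows behave the same way in terms of state size is exactly what I am unsure about.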
Any suggestions? Thanks.