Looking over the code, I see that Flink creates a TimeWindow object each time the WindowAssigner is invoked. I have not yet tested this, but I am wondering whether this can become problematic if you have a very long sliding window with a small slide, such as a 24 hour window with a 1 minute slide. It seems this would create 1,440 TimeWindow objects per event. Even at low event rates this would seem to result in an explosion of TimeWindow objects: at 1,000 events per second, you'd be creating 1,440,000 TimeWindow objects per second. After 24 hours you'd have nearly 125 billion TimeWindow objects that would just begin to be purged.
Does this analysis seem right? I suppose that means you should not use long sliding windows with small slides.
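For reference, the arithmetic behind the figures in the question can be checked with a few lines of plain Python. This is only a back-of-the-envelope sketch, not Flink code: the function names are made up for illustration, it assumes the window size is an exact multiple of the slide, and it counts window assignments without saying anything about how long the objects live.

```python
from datetime import timedelta


def windows_per_element(size: timedelta, slide: timedelta) -> int:
    """Number of sliding windows that contain any single element.

    Assumes size is an exact multiple of slide (hypothetical helper).
    """
    if size % slide != timedelta(0):
        raise ValueError("size must be a multiple of slide")
    return size // slide


def assignments_over(period: timedelta, events_per_second: int,
                     size: timedelta, slide: timedelta) -> int:
    """Total window assignments made during `period` at a constant event rate."""
    seconds = int(period.total_seconds())
    return seconds * events_per_second * windows_per_element(size, slide)


size = timedelta(hours=24)
slide = timedelta(minutes=1)

per_event = windows_per_element(size, slide)                       # 1440
per_second = assignments_over(timedelta(seconds=1), 1_000, size, slide)  # 1_440_000
per_day = assignments_over(timedelta(hours=24), 1_000, size, slide)      # 124_416_000_000

print(per_event, per_second, per_day)
```

The daily total comes out at roughly 124.4 billion, which matches the "nearly 125 billion" estimate.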
Hi Elias! There is a pending feature that uses an optimized version for aligned time windows. In that case, each element would go into a single window pane, and the full window would be composed of all the panes it spans (in the case of sliding windows). That should help a lot in those cases. The default window mechanism works the way it does because it supports unaligned windows (where each key has a different window start and end point) and completely custom window assigners.
Greetings, Stephan
On Tue, May 3, 2016 at 4:07 AM, Elias Levy <[hidden email]> wrote:
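To make the pane idea concrete, here is a minimal sketch in plain Python. It is not the pending Flink feature and does not use any Flink API; the class `PaneSums` and its methods are hypothetical, and it assumes aligned windows whose size is an exact multiple of the slide, with a simple sum as the aggregate. Each element is stored once, in the pane covering its timestamp, and a window result is assembled from the panes the window spans.

```python
from collections import defaultdict


class PaneSums:
    """Toy pane-based sliding sum (hypothetical, for illustration only)."""

    def __init__(self, size_ms: int, slide_ms: int):
        if size_ms % slide_ms != 0:
            raise ValueError("size must be a multiple of slide")
        self.size_ms = size_ms
        self.slide_ms = slide_ms
        self.panes = defaultdict(int)  # pane start timestamp -> partial sum

    def add(self, timestamp_ms: int, value: int) -> None:
        # Each element touches exactly one pane, regardless of window size.
        pane_start = timestamp_ms - timestamp_ms % self.slide_ms
        self.panes[pane_start] += value

    def window_sum(self, window_start_ms: int) -> int:
        # A window is the union of the size/slide panes it spans.
        pane_starts = range(window_start_ms,
                            window_start_ms + self.size_ms,
                            self.slide_ms)
        return sum(self.panes.get(p, 0) for p in pane_starts)


sums = PaneSums(size_ms=3_000, slide_ms=1_000)
sums.add(500, 1)
sums.add(1_500, 2)
sums.add(3_200, 4)

print(sums.window_sum(0))      # window [0, 3000)
print(sums.window_sum(1_000))  # window [1000, 4000)
```

In this sketch the cost moves from insertion time to read time: an element is written once, but each window result has to combine size/slide panes.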
Hi, even with the optimized operator for aligned time windows I would advise against using long sliding windows with a small slide. The system will internally create a lot of "buckets", i.e. each sliding window is treated separately, so in your case each element is put into 1,440 buckets. With even a moderate number of distinct keys this can very quickly lead to a large number of window buckets. You can think of it in terms of write amplification: with tumbling windows you basically have no amplification, whereas with sliding windows you have window processing overhead for every slide.
Cheers, Aljoscha
On Tue, 3 May 2016 at 09:05 Stephan Ewen <[hidden email]> wrote:
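The write amplification described above can be illustrated with a small plain-Python sketch that enumerates every sliding window containing a given timestamp. It is not Flink's code: the function `assign_sliding_windows` and its parameters are assumptions made for illustration, and windows are taken to be aligned to multiples of the slide plus an optional offset.

```python
def assign_sliding_windows(timestamp_ms: int, size_ms: int, slide_ms: int,
                           offset_ms: int = 0) -> list[tuple[int, int]]:
    """Return (start, end) for every window [start, end) containing the timestamp."""
    # Start of the latest window that still contains the timestamp.
    last_start = timestamp_ms - (timestamp_ms - offset_ms + slide_ms) % slide_ms
    windows = []
    start = last_start
    # Walk backwards one slide at a time while the window still covers the timestamp.
    while start > timestamp_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows


DAY_MS = 24 * 60 * 60 * 1000
MINUTE_MS = 60 * 1000

sliding = assign_sliding_windows(90_012_345, DAY_MS, MINUTE_MS)
tumbling = assign_sliding_windows(90_012_345, MINUTE_MS, MINUTE_MS)

print(len(sliding))   # buckets written for one element, 24 h window / 1 min slide
print(len(tumbling))  # buckets written for one element, 1 min tumbling window
```

With these parameters a single element lands in 1,440 buckets for the sliding case and in exactly one for the tumbling case, and that factor applies per key.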
Just had a quick chat with Aljoscha... The first version of the aligned window code will still duplicate the elements; later versions should be able to get rid of that.
On Tue, May 3, 2016 at 11:10 AM, Aljoscha Krettek <[hidden email]> wrote: