Hello,

This may be premature optimization for memory usage, but here is my question: I have to build an app that monitors the sessions of (millions of) users. I don’t know when a session starts or ends, nor a reasonable maximum duration. I want to have a maximum duration (timeout) of 24 hours. However, I’d like to be able to PURGE sessions that have ended as soon as possible, to free memory.

I use a Trigger to fire my WindowFunction for each input EVENT. It will (when relevant) output a SESSION (with start and end timestamps). I use the Evictor to remove the EVENTs used to build the SESSION (they have a Boolean field “removable” that the WindowFunction sets to true, so that the Evictor knows it can remove them in the evictAfter method). That way at least I can clean the content of the windows.

However, from what I’m seeing, it looks like the window instance stays alive (even if empty) until it reaches its maximum duration (24 hours), even if the session it represents lasted 2 minutes. At the end of the day I might have millions of sessions in memory when in reality only thousands are alive at a given time. That might also really slow down the backups and restores of the application if it needs to store millions of empty windows.

I’m aware that my need looks like session windows. But session windows work mainly by merging windows that overlap within a session gap. My session gap can be multiple hours long, so I’m afraid that it would not help me.

So my question is: is there a way to inform the Trigger that the window has no more elements and that it can be PURGED? Or a way for a WindowFunction to “kill” the window it is being applied to? Of course my window might be re-created if new events arrive later for the same key.

My other option is to simply use a flatMap operator that holds a HashMap of sessions; that way I might be able to clean it up when I close my sessions. But I think it would be prettier to rely on Flink’s windows ;-)

Thanks in advance,
|
Hi Gwenhael,

have you considered using a ProcessFunction? With a ProcessFunction you have direct access to state (that you register yourself), and you can register timers that trigger a callback function when they expire. So you can clean up state when you receive an element or when a timer expires.

Best, Fabian

2017-06-23 18:46 GMT+02:00 Gwenhael Pasquiers <[hidden email]>:
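
To make the suggestion in the reply above concrete, here is a minimal sketch of the pattern in plain Java, with no Flink dependency, so it is a model of the idea rather than actual Flink code. It assumes that per-key state is dropped as soon as a session closes and that a timer callback closes anything older than the 24-hour timeout. The names `SessionTracker`, `onEvent`, `onTimer` and the `closes` flag are hypothetical and chosen only for illustration. In a real job the map would presumably be keyed state registered in the ProcessFunction, and the sweep would be a timer registered per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model (not Flink code) of per-key state plus a timeout timer. */
class SessionTracker {

    /** A finished session, as it would be emitted downstream. */
    static final class Session {
        final String key;
        final long start;
        final long end;

        Session(String key, long start, long end) {
            this.key = key;
            this.start = start;
            this.end = end;
        }
    }

    /** Maximum session duration from the question: 24 hours, in milliseconds. */
    static final long TIMEOUT_MS = 24L * 60 * 60 * 1000;

    // Per-key state: {start, lastSeen}. An entry exists only while a session is open.
    private final Map<String, long[]> open = new HashMap<>();
    private final List<Session> emitted = new ArrayList<>();

    /** Handles one event; `closes` is a hypothetical flag meaning "this event ends the session". */
    void onEvent(String key, long timestamp, boolean closes) {
        // State is (re-)created lazily when an event arrives for a key without an open session.
        long[] state = open.computeIfAbsent(key, k -> new long[] {timestamp, timestamp});
        state[1] = Math.max(state[1], timestamp);
        if (closes) {
            emitted.add(new Session(key, state[0], state[1]));
            open.remove(key); // purge immediately instead of waiting for the timeout
        }
    }

    /** Models the timer callback: closes and purges every session that reached the timeout. */
    void onTimer(long now) {
        open.entrySet().removeIf(entry -> {
            long[] state = entry.getValue();
            if (now - state[0] >= TIMEOUT_MS) {
                emitted.add(new Session(entry.getKey(), state[0], state[1]));
                return true;
            }
            return false;
        });
    }

    int openCount() {
        return open.size();
    }

    List<Session> emitted() {
        return emitted;
    }
}
```

With this model, a session that lasts 2 minutes occupies memory for 2 minutes: calling `onEvent("u1", 1000L, false)` and then `onEvent("u1", 121000L, true)` leaves `openCount()` at 0. A session that never sees a closing event is removed by the first `onTimer` call made at or after its start time plus `TIMEOUT_MS`.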
|
Thanks! I didn’t know about this function, and indeed it seems to match my needs better than windows. I’ll be able to clear my state once it’s empty (and re-create it when necessary). B.R. From: Fabian Hueske [mailto:[hidden email]]
|