MapSate within Aggregate function

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

MapSate within Aggregate function

hassahma
Hi,

We have SlidingProcessingTimeWindows running with AggregateFunction and Window Function. How we use MapState within AggregateFunction to keep storing incoming elements as we receive Millions of elements over 24 running sliding windows ?

If we don't do that then AggregateFunction state grows bigger and checkpointing takes very long and fail to checkpoint frequently.

Best,
Reply | Threaded
Open this post in threaded view
|

Re: MapSate within Aggregate function

Congxian Qiu
Hi Ahmad

AFAIK, Flink currently does not support stores incoming elements to a MapState, maybe the window function[1] can be help

Best,
Congxian


Ahmad Hassan <[hidden email]> 于2019年7月25日周四 下午5:58写道:
Hi,

We have SlidingProcessingTimeWindows running with AggregateFunction and Window Function. How we use MapState within AggregateFunction to keep storing incoming elements as we receive Millions of elements over 24 running sliding windows ?

If we don't do that then AggregateFunction state grows bigger and checkpointing takes very long and fail to checkpoint frequently.

Best,
Reply | Threaded
Open this post in threaded view
|

Re: MapSate within Aggregate function

hassahma
Hi Congzian,

My understanding is that if I use AggregateFunction and have Million of unique elements coming in for the duration of 24hour, then the state of AggregateFunction will grow huge with those million entries and the checkpointing would take longer and longer. I thought if i could use MapState to store incoming elements then MapState only load individual keys and the checkpoint size will remain small. 

Best,

On Fri, 26 Jul 2019 at 05:16, Congxian Qiu <[hidden email]> wrote:
Hi Ahmad

AFAIK, Flink currently does not support stores incoming elements to a MapState, maybe the window function[1] can be help

Best,
Congxian


Ahmad Hassan <[hidden email]> 于2019年7月25日周四 下午5:58写道:
Hi,

We have SlidingProcessingTimeWindows running with AggregateFunction and Window Function. How we use MapState within AggregateFunction to keep storing incoming elements as we receive Millions of elements over 24 running sliding windows ?

If we don't do that then AggregateFunction state grows bigger and checkpointing takes very long and fail to checkpoint frequently.

Best,