Is it OK to have very many session windows?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Is it OK to have very many session windows?

Vadim Vararu
HI guys,

Is it okay to have very many (tens of thousands or hundreds of thousand)
of session windows?


Thanks, Vadim.

Reply | Threaded
Open this post in threaded view
|

Re: Is it OK to have very many session windows?

Timo Walther
Hi Vadim,

this of course depends on your use case. The question is how large is
your state per pane and how much memory is available for Flink?
Are you using incremental aggregates such that only the aggregated value
per pane has to be kept in memory?

Regards,
Timo


Am 20/02/17 um 16:34 schrieb Vadim Vararu:
> HI guys,
>
> Is it okay to have very many (tens of thousands or hundreds of
> thousand) of session windows?
>
>
> Thanks, Vadim.


Reply | Threaded
Open this post in threaded view
|

Re: Is it OK to have very many session windows?

Vadim Vararu

It's something like:

DataStreamSource<String> stream = env.addSource(getKafkaConsumer(parameterTool));

stream
        .map(getEventToDomainMapper())
        .keyBy(getKeySelector())
        .window(ProcessingTimeSessionWindows.withGap(Time.minutes(90)))
        .reduce(getReducer())
        .map(getToJsonMapper())
        .addSink(getKafkaProducer(parameterTool));

Each new event may be reduced against the existent state from the window, so normally it's okay to have only 1 row in memory.

I'm new to Flink and have not yet reached the "incremental aggregates" but, if i understand correctly, it fits my case.

Vadim.



On 20.02.2017 17:47, Timo Walther wrote:
Hi Vadim,

this of course depends on your use case. The question is how large is your state per pane and how much memory is available for Flink?
Are you using incremental aggregates such that only the aggregated value per pane has to be kept in memory?

Regards,
Timo


Am 20/02/17 um 16:34 schrieb Vadim Vararu:
HI guys,

Is it okay to have very many (tens of thousands or hundreds of thousand) of session windows?


Thanks, Vadim.



Reply | Threaded
Open this post in threaded view
|

Re: Is it OK to have very many session windows?

Stephan Ewen
With pre-aggregation (which the Reduce does), Flink can handle many windows and many keys, as long as you have the memory and storage to support that.
Your case should work.


On Mon, Feb 20, 2017 at 4:58 PM, Vadim Vararu <[hidden email]> wrote:

It's something like:

DataStreamSource<String> stream = env.addSource(getKafkaConsumer(parameterTool));

stream
        .map(getEventToDomainMapper())
        .keyBy(getKeySelector())
        .window(ProcessingTimeSessionWindows.withGap(Time.minutes(90)))
        .reduce(getReducer())
        .map(getToJsonMapper())
        .addSink(getKafkaProducer(parameterTool));

Each new event may be reduced against the existent state from the window, so normally it's okay to have only 1 row in memory.

I'm new to Flink and have not yet reached the "incremental aggregates" but, if i understand correctly, it fits my case.

Vadim.




On 20.02.2017 17:47, Timo Walther wrote:
Hi Vadim,

this of course depends on your use case. The question is how large is your state per pane and how much memory is available for Flink?
Are you using incremental aggregates such that only the aggregated value per pane has to be kept in memory?

Regards,
Timo


Am 20/02/17 um 16:34 schrieb Vadim Vararu:
HI guys,

Is it okay to have very many (tens of thousands or hundreds of thousand) of session windows?


Thanks, Vadim.