Understanding Max Parallelism

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Understanding Max Parallelism

Aeden Jameson
I'm trying to get my head around the impact of setting max parallelism.

* Does max parallelism primarily serve as a reservation for future
increases to parallelism? The reservation being the ability to restore
from checkpoints and savepoints after increases to parallelism.

* Does it serve as a runtime suggestion for how many instances of an
operator the job could spin up? Or is it just a reservation like I
asked above?

* It also appears to impact the distribution of key groups among
subtasks from what I've read and seen from testing. Is that
understanding correct?

* What are the other important implications?


Thank you,
Aeden
Reply | Threaded
Open this post in threaded view
|

Re: Understanding Max Parallelism

Dawid Wysakowicz-2
Hi Aeden,

The maxParallelism option defines the number of key groups that will be
created within the keyed state and thus define the maximum parallelism
that a Flink keyed job can scale up to as each key group must be
atomically assigned to a single task. You can read more on how the
rescaling works in this blogpost[1].

Following up on your other questions it is mainly a reservation as of
now, but it will definitely be a cap in case of a reactive/auto scaling
because of the above.

Best,

Dawid

[1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html

On 18/03/2021 17:40, Aeden Jameson wrote:

> I'm trying to get my head around the impact of setting max parallelism.
>
> * Does max parallelism primarily serve as a reservation for future
> increases to parallelism? The reservation being the ability to restore
> from checkpoints and savepoints after increases to parallelism.
>
> * Does it serve as a runtime suggestion for how many instances of an
> operator the job could spin up? Or is it just a reservation like I
> asked above?
>
> * It also appears to impact the distribution of key groups among
> subtasks from what I've read and seen from testing. Is that
> understanding correct?
>
> * What are the other important implications?
>
>
> Thank you,
> Aeden


OpenPGP_signature (855 bytes) Download Attachment