Hi Aeden,
The maxParallelism option defines the number of key groups that will be
created within the keyed state and thus define the maximum parallelism
that a Flink keyed job can scale up to as each key group must be
atomically assigned to a single task. You can read more on how the
rescaling works in this blogpost[1].
Following up on your other questions it is mainly a reservation as of
now, but it will definitely be a cap in case of a reactive/auto scaling
because of the above.
Best,
Dawid
[1]
https://flink.apache.org/features/2017/07/04/flink-rescalable-state.htmlOn 18/03/2021 17:40, Aeden Jameson wrote:
> I'm trying to get my head around the impact of setting max parallelism.
>
> * Does max parallelism primarily serve as a reservation for future
> increases to parallelism? The reservation being the ability to restore
> from checkpoints and savepoints after increases to parallelism.
>
> * Does it serve as a runtime suggestion for how many instances of an
> operator the job could spin up? Or is it just a reservation like I
> asked above?
>
> * It also appears to impact the distribution of key groups among
> subtasks from what I've read and seen from testing. Is that
> understanding correct?
>
> * What are the other important implications?
>
>
> Thank you,
> Aeden