** Help needed w.r.t. parallelism settings in Flink **


Akshay Iyangar

Hi,

We are running a Beam pipeline that uses Flink as its execution engine. We are currently on Flink 1.8.

Per the Flink documentation, there are two options that can be set: parallelism and maxParallelism.

We want to set both so that we can dynamically scale the pipeline if there is back pressure on it.

I want to clarify a few doubts:

  1. Does maxParallelism only work for stateful operators, or does it work for all operators?

  2. Is the setting either/or? That is, can we set only parallelism or only maxParallelism, or can we set both?

  3. Currently I have parallelism at 8 and maxParallelism at 32, but the UI only shows that it is using a parallelism of 8. It would be great if someone could help me understand the exact behavior of these parameters.

Thanks,

Akshay Iyangar

 


Re: ** Help needed w.r.t. parallelism settings in Flink **

Zhu Zhu
Hi Akshay,

To answer your questions:
1. One main purpose of maxParallelism is to determine the number of key groups. A key group is the bucket that keys are assigned to during keyBy partitioning.
So a larger maxParallelism gives a finer granularity for key distribution, regardless of whether the operator is stateful.

2. You can set only the parallelism, and Flink will automatically decide the maxParallelism for it.
However, it is recommended to set maxParallelism explicitly to a fixed, suitable value.

3. The parallelism is the parallelism actually in use, which is why the UI shows 8.
maxParallelism is the upper bound on the parallelism when you later change it by manually rescaling the job.
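
To make the key group idea in points 1 and 3 concrete, here is a minimal sketch in plain Java with no Flink dependency. The class and method names are hypothetical. The hash in `keyGroupFor` is a simplified stand-in (Flink applies a murmur hash to the key's `hashCode()`), while the key-group-to-subtask formula follows, as far as I know, what Flink uses internally.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Flink or Beam.
class KeyGroupSketch {

    // Simplified stand-in for Flink's hashing: maps a key to one of
    // maxParallelism key groups. Flink itself uses a murmur hash here.
    public static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Maps a key group to the index of the parallel subtask that owns it.
    // Parallelism above maxParallelism is rejected, which is the "upper bound".
    public static int subtaskFor(int keyGroup, int parallelism, int maxParallelism) {
        if (parallelism < 1 || parallelism > maxParallelism) {
            throw new IllegalArgumentException(
                "parallelism must be between 1 and maxParallelism");
        }
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts how many key groups each subtask owns.
    public static int[] keyGroupsPerSubtask(int parallelism, int maxParallelism) {
        int[] counts = new int[parallelism];
        for (int kg = 0; kg < maxParallelism; kg++) {
            counts[subtaskFor(kg, parallelism, maxParallelism)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // The setup from the question: parallelism 8, maxParallelism 32.
        System.out.println(Arrays.toString(keyGroupsPerSubtask(8, 32)));
        // prints [4, 4, 4, 4, 4, 4, 4, 4]

        // After a manual rescale to 16, each subtask owns 2 key groups.
        System.out.println(Arrays.toString(keyGroupsPerSubtask(16, 32)));
    }
}
```

With parallelism 8 and maxParallelism 32, the job runs 8 subtasks that each own 4 of the 32 key groups. Rescaling only redistributes those 32 groups, so 32 is the most subtasks that can each receive a key group.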

Thanks,
Zhu Zhu
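
On point 2 above: when maxParallelism is not set, my understanding of the Flink documentation is that the default is derived from the operator parallelism, roughly as sketched below. The class and method names are hypothetical, and the bounds (128 and 32768) are what I believe Flink uses; please verify them against your Flink version.

```java
// Hypothetical illustration class; not part of Flink or Beam.
class DefaultMaxParallelismSketch {

    // Assumed bounds for the derived default.
    public static final int LOWER_BOUND = 1 << 7;   // 128
    public static final int UPPER_BOUND = 1 << 15;  // 32768

    // Smallest power of two that is >= x, for x >= 1.
    public static int roundUpToPowerOfTwo(int x) {
        int highest = Integer.highestOneBit(x);
        return highest == x ? x : highest << 1;
    }

    // Derived default: round 1.5 * parallelism up to a power of two,
    // then clamp the result into [LOWER_BOUND, UPPER_BOUND].
    public static int defaultMaxParallelism(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        int candidate = roundUpToPowerOfTwo(parallelism + parallelism / 2);
        return Math.min(Math.max(candidate, LOWER_BOUND), UPPER_BOUND);
    }

    public static void main(String[] args) {
        System.out.println(defaultMaxParallelism(8));    // prints 128
        System.out.println(defaultMaxParallelism(200));  // prints 512
    }
}
```

Under these assumptions, a job with parallelism 8 and no explicit setting would get a maxParallelism of 128 rather than 32. Because the derived value depends on the parallelism at submission time, setting it explicitly keeps it stable and predictable.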

(In reply to the message from Akshay Iyangar <[hidden email]>, sent Friday, September 27, 2019, at 4:20 AM.)
