How can TMs distribute evenly over Flink on YARN cluster?

How can TMs distribute evenly over Flink on YARN cluster?

Qi Kang
Hi,


We have 3 Flink jobs running on a 10-node YARN cluster. The jobs were submitted in per-job mode, with the same parallelism (10) and number of slots per TM (2).
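For context, the submissions were along these lines (flags per the Flink 1.x YARN CLI; the jar name is a placeholder), and the parallelism/slot settings determine how many TM containers YARN must allocate per job:

```shell
# Hypothetical per-job submission (Flink 1.x YARN CLI; ./my-job.jar is a placeholder):
#   flink run -m yarn-cluster -p 10 -ys 2 ./my-job.jar
#
# With parallelism 10 and 2 slots per TM, each job needs:
PARALLELISM=10
SLOTS_PER_TM=2
# Ceiling division: number of TaskManager containers requested from YARN
TMS=$(( (PARALLELISM + SLOTS_PER_TM - 1) / SLOTS_PER_TM ))
echo "$TMS TaskManagers per job"   # 5
```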

We originally assumed the TMs would automatically spread across the cluster, but the opposite happened: all 5 TMs of each job landed on a single node, leaving 7 nodes (almost) idle and 3 nodes under pressure.

Is there some way to have those TMs evenly distributed? Many thanks.



Re: How can TMs distribute evenly over Flink on YARN cluster?

Yang Wang
Hi Qi Kang,

If you mean spreading all TaskManagers evenly across the YARN cluster, then unfortunately Flink can do nothing about it.
Each per-job Flink cluster is an individual application on the YARN cluster; they do not know of each other's existence.

Could you share your YARN version? If it is Hadoop 3.x or above, you should set
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled=false
to prevent the scheduler from assigning multiple containers to one NodeManager in a single heartbeat.
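Concretely, that CapacityScheduler knob would go into capacity-scheduler.xml, roughly like this (a sketch, assuming the Hadoop 3.x CapacityScheduler; adjust to your scheduler configuration):

```xml
<!-- capacity-scheduler.xml: limit the scheduler to one container
     assignment per NodeManager heartbeat, so allocations spread out -->
<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
  <value>false</value>
</property>
```

A ResourceManager restart (or a scheduler refresh via `yarn rmadmin -refreshQueues`) is typically needed for the change to take effect.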


Best,
Yang

Re: How can TMs distribute evenly over Flink on YARN cluster?

Qi Kang
Hi Yang,


Many thanks for your detailed explanation. We are using Hadoop 2.6.5, so setting the multiple-assignments-enabled parameter is not an option.

BTW, would you prefer a YARN session cluster over per-job clusters in this situation? These YARN nodes are almost dedicated to Flink jobs, so no other services are involved.


Re: How can TMs distribute evenly over Flink on YARN cluster?

Yang Wang
Hi Qi,

If you want better isolation between different Flink jobs and multi-tenant support, I suggest you stay with per-job mode. Each Flink job is a separate YARN application, and YARN uses cgroups to limit the resources used by each application.


Best,
Yang

