Hi,
We have 3 Flink jobs running on a 10-node YARN cluster. The jobs were submitted in per-job mode, each with the same parallelism (10) and the same number of slots per TaskManager (2). We originally assumed that the TMs would automatically spread across the cluster, but the opposite happened: all 5 TMs of a job ended up on one single node, leaving 7 nodes (almost) idle and 3 nodes under pressure. Is there some way to have those TMs evenly distributed? Many thanks.
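For reference, a per-job submission along those lines would look roughly like the sketch below (the jar path is a placeholder). With parallelism 10 and 2 slots per TM, each job requests 5 TaskManager containers, and where those containers land is decided by the YARN scheduler rather than by Flink:

  ./bin/flink run -m yarn-cluster -p 10 -ys 2 ./my-flink-job.jar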
Hi Qi Kang,
If you mean spreading all TaskManagers evenly across the YARN cluster, unfortunately Flink itself can do nothing about it. Each per-job Flink cluster is an individual application on the YARN cluster, and the applications do not know about each other's existence. Could you share the YARN version? If it is Hadoop 3.x or above, you should set yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled=false to avoid assigning multiple containers to one NodeManager in a single heartbeat.
Best,
Yang
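The property Yang mentions belongs in capacity-scheduler.xml and is only available with the CapacityScheduler on Hadoop 3.x. A minimal snippet, assuming the CapacityScheduler is in use:

  <property>
    <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
    <value>false</value>
  </property>

After changing it, the scheduler configuration has to be reloaded, e.g. via "yarn rmadmin -refreshQueues".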
Hi Yang,
Many thanks for your detailed explanation. We are using Hadoop 2.6.5, so setting the multiple-assignments-enabled parameter is not an option. BTW, would you prefer a YARN session cluster over a per-job cluster in this situation? These YARN nodes are almost dedicated to Flink jobs, so no other services are involved.
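For comparison, a session cluster is started once and jobs are then submitted into it, roughly as sketched below (flag values are illustrative, and the exact set of yarn-session.sh options varies across Flink releases; in older releases, flink run picks up the running session through the YARN properties file written by yarn-session.sh):

  ./bin/yarn-session.sh -s 2 -d
  ./bin/flink run -p 10 ./my-flink-job.jar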
Hi Qi,
If you want better isolation between different Flink jobs and multi-tenant support, I suggest you use per-job mode. Each Flink job is then a separate YARN application, and YARN uses cgroups to limit the resources used by each application.
Best,
Yang
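As a rough sketch of the cgroups side Yang refers to, CPU isolation in YARN is typically enabled through the LinuxContainerExecutor in yarn-site.xml (additional setup such as container-executor.cfg and the cgroup mount configuration is required and not shown here):

  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>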