If your requirement is that O_i will be executed in the same slot as P_i, then you have to add the corresponding
JobVertices to a CoLocationGroup. At the moment this is not really exposed, but you could try to get the JobGraph from StreamGraph.getJobGraph and then use JobGraph.getVertices to get the JobVertices. Then you have to find out which JobVertices accommodate your operators. Once this is done, you can colocate them via the JobVertex.setStrictlyCoLocatedWith method. This might solve your problem, but I haven't tested it myself.
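To make the "find out which JobVertices accommodate your operators" step a bit more concrete, here is a rough sketch of the lookup. It is written against a generic vertex type so that it compiles without Flink on the classpath. The class name VertexLookup and the idea of matching on a substring of the vertex name are my own assumptions, not something Flink provides, and the Flink wiring in the trailing comment is untested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper (not part of Flink): picks the one vertex whose name mentions an operator. */
public final class VertexLookup {
    private VertexLookup() {}

    /**
     * Returns the single element of 'vertices' whose name contains 'operatorName'.
     * Chained operators end up in one vertex whose name lists all of them,
     * which is why this uses a substring match rather than equality.
     * Throws IllegalStateException if zero or more than one vertex matches.
     */
    public static <V> V findByOperatorName(
            Iterable<V> vertices, Function<V, String> nameOf, String operatorName) {
        List<V> matches = new ArrayList<>();
        for (V vertex : vertices) {
            String name = nameOf.apply(vertex);
            if (name != null && name.contains(operatorName)) {
                matches.add(vertex);
            }
        }
        if (matches.size() != 1) {
            throw new IllegalStateException(
                "expected exactly one vertex containing '" + operatorName
                    + "', found " + matches.size());
        }
        return matches.get(0);
    }
}

// Assumed wiring with Flink (untested, follows the steps described above;
// "O" and "P" stand for whatever names you gave the two operators):
//   JobGraph jobGraph = env.getStreamGraph().getJobGraph();
//   JobVertex o = VertexLookup.findByOperatorName(jobGraph.getVertices(), JobVertex::getName, "O");
//   JobVertex p = VertexLookup.findByOperatorName(jobGraph.getVertices(), JobVertex::getName, "P");
//   if (o != p) { o.setStrictlyCoLocatedWith(p); }  // same vertex means they are already chained
```

Giving the operators explicit, distinctive names makes the substring match much less fragile. You would also have to submit the modified JobGraph yourself rather than calling env.execute(), which the sketch does not cover.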
Hoping someone with actual knowledge of the task-to-slot allocation logic can chime in here with a solution :)
— Ken
On Apr 18, 2018, at 9:10 AM, PedroMrChaves <[hidden email]> wrote:
Hello,
I have a job that has one async operational node (i.e. one that implements
AsyncFunction). This operational node spawns multiple threads that
perform heavy, CPU-bound tasks.
I have a Flink standalone cluster deployed on two machines, each with 32 cores and
128 GB of RAM; each machine runs one TaskManager and one JobManager. When I
deploy the job, all of the subtasks of the async operational node end up
on the same machine, which gives it a much higher CPU load than the
other.
I've researched ways to overcome this issue, but I haven't found a solution
to my problem.
Ideally, the subtasks would be evenly split across both machines.
Can this problem be solved somehow?
Regards,
Pedro Chaves.