If your requirement is that O_i will be executed in the same slot as P_i, then you have to add the corresponding
JobVertices
to aCoLocationGroup
. At the moment this is not really exposed but you could try to get theJobGraph
from theStreamGraph.getJobGraph
and then useJobGraph.getVertices
to get theJobVertices
. Then you have to find out whichJobVertices
accommodate your operators. Once this is done, you can colocate them via theJobVertex.setStrictlyCoLocatedWith
method. This might solve your problem, but I haven’t tested it myself.
Hoping someone with actual knowledge of the task to slot allocation logic can chime in here with a solution :)
— Ken
On Apr 18, 2018, at 9:10 AM, PedroMrChaves <[hidden email]> wrote:Hello,
I have a job that has one async operational node (i.e. implements
AsyncFunction). This Operational node will spawn multiple threads that
perform heavy tasks (cpu bound).
I have a Flink Standalone cluster deployed on two machines of 32 cores and
128 gb of RAM, each machine has one task manager and one Job Manager. When I
deploy the job, all of the subtasks from the async operational node end up
on the same machine, which causes it to have a much higher cpu load then the
other.
I've researched ways to overcome this issue, but I haven't found a solution
to my problem.
Ideally, the subtasks would be evenly split across both machines.
Can this problem be solved somehow?
Regards,
Pedro Chaves.
-----
Best Regards,
Pedro Chaves
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Free forum by Nabble | Edit this page |