Flink 1.5 job distribution over cluster nodes


Flink 1.5 job distribution over cluster nodes

scarmeli
Hi,

We have 4 jobs with parallelism 3 running on 3 task managers with 4 slots each; each task manager runs on a different VM.

On Flink 1.3.2 the jobs were evenly distributed across nodes: each job took one task slot on each task manager.

After upgrading to Flink 1.5, each job runs on a single task manager (spilling over to another only when no slots are left on the first).

The jobs are not even in load, which causes some task managers to consume more resources (CPU/memory) than other task managers.
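For concreteness, a minimal sketch of a job submitted with this parallelism (the pipeline and class name are placeholders, and the 4-slots-per-task-manager setting is assumed to come from taskmanager.numberOfTaskSlots: 4 in flink-conf.yaml; the real jobs are not shown in this thread):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExampleJob {
    public static void main(String[] args) throws Exception {
        // Each of the 3 task managers is assumed to be started with
        // taskmanager.numberOfTaskSlots: 4 in flink-conf.yaml (12 slots in total).
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Each of the 4 jobs runs with parallelism 3, so every job needs 3 slots.
        env.setParallelism(3);

        // Placeholder pipeline standing in for the actual job logic.
        env.fromElements(1, 2, 3)
           .map(new MapFunction<Integer, Integer>() {
               @Override
               public Integer map(Integer value) {
                   return value * 2;
               }
           })
           .print();

        env.execute("example-job");
    }
}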

Is there a way to return to an even distribution? 

Thanks,

Shachar



Re: Flink 1.5 job distribution over cluster nodes

scarmeli
Answered in another mailing list

Hi Shachar,

With Flink 1.5 we added resource elasticity. This means that Flink is now able to allocate new containers on a cluster management framework like YARN or Mesos. Due to these changes (which also apply to the standalone mode), Flink no longer reasons about a fixed set of TaskManagers, because if needed it will start new containers (this does not work in standalone mode). Therefore, it is hard for the system to make any decisions about spreading the slots belonging to a single job out across multiple TMs. It gets even harder when you consider that some jobs, like yours, might benefit from such a strategy, whereas others would benefit from co-locating their slots. It gets even more complicated if you want to do scheduling with respect to multiple jobs, about which the system does not have full knowledge because they are submitted sequentially. Therefore, Flink currently assumes that slot requests can be fulfilled by any TaskManager.
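As an illustration, the resulting placement can be inspected by listing the registered TaskManagers and their free/total slots through the JobManager's monitoring REST API. A minimal sketch, assuming the default REST port 8081 and a JobManager reachable at localhost:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SlotDistributionCheck {
    public static void main(String[] args) throws Exception {
        // The /taskmanagers endpoint lists every registered TaskManager together
        // with its total and free slots, which shows whether one TM is carrying
        // most of a job's slots. Host and port are assumptions; adjust as needed.
        URL url = new URL("http://localhost:8081/taskmanagers");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // raw JSON response
            }
        }
    }
}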

Cheers,
Till



Re: Flink 1.5 job distribution over cluster nodes

Vishal Santoshi
I think there is something to be said for making this distribution more flexible. A standalone cluster, still the deployment mechanism for many installations, suffers badly with the above approach. A healthy cluster requires resources to be used equitably where possible. I have some pipes that are CPU intensive and some that are not so much; slots, being the primary unit of parallelism, do not differentiate much between a stress-inducing pipe and any other, so spreading them out becomes essential to avoid skew.

Maybe a simple setting could fork the different approaches to slot distribution, for example standalone = true. I have an n-node cluster with 2 nodes being heavily stressed (load average approaching the slot count) because of this new setup, and that does not seem right. It is still a physical node that these processes run on, with local drives and so on.







Re: Flink 1.5 job distribution over cluster nodes

Vishal Santoshi
For example, state size, logs, etc. are now all on one physical node (or a couple, rather than spread out) for the one pipe where we want a large state. We have decreased the number of slots (a pretty artificial setup) to give the node less stress. I wish there were an RFC from the wider community on this, IMveryHO.
