Task managers run on separate nodes in a cluster

Martin Eden
Hi all,

We're using Flink 1.3.2 with DCOS / Mesos.

We have a 3-node cluster and are running the Flink DCOS package (Flink Mesos framework) configured with 3 Task Managers.

Our goal is to run each of them on a separate host for better load balancing, but the task managers seem to end up running on the same host.

I looked through the docs and the DCOS Flink package but could not find a placement policy or anything of the sort.

Is there anything like that?

We are also planning to upgrade to the latest Flink version. Is something like that supported in this newer version?

Thanks,
M

Re: Task managers run on separate nodes in a cluster

vino yang
Hi Martin,

Till has done most of the work on Flink on Mesos. I'm pinging him for you.

Thanks, vino.

On Wed, Sep 12, 2018 at 11:21 PM Martin Eden <[hidden email]> wrote:

Re: Task managers run on separate nodes in a cluster

Martin Eden
Thanks Vino!

On Fri, Sep 14, 2018 at 3:37 AM vino yang <[hidden email]> wrote:

Re: Task managers run on separate nodes in a cluster

Till Rohrmann
Hi Martin,

Flink supports the mesos.constraints.hard.hostattribute option, which specifies task constraints based on agent attributes [1]. I think you could use it to control the task placement.
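
To make the suggestion concrete, here is a minimal Java sketch of how a hard host-attribute constraint narrows down the agents a task may run on. It is a toy model, not Flink or Fenzo code. It assumes the option value is a comma-separated list of key:value pairs, and the agent names and the "rack" attribute are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a hard host-attribute constraint. Not Flink or Fenzo code. */
final class HardConstraintSketch {

    /** Parses "key:value,key:value" into a map (assumed format of the option value). */
    static Map<String, String> parseConstraints(String spec) {
        Map<String, String> constraints = new LinkedHashMap<>();
        for (String pair : spec.split(",")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate empty segments such as "a:b,,c:d"
            }
            int sep = trimmed.indexOf(':');
            if (sep <= 0 || sep == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected key:value but got: " + trimmed);
            }
            constraints.put(trimmed.substring(0, sep), trimmed.substring(sep + 1));
        }
        return constraints;
    }

    /** Returns the agents whose attributes contain every required key/value pair. */
    static List<String> eligibleAgents(Map<String, Map<String, String>> agents,
                                       Map<String, String> constraints) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> agent : agents.entrySet()) {
            if (agent.getValue().entrySet().containsAll(constraints.entrySet())) {
                eligible.add(agent.getKey());
            }
        }
        return eligible;
    }

    /** Hypothetical 3-agent cluster: agent-N carries the attribute rack=rN. */
    static Map<String, Map<String, String>> sampleAgents() {
        Map<String, Map<String, String>> agents = new LinkedHashMap<>();
        for (int i = 1; i <= 3; i++) {
            Map<String, String> attributes = new LinkedHashMap<>();
            attributes.put("rack", "r" + i);
            agents.put("agent-" + i, attributes);
        }
        return agents;
    }

    public static void main(String[] args) {
        Map<String, String> constraints = parseConstraints("rack:r1");
        System.out.println(eligibleAgents(sampleAgents(), constraints)); // prints [agent-1]
    }
}
```

In this model a constraint such as rack:r1 leaves exactly one eligible agent, so it pins tasks to matching hosts rather than spreading them.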


Cheers,
Till

On Fri, Sep 14, 2018 at 3:08 PM Martin Eden <[hidden email]> wrote:

Re: Task managers run on separate nodes in a cluster

Martin Eden
Hi Till,

I was able to use mesos.constraints.hard.hostattribute to run all task managers on a particular host in my cluster.

However, after looking a bit at the code, I'm not sure we can use mesos.constraints.hard.hostattribute to balance Flink task managers evenly across the hosts of a Mesos cluster.

This is because under the hood it uses the Fenzo host attribute value constraint, while we would need the Fenzo balanced host attribute constraint.

The LaunchableMesosWorker sets the constraints via the com.netflix.fenzo.TaskRequest, and a host must satisfy all of these hard constraints before the task scheduler assigns the task to it. The current implementation always returns the static constraint value that was configured, i.e. whatever comes after the ":" (see org.apache.flink.mesos.runtime.clusterframework.MesosTaskManagerParameters#addHostAttrValueConstraint). So I don't see how we can use it to load balance unless the constraint value were dynamic, based on some property of the Mesos task request.
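
The difference between the two kinds of constraint can be sketched with a small Java model. This is only an illustration under simplifying assumptions: it does not call Fenzo, the host names are hypothetical, and the "balanced" rule here (a task only accepts a host that currently holds the fewest tasks) is a simplification of what a balanced host attribute constraint would do.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison of a static value constraint and a balanced constraint. Not Fenzo code. */
final class BalancedPlacementSketch {

    private static Map<String, Integer> emptyCounts(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("At least one host is required");
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String host : hosts) {
            counts.put(host, 0);
        }
        return counts;
    }

    /** Static rule: every task accepts only the host equal to the configured value. */
    static Map<String, Integer> placeWithStaticValue(List<String> hosts, String requiredHost, int tasks) {
        Map<String, Integer> counts = emptyCounts(hosts);
        for (int i = 0; i < tasks; i++) {
            for (String host : hosts) {
                if (host.equals(requiredHost)) {
                    counts.merge(host, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    /** Balanced rule: a task accepts only a host that currently holds the fewest tasks. */
    static Map<String, Integer> placeBalanced(List<String> hosts, int tasks) {
        Map<String, Integer> counts = emptyCounts(hosts);
        for (int i = 0; i < tasks; i++) {
            int fewest = Collections.min(counts.values());
            for (String host : hosts) {
                if (counts.get(host) == fewest) {
                    counts.merge(host, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("host-a", "host-b", "host-c");
        System.out.println(placeWithStaticValue(hosts, "host-a", 3)); // prints {host-a=3, host-b=0, host-c=0}
        System.out.println(placeBalanced(hosts, 3));                  // prints {host-a=1, host-b=1, host-c=1}
    }
}
```

With a static value the decision never depends on where earlier tasks landed, which is why all three task managers stack up on one host. The balanced rule needs the current placement as input.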

Am I correct in my assumptions?

Is there any other way of load balancing?
Maybe by not using the DCOS Flink package (Mesos Flink framework) at all?
Are there any plans to add support for the Fenzo balanced host attribute constraint?

Thanks,




On Fri, Sep 14, 2018 at 5:46 PM Till Rohrmann <[hidden email]> wrote:

Re: Task managers run on separate nodes in a cluster

Renjie Liu
Hi, Martin:
I think a better solution would be to set the number of cores of each container equal to that of a physical server, provided this Mesos cluster is dedicated to your Flink cluster.
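
The arithmetic behind this can be sketched in Java. It is a toy first-fit model and not how Mesos or Fenzo actually hands out offers; the host count and the 4 cores per host are assumed numbers for illustration.

```java
import java.util.Arrays;

/** Toy first-fit model showing why full-host containers end up one per host. */
final class FullHostSizingSketch {

    /**
     * Places each task on the first host with enough free cores.
     * Returns the number of tasks per host; tasks that fit nowhere stay unplaced.
     */
    static int[] firstFit(int hostCount, int coresPerHost, int taskCount, int coresPerTask) {
        int[] freeCores = new int[hostCount];
        Arrays.fill(freeCores, coresPerHost);
        int[] tasksPerHost = new int[hostCount];
        for (int task = 0; task < taskCount; task++) {
            for (int host = 0; host < hostCount; host++) {
                if (freeCores[host] >= coresPerTask) {
                    freeCores[host] -= coresPerTask;
                    tasksPerHost[host]++;
                    break;
                }
            }
        }
        return tasksPerHost;
    }

    public static void main(String[] args) {
        // Small containers: all three fit on the first host.
        System.out.println(Arrays.toString(firstFit(3, 4, 3, 1))); // prints [3, 0, 0]
        // Containers sized to a whole host: at most one fits per host.
        System.out.println(Arrays.toString(firstFit(3, 4, 3, 4))); // prints [1, 1, 1]
    }
}
```

The trade-off is that a container sized to a whole host leaves no cores for anything else on that machine.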

On Mon, Sep 17, 2018 at 5:28 AM Martin Eden <[hidden email]> wrote:
--
Liu, Renjie
Software Engineer, MVAD

Re: Task managers run on separate nodes in a cluster

Till Rohrmann
Hi Martin,

I'm not aware of the community actively working on enabling the balanced host attribute constraint. If you want to give it a try, I'm happy to review your contribution.

Cheers,
Till 

On Mon, Sep 17, 2018 at 5:28 AM Renjie Liu <[hidden email]> wrote:

Re: Task managers run on separate nodes in a cluster

Martin Eden
Thanks for the feedback Liu and Till.
@Liu Yeah, that would work, but unfortunately we run other services on the cluster, so it's not really an option.
@Till Will have a look and see how much time I can dedicate to this.
M

On Mon, Sep 17, 2018 at 7:21 AM Till Rohrmann <[hidden email]> wrote: