Limit max cpu usage per TaskManager

Lu Niu
Hi,

When running a Flink application in YARN mode, is there a way to limit the maximum CPU usage per TaskManager?

I tried this with an application that has just a source and a sink operator. The parallelism of the source is 60 and the parallelism of the sink is 1. With the default config, 60 TaskManagers are assigned. I noticed that one TaskManager process's CPU usage can reach 200% while the rest stay below 50%.

When I set -yn = 2 (default is 1), the number of TaskManagers dropped to 30, and one TaskManager process's CPU usage could reach 600% while the rest stayed below 50%.

I tried setting yarn.containers.vcores = 2, but then all tasks stay in the start state forever and the application never reaches the running state.

Best
Lu

Re: Limit max cpu usage per TaskManager

vino yang
Hi Lu,

Flink on YARN relies on YARN's resource management capabilities; Flink itself cannot currently limit CPU usage.

Also, which version of Flink are you using? As far as I know, the -yn parameter no longer works since Flink 1.8.

Best,
Vino


Re: Limit max cpu usage per TaskManager

Victor Wong
Hi Lu,

You can check which operator thread causes the high CPU usage and assign it a unique slot sharing group name [1], which prevents too many operator threads from running in the same TM.

Hope this helps 😊

[1]. https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/#task-chaining-and-resource-groups
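
As a rough illustration of what a separate slot sharing group changes, the sketch below counts the slots a job needs. It relies on the rule that operators in the same slot sharing group can share a slot, so each group needs as many slots as its highest operator parallelism. The function name `required_slots` and the group name `"sink-group"` are made up for this example; the parallelism values (60 and 1) come from the original question.

```python
from collections import defaultdict


def required_slots(operators):
    """Return the number of task slots a job needs.

    `operators` is a list of (name, parallelism, slot_sharing_group) tuples.
    Subtasks of different operators in the same group may share a slot, so a
    group needs as many slots as its highest operator parallelism.
    """
    max_parallelism = defaultdict(int)
    for _name, parallelism, group in operators:
        max_parallelism[group] = max(max_parallelism[group], parallelism)
    return sum(max_parallelism.values())


# Everything in the default group: source and sink subtasks can share slots.
shared = [("source", 60, "default"), ("sink", 1, "default")]

# Sink moved to its own (hypothetical) group: it gets a dedicated slot.
isolated = [("source", 60, "default"), ("sink", 1, "sink-group")]

print(required_slots(shared))    # 60
print(required_slots(isolated))  # 61
```

With one slot per TaskManager, the isolated layout would put the sink in a TaskManager of its own. Whether the sink is actually the CPU-heavy operator is an assumption here; the thread does not say which operator is responsible.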

 

Best,
Victor

 


Re: Limit max cpu usage per TaskManager

Yang Wang
If you want to limit the TaskManager container's CPU usage, that depends on your YARN cluster configuration.
By default, YARN only uses CPU shares. You need to set `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true`
in the yarn-site.xml of all YARN NodeManagers.
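
One way to verify the setting on a node is to parse its yarn-site.xml and look up the property. The sketch below is only an illustration: the helper name `get_property` and the inline sample XML are invented here, and a real check would read the file from the NodeManager's Hadoop configuration directory instead.

```python
import xml.etree.ElementTree as ET

STRICT_CPU = "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"

# Sample content for illustration only; a real yarn-site.xml has many more properties.
SAMPLE_YARN_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
    <value>true</value>
  </property>
</configuration>
"""


def get_property(xml_text, name):
    """Return the value of the Hadoop property `name`, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


strict_enabled = get_property(SAMPLE_YARN_SITE, STRICT_CPU) == "true"
print(strict_enabled)
```

Running the same check against every NodeManager's file is one way to confirm the setting is applied on all nodes, as required above.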


Best,
Yang


Re: Limit max cpu usage per TaskManager

Lu Niu
Hi, 

Thanks for replying! Basically, I want to limit CPU usage so that different applications don't affect each other. What is the current best practice? It looks like `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true` is one way. How do I set how many CPU resources to use? Is it `yarn.containers.vcores`?

It should be -ys, not -yn, in the original post; sorry for the typo.

Best
Lu


Re: Limit max cpu usage per TaskManager

Yang Wang
Hi Lu Niu,

Yes, you can use `yarn.containers.vcores` to set the vcores of a TaskManager. However, that does not
guarantee that applications will not affect each other. By default, a YARN cluster uses cgroup CPU
shares, which means a TaskManager can use more CPU than it was allocated. When the machine is under
heavy load, the Linux kernel uses the CPU shares as weights to divide CPU time among processes.

If you want a TaskManager to use only what it was allocated,
`yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true` is the only way. The YARN
NodeManager will then set a CPU quota for each TaskManager.
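
The difference between shares and a quota can be shown with a small model. This is a deliberately simplified sketch, not YARN's or the kernel's actual algorithm: it assumes one vcore corresponds to one physical core, treats each container's CPU demand as a fixed number of cores, and uses made-up container names. With shares, idle capacity goes to whoever wants it and the weights only matter under contention; with a strict quota, each container is capped at its vcores.

```python
def share_based(node_cores, vcores, demand):
    """Work-conserving model: split CPU by vcore weight, never above demand."""
    alloc = {name: 0.0 for name in vcores}
    active = {name for name in vcores if demand[name] > 0}
    remaining = float(node_cores)
    while active and remaining > 0:
        total_weight = sum(vcores[name] for name in active)
        # Containers whose remaining demand fits inside their weighted share.
        satisfied = {
            name for name in active
            if demand[name] - alloc[name] <= remaining * vcores[name] / total_weight
        }
        if not satisfied:
            # Everyone wants more than their share: the weights decide the split.
            for name in active:
                alloc[name] += remaining * vcores[name] / total_weight
            break
        for name in satisfied:
            remaining -= demand[name] - alloc[name]
            alloc[name] = demand[name]
        active -= satisfied
    return alloc


def strict_quota(vcores, demand):
    """Quota model: a container never gets more than its allocated vcores."""
    return {name: min(demand[name], vcores[name]) for name in vcores}


# Hypothetical node with 8 cores and two TaskManagers of 1 vcore each.
vcores = {"tm-hot": 1, "tm-idle": 1}
demand = {"tm-hot": 6, "tm-idle": 0.5}

print(share_based(8, vcores, demand))
print(strict_quota(vcores, demand))
```

In this model the busy TaskManager gets 6 cores under shares, because the node is otherwise idle, but only 1 core under the strict quota.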




Best,
Yang


Re: Limit max cpu usage per TaskManager

Rong Rong
Hi Lu,

Yang is right. Enabling cgroup isolation is probably what you are looking for to control how Flink utilizes CPU resources.
Another idea is to enable DominantResourceCalculator [1], which I think you have probably already done.

I found an interesting read [2] about this if you would like to dig deeper.

Thanks,
Rong


