1.1.4 on YARN - vcores change?

Posted by Shannon Carey on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/1-1-4-on-YARN-vcores-change-tp11016.html

Did anything change in 1.1.4 with regard to YARN & vcores?

I'm getting this error when deploying 1.1.4 to my test cluster. Only the Flink version changed.
java.lang.RuntimeException: Couldn't deploy Yarn cluster
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:384)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:591)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:465)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of virtual cores per node were configured with 8 but Yarn only has 4 virtual cores available. Please note that the number of virtual cores is set to the number of task slots by default unless configured in the Flink config with 'yarn.containers.vcores.'
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.isReadyForDeployment(AbstractYarnClusterDescriptor.java:273)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:393)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:381)
	... 2 more

When I run: ./bin/yarn-session.sh -q
It shows 8 vCores on each machine:

NodeManagers in the ClusterClient 3|Property         |Value
+---------------------------------------+
|NodeID           |ip-10-2-…:8041
|Memory           |12288 MB
|vCores           |8
|HealthReport     |
|Containers       |0
+---------------------------------------+
|NodeID           |ip-10-2-…:8041
|Memory           |12288 MB
|vCores           |8
|HealthReport     |
|Containers       |0
+---------------------------------------+
|NodeID           |ip-10-2-…:8041
|Memory           |12288 MB
|vCores           |8
|HealthReport     |
|Containers       |0
+---------------------------------------+
Summary: totalMemory 36864 totalCores 24
Queue: default, Current Capacity: 0.0 Max Capacity: 1.0 Applications: 0


I'm running:
./bin/yarn-session.sh -n 3 --jobManagerMemory 1504 --taskManagerMemory 10764 --slots 8 --detached

I have not specified any value for "yarn.containers.vcores" in my config.

I switched to -n 5 and --slots 4, and halved the taskManagerMemory, which allowed the cluster to start.
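
To make the rule stated in the exception concrete, here is a minimal sketch of the check as the error message describes it: the vcores requested per TaskManager default to the slot count unless yarn.containers.vcores is set, and deployment is refused when that exceeds what YARN reports. The function names are hypothetical and this is not Flink's actual code; the value 4 is the figure from the exception, not from the -q output.

```python
def requested_vcores(slots, configured_vcores=None):
    """vcores requested per TaskManager container.

    Per the error message: defaults to the number of task slots
    unless 'yarn.containers.vcores' is configured explicitly.
    """
    return configured_vcores if configured_vcores is not None else slots


def can_deploy(slots, yarn_vcores_per_node, configured_vcores=None):
    """True if the request fits within the vcores YARN reports per node."""
    return requested_vcores(slots, configured_vcores) <= yarn_vcores_per_node


YARN_VCORES = 4  # "Yarn only has 4 virtual cores available" (from the exception)

print(can_deploy(8, YARN_VCORES))                       # False: --slots 8 fails
print(can_deploy(4, YARN_VCORES))                       # True:  --slots 4 starts
print(can_deploy(8, YARN_VCORES, configured_vcores=4))  # True:  explicit vcores setting
```

Under this reading, --slots 8 fails and --slots 4 passes, which matches what I observed. It does not explain why the limit is reported as 4 when the -q output shows 8 vCores per node.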

However, in the YARN "Nodes" UI I see "VCores Used: 2" and "VCores Avail: 6" on all three nodes. And if I look at one of the Containers, it says, "Resource: 5408 Memory, 1 VCores". I don't understand what's happening here.
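
As a sanity check on those UI numbers, here is a small arithmetic sketch. It assumes the session runs one JobManager container alongside the five TaskManagers and that YARN spreads the six containers evenly across the three nodes; both are assumptions, not something confirmed from the logs.

```python
NODES = 3
VCORES_PER_NODE = 8        # from the yarn-session.sh -q output
TASK_MANAGERS = 5          # -n 5
JOB_MANAGERS = 1           # assumption: one JobManager container per session
VCORES_PER_CONTAINER = 1   # "Resource: 5408 Memory, 1 VCores" in the container view

containers = TASK_MANAGERS + JOB_MANAGERS
containers_per_node = containers // NODES  # assumption: even spread
vcores_used = containers_per_node * VCORES_PER_CONTAINER
vcores_avail = VCORES_PER_NODE - vcores_used

print(vcores_used, vcores_avail)  # 2 6
```

So the "VCores Used: 2" and "VCores Avail: 6" figures are consistent with two containers per node at 1 vcore each. What this does not explain is why each container gets 1 vcore rather than the 4 implied by --slots 4.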

Thanks…