Did anything change in 1.1.4 with regard to YARN & vcores?
I'm getting this error when deploying 1.1.4 to my test cluster. Only the Flink version changed.
java.lang.RuntimeException: Couldn't deploy Yarn cluster
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:384)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:591)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:465)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of virtual cores per node were configured with 8 but Yarn only has 4 virtual cores available. Please note that the number of virtual cores is set to the number of task slots by default unless configured in the Flink config with 'yarn.containers.vcores.'
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.isReadyForDeployment(AbstractYarnClusterDescriptor.java:273)
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:393)
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:381)
    ... 2 more
When I run ./bin/yarn-session.sh -q, it shows 8 vCores on each machine:
NodeManagers in the ClusterClient 3
|Property     |Value
+---------------------------------------+
|NodeID       |ip-10-2-…:8041
|Memory       |12288 MB
|vCores       |8
|HealthReport |
|Containers   |0
+---------------------------------------+
|NodeID       |ip-10-2-…:8041
|Memory       |12288 MB
|vCores       |8
|HealthReport |
|Containers   |0
+---------------------------------------+
|NodeID       |ip-10-2-…:8041
|Memory       |12288 MB
|vCores       |8
|HealthReport |
|Containers   |0
+---------------------------------------+
Summary: totalMemory 36864 totalCores 24
Queue: default, Current Capacity: 0.0 Max Capacity: 1.0 Applications: 0
I'm running: ./bin/yarn-session.sh -n 3 --jobManagerMemory 1504 --taskManagerMemory 10764 --slots 8 --detached
I have not specified any value for "yarn.containers.vcores" in my config.
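For reference, the key named in the error message would go in conf/flink-conf.yaml along these lines. This is only a sketch: the value 4 is an illustrative assumption taken from the limit the error reports, not a setting that exists in this cluster's config.

```yaml
# Hypothetical entry in conf/flink-conf.yaml (not present in the config described above).
# Overrides the default of "vcores per container = number of task slots".
# The value 4 is illustrative only; it matches the limit quoted in the error message.
yarn.containers.vcores: 4
```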
I switched to -n 5 and --slots 4, and halved the taskManagerMemory, which allowed the cluster to start.
However, in the YARN "Nodes" UI I see "VCores Used: 2" and "VCores Avail: 6" on all three nodes. And if I look at one of the Containers, it says, "Resource: 5408 Memory, 1 VCores". I don't understand what's happening here.
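As a rough sanity check, the UI numbers are at least consistent with each other. The sketch below assumes (none of this is confirmed in the post) that the session runs one JobManager container next to the 5 TaskManagers, that YARN counts 1 vcore per container as the container page shows, and that the containers are spread evenly across the 3 nodes. It does not explain why each container is reported with only 1 vcore.

```python
# Consistency check of the YARN UI numbers. Inputs come from the post,
# except the two assumptions flagged below.
task_managers = 5          # from "-n 5"
job_managers = 1           # assumption: one JobManager container
nodes = 3                  # NodeManagers listed by "yarn-session.sh -q"
vcores_per_node = 8        # reported by "yarn-session.sh -q"
vcores_per_container = 1   # as shown on the container page ("1 VCores")

containers = task_managers + job_managers

# assumption: containers are spread evenly over the nodes
containers_per_node = containers // nodes

vcores_used = containers_per_node * vcores_per_container
vcores_avail = vcores_per_node - vcores_used

print(f"containers={containers}, used per node={vcores_used}, "
      f"available per node={vcores_avail}")
```

Under those assumptions this gives 2 vcores used and 6 available per node, which matches the "Nodes" page.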
Thanks…