Flink jobs getting finished because of "Could not allocate the required slot within slot request timeout"

mars
Hi All,

I have an EMR cluster with one master node and 3 worker nodes (auto scaling
is enabled, and the number of worker nodes can go up to 8).

3 Spark jobs are currently running on the cluster.

I submitted 3 Flink jobs, and all of them finished with the error "Could not
allocate the required slot within slot request timeout".

In flink-conf.yaml I have:

jobmanager.heap.mb: 4096
taskmanager.heap.mb: 4096

The master node has 16 vcores and 64 GB of memory, and each worker node has 4
vcores and 16 GB of memory.

When I submit each Flink job, I pass the argument -p 2, which should set the
parallelism to 2.

The YARN UI shows the following stats:

Containers Running : 7
Memory Used        : 21.63 GB
Memory Total       : 36 GB
VCores Used        : 7
VCores Total       : 12
Active Nodes       : 3

I cannot figure out why slots cannot be allocated to the Flink jobs. First
of all, even with 3 active nodes there are still 5 vcores available.
Moreover, auto scaling is enabled for this cluster, so EMR should grow it up
to 8 nodes, i.e. 5 more nodes should be allocated if required.
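
A rough way to sanity-check these numbers is to compare what the three jobs would request with what YARN reports as free. The sketch below is only an estimate under assumptions the post does not state: each job runs as its own YARN application with one JobManager container, each TaskManager offers a single slot (so -p 2 needs two TaskManagers), and the configured 4096 MB is treated as the container size, ignoring YARN rounding and any memory overhead.

```python
import math

# Figures copied from the post.
JM_MB = 4096            # jobmanager.heap.mb
TM_MB = 4096            # taskmanager.heap.mb
PARALLELISM = 2         # -p 2
JOBS = 3
MEM_TOTAL_GB, MEM_USED_GB = 36.0, 21.63
VCORES_TOTAL, VCORES_USED = 12, 7

# Assumption, not stated in the post: one slot per TaskManager.
SLOTS_PER_TM = 1


def containers_per_job(parallelism, slots_per_tm):
    """One JobManager container plus enough TaskManagers to cover the slots."""
    return 1 + math.ceil(parallelism / slots_per_tm)


def demand_gb(jobs, parallelism, slots_per_tm, jm_mb, tm_mb):
    """Total memory requested by all jobs, ignoring YARN rounding and overhead."""
    task_managers = math.ceil(parallelism / slots_per_tm)
    return jobs * (jm_mb + task_managers * tm_mb) / 1024


free_gb = round(MEM_TOTAL_GB - MEM_USED_GB, 2)
free_vcores = VCORES_TOTAL - VCORES_USED
needed_gb = demand_gb(JOBS, PARALLELISM, SLOTS_PER_TM, JM_MB, TM_MB)

print(f"free memory: {free_gb} GB, free vcores: {free_vcores}")
print(f"containers per job: {containers_per_job(PARALLELISM, SLOTS_PER_TM)}")
print(f"memory needed by {JOBS} jobs: {needed_gb} GB")
print("fits" if needed_gb <= free_gb else "does not fit")
```

Under these assumptions the three jobs would ask for about 36 GB against roughly 14.37 GB free, so memory rather than vcores would be the limiting resource on the 3 active nodes.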

I would appreciate any insights.

Also, I cannot find the TaskManager logs on any of the nodes.

Thanks
Sateesh





--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink jobs getting finished because of "Could not allocate the required slot within slot request timeout"

Zhu Zhu
Hi Sateesh,

Could you check the Flink JobManager (JM) log to see whether it has sent container requests to the YARN ResourceManager (RM)?
If the requests were sent but not fulfilled, you will need to check the YARN RM logs, or the YARN cluster
resources at that time, to see whether the container requests could be fulfilled.
The resources requested for each container can be found in the Flink JM log.
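
One way to check the cluster resources side is to ask the ResourceManager REST API which nodes currently have room for a container of the requested size. The sketch below is a minimal illustration, not a definitive tool: the RM address, the sample node values, and the helper names are hypothetical, and the JSON field names (`availMemoryMB`, `availableVirtualCores`, `nodeHostName`) are written as I understand the Hadoop `/ws/v1/cluster/nodes` response, so verify them against your Hadoop version.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_url):
    """Fetch per-node resource info from the ResourceManager REST API.

    rm_url is an assumption, e.g. "http://<master-host>:8088".
    """
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes", timeout=10) as resp:
        return json.load(resp)["nodes"]["node"]


def nodes_that_fit(nodes, container_mb, container_vcores=1):
    """Return host names of nodes with enough free memory and vcores."""
    return [
        n["nodeHostName"]
        for n in nodes
        if n["availMemoryMB"] >= container_mb
        and n["availableVirtualCores"] >= container_vcores
    ]


# Made-up sample data in the shape of the REST response, for illustration only.
sample_nodes = [
    {"nodeHostName": "worker-1", "availMemoryMB": 3072, "availableVirtualCores": 1},
    {"nodeHostName": "worker-2", "availMemoryMB": 5120, "availableVirtualCores": 2},
    {"nodeHostName": "worker-3", "availMemoryMB": 6144, "availableVirtualCores": 2},
]

# Container size taken from the JM log; 4096 MB matches the posted config.
print(nodes_that_fit(sample_nodes, container_mb=4096))
```

Checking per node matters because a container must fit on a single node: free memory summed across the cluster can exceed the request while no individual node has enough left.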

Thanks,
Zhu Zhu

mars <[hidden email]> wrote on Wed, Jul 29, 2020 at 10:52 PM: