NoResourceAvailableException and JobNotFound Errors


Prasanna kumar
Hi,

I am running Flink locally on my machine with the following configuration.

# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123

# The heap size for the JobManager JVM
jobmanager.heap.size: 1024m

# The heap size for the TaskManager JVM
taskmanager.heap.size: 1024m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 8

# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1


When I run my program, I end up getting:

Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate all requires slots within timeout of 300000 ms. Slots required: 1, slots allocated: 0
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.lambda$scheduleEager$3(ExecutionGraph.java:991)
	at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
	at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)


JobManager logs:
2020-06-02 23:25:09,992 ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler   - Exception occurred in REST handler.
org.apache.flink.runtime.rest.NotFoundException: Job be3d6b9751b6e9c509b9bedeb581a72e not found
Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (be3d6b9751b6e9c509b9bedeb581a72e)
	at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGatewayFuture(Dispatcher.java:766)
	at org.apache.flink.runtime.dispatcher.Dispatcher.requestJob(Dispatcher.java:485)
Finally, it shuts down:
2020-06-02 23:30:05,427 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job c23a172cda6cc659296af6452ff57f45.
2020-06-02 23:30:05,427 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
2020-06-02 23:30:05,428 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job c23a172cda6cc659296af6452ff57f45 reached globally terminal state FAILED.
2020-06-02 23:30:05,449 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Stopping the JobMaster for job Flink Streaming Single Environment(c23a172cda6cc659296af6452ff57f45).
2020-06-02 23:30:05,450 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Close ResourceManager connection 9da4590b1bbc3c104e70e270988db461: JobManager is shutting down..
2020-06-02 23:30:05,450 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Suspending SlotPool.
2020-06-02 23:30:05,450 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Disconnect job manager [hidden email]://flink@localhost:6123/user/jobmanager_0 for job c23a172cda6cc659296af6452ff57f45 from the resource manager.
2020-06-02 23:30:05,451 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Stopping SlotPool.
2020-06-02 23:30:05,451 INFO  org.apache.flink.runtime.jobmaster.JobManagerRunner           - JobManagerRunner already shutdown.

Thanks,
Prasanna.
Re: NoResourceAvailableException and JobNotFound Errors

Zhu Zhu
Hi Prasanna,

The job failed because it could not acquire enough slots to run its tasks.
Did you launch any TaskManager?
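
One way to check this is to ask the JobManager's REST endpoint how many TaskManagers and slots are registered. The sketch below is only an illustration, not part of Flink: the helper names `fetch_overview` and `diagnose_slots` are made up, the base URL assumes the default REST port 8081 on localhost, and the field names `taskmanagers` and `slots-available` are assumed to match what the `/overview` endpoint returns in your Flink version, so please verify them against your cluster.

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url="http://localhost:8081"):
    """Fetch the cluster overview from the JobManager REST endpoint.

    The base URL is an assumption (default REST port on a local setup).
    """
    with urlopen(f"{base_url}/overview", timeout=5) as resp:
        return json.load(resp)


def diagnose_slots(overview, required_slots=1):
    """Explain whether the cluster can satisfy a request for slots.

    `overview` is the decoded JSON of the /overview response; the keys
    read here are assumed field names, missing keys are treated as 0.
    """
    task_managers = overview.get("taskmanagers", 0)
    available = overview.get("slots-available", 0)
    if task_managers == 0:
        return "no TaskManager registered: start one before submitting the job"
    if available < required_slots:
        return f"only {available} free slot(s), but {required_slots} required"
    return f"ok: {available} free slot(s) on {task_managers} TaskManager(s)"
```

Calling `diagnose_slots(fetch_overview())` against the running cluster should report the missing TaskManager if none was started. With `taskmanager.numberOfTaskSlots: 8`, a single registered TaskManager would be expected to offer 8 slots, which is more than the 1 slot this job requires.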

The JobNotFound exception is thrown because some client (possibly the Flink web UI) queried a job that does not exist in the Flink cluster.
From the log you attached, the ID of your job is c23a172cda6cc659296af6452ff57f45, but the REST request asked for the info of job be3d6b9751b6e9c509b9bedeb581a72e.
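
To make the mismatch easy to spot in a longer log, a small script can separate the job IDs that were reported as not found from the IDs of jobs the cluster actually handled. This is a rough sketch under assumptions: `split_job_ids` is a hypothetical helper, it assumes job IDs are 32 lowercase hex characters and that the log wording matches the lines in this thread, and `LOG_SAMPLE` contains shortened copies of those lines.

```python
import re

# Assumption: a Flink job ID is printed as 32 lowercase hex characters.
JOB_ID = re.compile(r"\b[0-9a-f]{32}\b")
# Matches the standalone word "job", but not "JobMaster" or "JobManager".
JOB_WORD = re.compile(r"\bjob\b", re.IGNORECASE)

# Shortened copies of log lines from this thread.
LOG_SAMPLE = [
    "org.apache.flink.runtime.rest.NotFoundException: "
    "Job be3d6b9751b6e9c509b9bedeb581a72e not found",
    "StandaloneDispatcher - Job c23a172cda6cc659296af6452ff57f45 "
    "reached globally terminal state FAILED.",
    "JobMaster - Close ResourceManager connection "
    "9da4590b1bbc3c104e70e270988db461: JobManager is shutting down..",
]


def split_job_ids(log_lines):
    """Return (ids reported as not found, ids of jobs the cluster handled)."""
    missing, handled = set(), set()
    for line in log_lines:
        ids = JOB_ID.findall(line)
        if "not found" in line or "Could not find Flink job" in line:
            missing.update(ids)
        elif JOB_WORD.search(line):
            handled.update(ids)
    return missing, handled - missing
```

Running `split_job_ids(LOG_SAMPLE)` puts be3d6b9751b6e9c509b9bedeb581a72e in the first set and c23a172cda6cc659296af6452ff57f45 in the second. The ResourceManager connection ID is ignored because its line does not contain the standalone word "job".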

Thanks,
Zhu Zhu


On Wednesday, June 3, 2020 at 2:16 AM, Prasanna kumar <[hidden email]> wrote the message quoted in full above.