Could not find job with id XXXXX


Hung
Hi Flink users,

I suddenly started seeing "Could not find job with id" in the job manager log. What are the possible causes?
It would help to know the job name for that job ID, but I can neither open the web UI nor run ./bin/flink list.

2016-11-16 16:26:21,276 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with id 2acfc2ac66c527b6f169406f15219aac
        at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
        at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
        at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
        at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)


I cannot run ./bin/flink list, and the task managers cannot find the job manager either. It looks like the job manager is down?

org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:127)
        at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:644)
        at org.apache.flink.client.CliFrontend.getJobManagerGateway(CliFrontend.java:868)
        at org.apache.flink.client.CliFrontend.list(CliFrontend.java:387)
        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1008)
        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at scala.concurrent.Await.result(package.scala)
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:125)
        ... 5 more
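
The timeout above only says that the client got no answer from the job manager within 10 seconds. As a rough first check, one could test whether the job manager ports accept TCP connections at all. This is a minimal sketch, not a Flink tool: the function name `can_connect` is made up for illustration, and the ports in the comment are Flink's defaults (6123 for `jobmanager.rpc.port`, 8081 for the web UI), which a given cluster may override.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection raises OSError (including timeouts and DNS failures)
        # when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Example call (hypothetical host name; default Flink ports assumed):
#   can_connect("jobmanager-host", 6123)  # RPC port
#   can_connect("jobmanager-host", 8081)  # web UI port
```

A refused or timed-out connection suggests the job manager process is gone or unreachable; a successful one only shows that something is listening, not that the job manager is healthy.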

Best,

Sendoh
Re: Could not find job with id XXXXX

Ufuk Celebi
That's most probably due to a job that has already terminated, but you still have a browser open querying the job manager for the job.

The log level for this message was recently reduced to DEBUG (for both the upcoming Flink 1.1.4 and 1.2.0). If you are not explicitly missing a job in the web UI, then you can ignore this.
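
To see which job IDs are being requested, and how often, the job manager log can be scanned for this warning. The following is a small sketch that assumes the message wording from the original post ("Could not find job with id" followed by a 32-character hex ID); the function name `count_missing_job_ids` is hypothetical, and other Flink versions may word the message differently.

```python
import re
from collections import Counter

# Message format as it appears in the original post; treated as an assumption
# for any other Flink version.
_MISSING_JOB = re.compile(r"Could not find job with id ([0-9a-f]{32})")


def count_missing_job_ids(log_lines):
    """Count how often each job ID appears in 'Could not find job' messages."""
    counts = Counter()
    for line in log_lines:
        match = _MISSING_JOB.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example call (the log path is hypothetical):
#   with open("flink-jobmanager.log") as f:
#       print(count_missing_job_ids(f).most_common())
```

An ID that shows up repeatedly at a regular interval would be consistent with an open browser tab polling for a job that has already terminated.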

– Ufuk

On 17 November 2016 at 13:33:14, Sendoh ([hidden email]) wrote:

> Hi Flink users,
>  
> Suddenly I discovered this "Could not find job with id". What would be the
> possible causes for this?

Re: Could not find job with id XXXXX

Hung
Thank you for your reply.
It sounds to me like this should not be the error that brought the job manager down. Or can it be?
Currently we use 1.1.3.

Best,

Sendoh

Re: Could not find job with id XXXXX

Ufuk Celebi
No, that should not cause the job manager to fail.

Do you have the complete job manager logs available so we can look into this further?

– Ufuk

On 17 November 2016 at 13:39:47, Sendoh ([hidden email]) wrote:

> Thank you for your reply.
> It sounds for me should not be the error that causing job manager down? Or
> it can?
> Currently we use 1.1.3.