Job Stuck in cancel state

Job Stuck in cancel state

Janardhan Reddy
Hi,

I cancelled a restarting job from the Flink UI, and the job is now stuck in the cancelling state. (A fixed-delay restart strategy was configured for the job.) The following error message appears in the TaskManager logs:

akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://flink@10.200.7.245:42589/user/jobmanager#-146176374]: max allowed size 10485760 bytes, actual size of encoded class org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage was 20670224 bytes.


Does the LeaderSessionMessage here refer to the job cancel message that is sent to the JobManager, decorated with the leader session ID?

Thread dump of taskmanager:


Attaching to process ID 28948, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.101-b13
Deadlock Detection:

java.lang.RuntimeException: Unable to deduce type of thread from address 0x00007f96e56c9800 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
    at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:169)
    at sun.jvm.hotspot.runtime.Threads.first(Threads.java:153)
    at sun.jvm.hotspot.runtime.DeadlockDetector.createThreadTable(DeadlockDetector.java:149)
    at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:56)
    at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39)
    at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:62)
    at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45)
    at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
    at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
    at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
    at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
    at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
    at sun.tools.jstack.JStack.main(JStack.java:106)
Caused by: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007f96e56c9800
    at sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62)
    at sun.jvm.hotspot.runtime.VirtualConstructor.instantiateWrapperFor(VirtualConstructor.java:80)
    at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:165)
    ... 17 more
Can't print deadlocks:Unable to deduce type of thread from address 0x00007f96e56c9800 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
    at sun.tools.jstack.JStack.main(JStack.java:106)
Caused by: java.lang.RuntimeException: Unable to deduce type of thread from address 0x00007f96e56c9800 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
    at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:169)
    at sun.jvm.hotspot.runtime.Threads.first(Threads.java:153)
    at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:75)
    at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45)
    at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
    at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
    at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
    at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
    at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
    ... 6 more
Caused by: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007f96e56c9800
    at sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62)
    at sun.jvm.hotspot.runtime.VirtualConstructor.instantiateWrapperFor(VirtualConstructor.java:80)
    at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:165)
    ... 14 more


Re: Job Stuck in cancel state

Fabian Hueske-2
Hi Janardhan,

Not sure what's going wrong here. Maybe Till (in CC) has an idea?

Best, Fabian

2016-09-19 19:45 GMT+02:00 Janardhan Reddy <[hidden email]>:
[Quoted text of the original post trimmed.]


Re: Job Stuck in cancel state

Stephan Ewen
Hi Janardhan!

I think you gave us the stack trace of the JPS process, not the Flink process. Can you post the stack trace of the TaskManager?
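
In case it helps, here is a minimal sketch of one way to capture it. It assumes a JDK with `jps` and `jstack` on the PATH, that the commands are run as the same OS user that owns the TaskManager JVM, and that the TaskManager's main class name contains `TaskManager`. The function names are hypothetical and only for illustration.

```shell
#!/bin/sh
# Pick the TaskManager PID out of `jps -l` output read from stdin.
# Assumption: the main class name in column 2 contains "TaskManager";
# check what `jps -l` actually prints on your host.
taskmanager_pid() {
    awk '$2 ~ /TaskManager/ { print $1; exit }'
}

# Print a thread dump of the TaskManager JVM using plain `jstack <pid>`
# (no -F), or fail with a message if no matching JVM is found.
dump_taskmanager_threads() {
    pid=$(jps -l | taskmanager_pid)
    if [ -z "$pid" ]; then
        echo "no TaskManager JVM found" >&2
        return 1
    fi
    jstack "$pid"
}
```

Running `dump_taskmanager_threads > taskmanager-threads.txt` on the TaskManager host would then write the dump to a file that can be attached to a reply.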

Thanks,
Stephan


On Mon, Sep 19, 2016 at 8:15 PM, Fabian Hueske <[hidden email]> wrote:
[Quoted text of the earlier messages trimmed.]