Hi all,
We are running several Flink jobs on k8s with Flink 1.11.3, and recently we noticed that some of them are constantly restarting with the following exception. After a restart, everything works as expected. Could this be a bug?

2021-05-25 17:04:42
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:275)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:126)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:459)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:526)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: unexpected exception type
    at java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1750)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1280)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:576)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:562)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:550)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:511)
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:260)
    ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor282.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
    at sun.reflect.GeneratedMethodAccessor281.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1274)
    ... 23 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize com.my.organization.MyPerfectlyWorkingJob
    ... 31 more
Hi Georgi,

I don't think it's a bug in Flink. It sounds like a problem with the dependencies or jars in your job. Can you explain a bit more what you mean by:

> that some of them are constantly restarting with the following exception. After restart, everything is working as expected

Constantly restarting, but after a restart everything is working?

Best,
Piotrek

On Tue, 25 May 2021 at 16:12, Georgi Stoyanov <[hidden email]> wrote:
> Hi all,
Hi Piotr, thank you for the fast reply.
The job restarts within the same Flink session and fails with that exception. When I delete the pods (we are using the Google CRD, so I just kubectl delete FlinkCluster …) and the yaml is applied again, it works as expected. It looks to me like a jar problem, since I just noticed that it has started to fail with a class from an internal common library, not only with the job classes:

java.lang.NoClassDefFoundError: Could not initialize com.my.organization.core.cfg.PropertiesConfigurationClass

From: Piotr Nowojski <[hidden email]>
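A note on the error message: on HotSpot JVMs, a NoClassDefFoundError whose message starts with "Could not initialize class" usually means that the class was found, but its static initializer had already failed once in the same JVM. A missing jar entry is therefore not the only possible cause. This would be consistent with the failure persisting until the pods are recreated, although nothing in this thread confirms it. The sketch below reproduces the JVM behaviour in isolation; StaticInitDemo and BrokenConfig are hypothetical names and are not taken from the job discussed here.

```java
public class StaticInitDemo {

    // Hypothetical stand-in for a class whose static initializer fails,
    // e.g. because a configuration resource cannot be loaded.
    static class BrokenConfig {
        static final String VALUE = load();

        private static String load() {
            throw new IllegalStateException("simulated failure while loading configuration");
        }

        static String value() {
            return VALUE;
        }
    }

    // Touches BrokenConfig and reports which error type the JVM raises.
    public static String tryAccess() {
        try {
            return BrokenConfig.value();
        } catch (Throwable t) {
            return t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // The first access runs the static initializer, which throws.
        System.out.println("first access:  " + tryAccess());
        // Later accesses in the same JVM find the class marked as unusable.
        System.out.println("second access: " + tryAccess());
    }
}
```

In a fresh JVM the first access reports java.lang.ExceptionInInitializerError and every later access reports java.lang.NoClassDefFoundError. If something similar were happening in the job, the TaskManager logs would be expected to contain an earlier ExceptionInInitializerError carrying the underlying cause.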
Hi,

Maybe before deleting the pods, you could look inside them and inspect your job's jar? Which classes does it contain? The job's jar should be in a local directory. Or maybe even inspect the jar before submitting it?

Best,
Piotrek

On Tue, 25 May 2021 at 17:40, Georgi Stoyanov <[hidden email]> wrote:
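The jar inspection suggested above can be done with `jar tf <file>`, or programmatically with the JDK's java.util.jar API. The following is only a minimal sketch: JarClassCheck and its methods are hypothetical helpers written for this illustration, and the jar path and class name must be supplied by the caller.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    // Converts a fully qualified class name into the entry name used inside a jar.
    public static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath contains an entry for the given class.
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " is present in " : " is missing from ") + args[0]);
    }
}
```

Running it against the jar found inside a failing pod and against the jar that was originally submitted would show whether the two differ for the class named in the exception.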