Hi Alex,

Is it possible that the data has been corrupted?
Or have you confirmed that the avro version is consistent in different Flink versions?
Also, if you don't upgrade Flink and still use version 1.3.1, can it be recovered?
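One way to check whether the Avro and Hadoop versions really are consistent is to run the same small check in both places (the local run and a job submitted to the cluster) and compare the output. The sketch below is only an illustration: the class name `DependencyVersionCheck` and the helper `describe` are made up for this example, and it relies only on standard JDK reflection. It assumes the jar's manifest declares an `Implementation-Version`, which shaded uber jars often do not, so it also prints the jar location the class was loaded from.

```java
import java.security.CodeSource;

public class DependencyVersionCheck {

    /**
     * Reports the manifest version and the jar location of a class, or a
     * "not found" message if the class is not on the classpath.
     * Hypothetical helper, not part of Flink or Hadoop.
     */
    public static String describe(String className) {
        try {
            Class<?> clazz = Class.forName(className);

            // Implementation-Version from the jar manifest; may be absent.
            Package pkg = clazz.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();

            // The code source is null for classes loaded by the bootstrap loader.
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            String location = (source == null || source.getLocation() == null)
                    ? "unknown"
                    : source.getLocation().toString();

            return className
                    + " version=" + (version == null ? "unknown" : version)
                    + " from=" + location;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Classes chosen because they appear in this thread's dependencies.
        System.out.println(describe("org.apache.avro.Schema"));
        System.out.println(describe("org.apache.hadoop.util.VersionInfo"));
    }
}
```

If the two environments print different versions or different jar locations for the same class, that mismatch is a reasonable first suspect for a deserialization failure.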
Thanks, vino.

2018-07-25 8:32 GMT+08:00 Alex Vinnik <[hidden email]>:

Vino,

Upgraded Flink to Hadoop 2.8.1

$ docker exec -it flink-jobmanager cat /var/log/flink/flink.log | grep entrypoint | grep 'Hadoop version'
2018-07-25T00:19:46.142+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Hadoop version: 2.8.1

but the job still fails to start.

Ideas?

Caused by: org.apache.flink.util.FlinkException: Failed to submit job d84cccd3bffcba1f243352a5e5ef99a9.
    at org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:254)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
    at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
    at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
    at akka.actor.ActorCell.invoke(ActorCell.scala:495)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
    at akka.dispatch.Mailbox.run(Mailbox.scala:224)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
    ... 4 more
Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
    at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:169)
    at org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:885)
    at org.apache.flink.runtime.dispatcher.Dispatcher.createJobManagerRunner(Dispatcher.java:287)
    at org.apache.flink.runtime.dispatcher.Dispatcher.runJob(Dispatcher.java:277)
    at org.apache.flink.runtime.dispatcher.Dispatcher.persistAndRunJob(Dispatcher.java:262)
    at org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:249)
    ... 21 more
Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@a3123a9)': Deserializing the OutputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@a3123a9) failed: unread block data
    at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:220)
    at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:100)
    at org.apache.flink.runtime.jobmaster.JobMaster.createExecutionGraph(JobMaster.java:1150)
    at org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1130)
    at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:298)
    at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:151)
    ... 26 more
Caused by: java.lang.Exception: Deserializing the OutputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@a3123a9) failed: unread block data
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:63)
    at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:216)
    ... 31 more
Caused by: java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2781)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:488)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:475)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:463)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:424)
    at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:288)
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:60)
    ... 32 more

On Tue, Jul 24, 2018 at 10:32 AM vino yang <[hidden email]> wrote:

Hi Alex,

Based on your log information, a potential cause is the Hadoop version. To check whether the exception comes from a Hadoop version mismatch, I suggest you match the Hadoop version on both sides.

You can:

1. Upgrade the Hadoop version that the Flink cluster depends on. Flink's official website provides a binary bundled with Hadoop 2.8. [1]
2. Downgrade the Hadoop client dependency in your fat jar to match the Hadoop version of the Flink cluster.

Thanks, vino.

2018-07-24 22:59 GMT+08:00 Alex Vinnik <[hidden email]>:

Hi Till,

Thanks for responding. Below are the entrypoint logs. One thing I noticed is "Hadoop version: 2.7.3". My fat jar is built with the Hadoop 2.8 client. Could that be the reason for the error? If so, how can I use the same Hadoop version 2.8 on the Flink server side?
BTW, the job runs fine locally, reading from the same s3a buckets, when executed using createLocalEnvironment via

java -jar my-fat.jar --input s3a://foo --output s3a://bar

Regarding the Java version: the job is submitted via the Flink UI, so it should not be a problem.

Thanks a lot in advance.

2018-07-24T12:09:38.083+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO --------------------------------------------------------------------------------
2018-07-24T12:09:38.085+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Starting StandaloneSessionClusterEntrypoint (Version: 1.5.0, Rev:c61b108, Date:24.05.2018 @ 14:54:44 UTC)
2018-07-24T12:09:38.085+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO OS current user: flink
2018-07-24T12:09:38.844+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Current Hadoop/Kerberos user: flink
2018-07-24T12:09:38.844+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.171-b11
2018-07-24T12:09:38.844+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Maximum heap size: 1963 MiBytes
2018-07-24T12:09:38.844+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO JAVA_HOME: /docker-java-home/jre
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Hadoop version: 2.7.3
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO JVM Options:
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Xms2048m
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Xmx2048m
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Dorg.apache.flink.kinesis.shaded.com.amazonaws.sdk.disableCertChecking
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Dcom.amazonaws.sdk.disableCertChecking
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5015
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
2018-07-24T12:09:38.851+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Program Arguments:
2018-07-24T12:09:38.852+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     --configDir
2018-07-24T12:09:38.852+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     /opt/flink/conf
2018-07-24T12:09:38.852+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     --executionMode
2018-07-24T12:09:38.853+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     cluster
2018-07-24T12:09:38.853+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     --host
2018-07-24T12:09:38.853+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO     cluster
2018-07-24T12:09:38.853+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Classpath: /opt/flink/lib/flink-metrics-datadog-1.5.0.jar:/opt/flink/lib/flink-python_2.11-1.5.0.jar:/opt/flink/lib/flink-s3-fs-presto-1.5.0.jar:/opt/flink/lib/flink-shaded-hadoop2-uber-1.5.0.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.7.jar:/opt/flink/lib/flink-dist_2.11-1.5.0.jar:::
2018-07-24T12:09:38.853+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO --------------------------------------------------------------------------------
2018-07-24T12:09:38.854+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Registered UNIX signal handlers for [TERM, HUP, INT]
2018-07-24T12:09:38.895+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Starting StandaloneSessionClusterEntrypoint.
2018-07-24T12:09:38.895+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Install default filesystem.
2018-07-24T12:09:38.927+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Install security context.
2018-07-24T12:09:39.034+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Initializing cluster services.
2018-07-24T12:09:39.059+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Trying to start actor system at flink-jobmanager:6123
2018-07-24T12:09:40.335+0000 [org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO Actor system started at akka.tcp://flink@flink-jobmanager:6123

On Tue, Jul 24, 2018 at 7:16 AM Till Rohrmann <[hidden email]> wrote:

Hi Alex,

I'm not entirely sure what causes this problem because this is the first time I have seen it.

The first question would be whether the problem also arises when using a different Hadoop version.

Are you using the same Java version on the client as on the server?

Could you provide us with the cluster entrypoint logs?

Cheers,
Till

On Tue, Jul 24, 2018 at 4:56 AM Alex Vinnik <[hidden email]> wrote: