All datanodes are bad

All datanodes are bad

Muhammad Ali Orakzai
Hi,

I am receiving the following exception while trying to run the TeraSort program on Flink. My configuration is as follows:

Hadoop: 2.6.2

Flink: 0.10.1


Server 1:

Hadoop data and name node

Flink job and task manager


Server 2:

Flink task manager



org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.

at org.apache.flink.client.program.Client.runBlocking(Client.java:370)

at org.apache.flink.client.program.Client.runBlocking(Client.java:348)

at org.apache.flink.client.program.Client.runBlocking(Client.java:315)

at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:70)

at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:627)

at eastcircle.terasort.FlinkTeraSort$.main(FlinkTeraSort.scala:88)

at eastcircle.terasort.FlinkTeraSort.main(FlinkTeraSort.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:497)

at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:395)

at org.apache.flink.client.program.Client.runBlocking(Client.java:252)

at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:676)

at org.apache.flink.client.CliFrontend.run(CliFrontend.java:326)

at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:978)

at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1028)

Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.

at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply$mcV$sp(JobManager.scala:563)

at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply(JobManager.scala:509)

at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply(JobManager.scala:509)

at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)

at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)

at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)

at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)

at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)

at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)

at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)

at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)

at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Caused by: java.io.IOException: All datanodes xxx.xxx.xx.xx:50010 are bad. Aborting...

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1206)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1004)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:548)
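The root cause in the trace ("All datanodes ... are bad") is raised by the HDFS client, not by Flink, when the write pipeline to every datanode has failed. Before touching the Flink side, it is worth checking whether the namenode still considers the datanode live and whether the datanode log shows errors around the time of the failure. A rough sketch, assuming a standard Hadoop 2.6 client setup and the default log layout under `$HADOOP_HOME/logs` (adjust paths for your install):

```shell
# Ask the namenode which datanodes it currently considers live/dead,
# and how much capacity each one reports.
hdfs dfsadmin -report

# Look for disk, socket, or xceiver-limit errors in the datanode's own log
# (filename pattern assumes a default tarball install).
tail -n 100 "$HADOOP_HOME"/logs/hadoop-*-datanode-*.log
```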


Re: All datanodes are bad

Stephan Ewen
I guess the problem is pretty much what the exception message says: the HDFS output does not work because all datanodes are bad.

If you activate execution retries, the system will re-execute the job. If the datanodes are healthy by then, the job should succeed.
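In the Flink 0.10 batch API, execution retries are enabled on the `ExecutionEnvironment` via `setNumberOfExecutionRetries`. A minimal sketch of where that call would go; the object name and the toy pipeline are placeholders, not the actual TeraSort job from `FlinkTeraSort.scala`:

```scala
import org.apache.flink.api.scala._

object RetryingJob {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Re-run the whole job up to 3 times if execution fails,
    // e.g. because the HDFS datanodes were temporarily unavailable.
    env.setNumberOfExecutionRetries(3)

    // Placeholder pipeline; replace with the real job logic.
    env.fromElements(1, 2, 3)
      .map(_ * 2)
      .writeAsText("hdfs:///tmp/retry-demo")

    env.execute("retrying job")
  }
}
```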

Greetings,
Stephan


On Wed, Dec 16, 2015 at 4:07 PM, Muhammad Ali Orakzai <[hidden email]> wrote: