Hi All,
I am new to Flink. I am running wordcount streaming program in cluster. It has take more time. So I stopped that process manually. But it still in canceling, there are two subtasks in cluster one has successfully canceled but another one is still canceling. We tried to kill the process in command prompt using grep and kill command. But couldn't kill that process.Please help me to kill that process. Best Regards, Ramkumar L |
Hi, so you tried to stop flink by killing the processes? I assume you've started Flink in the standalone cluster mode? If you a kill, and a kill -9 should definitively stop Flink. Did you check the log files of the task manager? The Flink services are logging when they are receiving signals from the operating system. If the cancelling doesn't work for some reason, it'll be logged as well. In general, I would recommend cancelling the job from the CLI or web interface. In some cases, the job might not stop, for example when the user code is blocking all the time (a good example is a while(true){} loop that doesn't get stopped from a cancel() call). What is your Flink job doing? Does it contain a custom source? On Fri, May 13, 2016 at 11:50 AM, Ramkumar <[hidden email]> wrote: Hi All, |
Hi,
As you mentioned, my program contains a loop with infinite iterations. I was not able to stop by kill -9 command. Instead I killed the entire flink cluster job. I set up the cluster again with one master and three slaves. Now when I try to run the code, it is showing the following exceptions. The program finished with the following exception: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager. at org.apache.flink.client.program.Client.runBlocking(Client.java:381) at org.apache.flink.client.program.Client.runBlocking(Client.java:355) at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:65) at org.apache.flink.project05.WordCount.main(WordCount.java:95) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403) at org.apache.flink.client.program.Client.runBlocking(Client.java:248) at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:866) at org.apache.flink.client.CliFrontend.run(CliFrontend.java:333) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1189) at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1239) Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Lost connection to the JobManager. at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:141) at org.apache.flink.client.program.Client.runBlocking(Client.java:379) ... 14 more Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager. at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:244) at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88) at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68) at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) |
Could you check the logs in Cheers, On Wed, May 18, 2016 at 10:05 AM, Ramkumar <[hidden email]> wrote: Hi, |
Free forum by Nabble | Edit this page |