Hi,

I am running a program that works fine locally, but when I try to run it on the cluster I get a timeout error from the client that tries to connect to the JobManager. There is no issue with contacting the JobManager from this machine, as it works just fine for other streaming applications. I suspect that, because the stream topology is rather complex, there is an issue with deploying the schematic. I am not sure if this is normal behavior (IMHO it should not
fail just because the topology is more complex). In case the error helps to identify the underlying issue (if any), please see it below. Meanwhile, could you please tell me how I can configure the timeout so that it won't fail anymore?

Thanks

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Job submission to the JobManager timed out.
    at org.apache.flink.client.program.Client.runBlocking(Client.java:370)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:96)
    at application.MainStreamApp.main(MainStreamApp.java:108)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:497)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:395)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:252)
    at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:676)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:326)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:978)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1028)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Job submission to the JobManager timed out.
    at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:140)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:368)
    ... 13 more
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out.
    at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:255)
    at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
    at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
    at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Dr. Radu Tudoran
Research Engineer - Big Data Expert
IT R&D Division
HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München
E-mail:
[hidden email] Mobile: +49 15209084330 Telephone: +49 891588344173
Hi Radu,

did you check the JobManager logs as well? Maybe there you can see why the JobManager is failing. The timeout is configurable through the "akka.client.timeout" variable. The default value is "60 s".

On Wed, Feb 10, 2016 at 7:35 PM, Radu Tudoran <[hidden email]> wrote:
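
For reference, a minimal sketch of how the client timeout mentioned in the reply could be raised. It assumes the setting goes into conf/flink-conf.yaml on the machine from which the job is submitted. The value "600 s" is only an illustrative choice, not a recommendation from the thread.

```yaml
# conf/flink-conf.yaml (assumed location, on the submitting machine)
# Default is "60 s"; "600 s" is a hypothetical larger value for a complex topology.
akka.client.timeout: 600 s
```

A longer timeout only gives the submission more time to complete. If the JobManager logs show an actual failure, raising it will probably not help.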