Hey there,

For my master's thesis I'm trying to set up a Flink standalone cluster on 4 nodes. I followed the documentation, which explains the setup pretty neatly. But when I start the cluster there is a warning, and when I try to run a job there is an error with the same message:

akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@MYHOSTNAME:6123/user/jobmanager#-818199108]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage"

Increasing the timeout didn't help. When I open the TaskManagers in the web UI, all of them show an address of the following pattern:

akka.tcp://flink@MYHOSTNAME:33779/user/taskmanager

Does anyone have an idea how to solve this and get the cluster working? Thanks in advance!

One last thing: there isn't a user "flink" on the cluster, and one won't be created. So any advice that doesn't require me to create that user would be much appreciated! Thanks!

Kind regards,
Lukas

Hi Lukas,
those are Akka-internal names that you don't have to worry about.

It looks like your TaskManager cannot reach the JobManager. Is 'jobmanager.rpc.address' configured correctly on the TaskManager? And is it reachable by this name? Is port 6123 allowed through the firewall?

Are you sure the JobManager is running? How do you start the cluster? If you have been using start-cluster.sh (as per [1]), please also try to start the services manually to check whether there's something wrong there.

Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/cluster_setup.html

On 04/03/18 13:57, Lukas Werner wrote:
> Hey there,
> [...]

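As a rough aid for the first question above, the following Python sketch reads the flat "key: value" lines of a flink-conf.yaml and prints the JobManager address and port a node would use. The function name read_flink_conf and the sample text are made up for illustration; only the keys 'jobmanager.rpc.address' and 'jobmanager.rpc.port' and the values MYHOSTNAME and 6123 come from this thread. It is a simplified parser, not Flink's own configuration loader.

```python
def read_flink_conf(text: str) -> dict:
    """Parse flat 'key: value' lines; skip blank lines and '#' comments.

    Simplification (assumption): everything after a '#' is treated as a
    comment, so values that contain '#' are not supported.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


# Hypothetical excerpt; on a real node, read the file from the Flink conf directory.
sample = """
# JobManager endpoint that every TaskManager connects to
jobmanager.rpc.address: MYHOSTNAME
jobmanager.rpc.port: 6123
"""

conf = read_flink_conf(sample)
print(conf.get("jobmanager.rpc.address"), conf.get("jobmanager.rpc.port"))
```

Running the same check on every node and comparing the printed values is one way to spot a TaskManager whose configuration points at a different host name than the JobManager actually uses.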
Hello Nico,
Thanks for helping me with this. It's really nerve-wracking not getting any further. Now here's what I tried:

I checked whether the port is open by using telnet to connect to the JobManager's port; that worked. Furthermore, I started the JobManager and TaskManager individually on each node; still not working.

I've found one problem: in the "masters" file a port was configured; there was an entry like "myhost:8081". I think that was an error. But now I get an error saying that Akka timed out on distributing an operation after 10,000 ms.

Can you help me with this? The existing topics related to this don't apply to my problem.

Thanks in advance!
Lukas

-----Original Message-----
From: Nico Kruber [mailto:[hidden email]]
Sent: Tuesday, 6 March 2018 17:34
To: Lukas Werner <[hidden email]>; [hidden email]
Subject: Re: Akka wants to connect with username "flink"

> Hi Lukas,
> [...]

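The telnet check described in this message can also be scripted, which makes it easier to repeat from every TaskManager node. Below is a minimal Python sketch using only the standard library; the helper names can_connect and resolves are made up for illustration, and MYHOSTNAME and 6123 are the placeholder host and port from this thread. A successful TCP connect only shows that the port is reachable, not that Akka accepts the messages sent to it.

```python
import socket


def resolves(host: str) -> bool:
    """Return True if the host name can be resolved to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

On each TaskManager node one could then call, for example, resolves("MYHOSTNAME") and can_connect("MYHOSTNAME", 6123), using exactly the name configured in 'jobmanager.rpc.address'.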