Flink JobManager is not starting when running on a standalone cluster

Flink JobManager is not starting when running on a standalone cluster

HarshithBolar

Hi all,

We run Flink on a five-node cluster: three task managers and two job managers. One of the job managers, running on the flink2-0 node, is down and refuses to come back up, so the cluster is currently running with a single job manager. When I restart the service, I see the following in the logs. Any idea what the issue might be?

2018-10-22 06:43:50,458 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager actor
2018-10-22 06:43:50,462 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-73e8dbe2-8fdb-4310-84d4-c9f3445723f3
2018-10-22 06:43:50,466 INFO  org.apache.flink.runtime.blob.BlobServer                      - Enabling ssl for the blob server
2018-10-22 06:43:50,482 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:36880 - max concurrent requests: 50 - max backlog: 1000
2018-10-22 06:43:50,501 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started memory archivist akka://flink/user/archive
2018-10-22 06:43:50,525 INFO  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Starting ZooKeeperLeaderRetrievalService.
2018-10-22 06:43:50,525 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager at akka.ssl.tcp://[hidden email]:22902/user/jobmanager.
2018-10-22 06:43:50,526 INFO  org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService  - Starting ZooKeeperLeaderElectionService org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@2805f48f.
2018-10-22 06:43:50,532 INFO  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Starting ZooKeeperLeaderRetrievalService.
2018-10-22 06:43:50,557 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - Received leader address but not running in leader ActorSystem. Cancelling registration.

Thanks,

Harshith

Re: Flink JobManager is not starting when running on a standalone cluster

miki haiat
I think it's related to this issue:
https://issues.apache.org/jira/browse/FLINK-10011
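
To compare what the restarted job manager logs on each attempt, it may help to pull the relevant entries out of the log file rather than reading it by eye. The following is only a minimal sketch: it assumes the log layout shown in the original post (timestamp, level, logger, " - ", message), and the names parse_entries and find_cancelled_registrations are made up for illustration, not part of Flink. It only locates the "Cancelling registration" message; it does not say whether that message is the cause of the failure.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Layout assumed from the log excerpt in this thread:
# "<date> <time>,<millis> <LEVEL>  <logger>  - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+-\s+"
    r"(?P<message>.*)$"
)

# Message text copied from the log excerpt in this thread.
CANCEL_MARKER = "Received leader address but not running in leader ActorSystem"


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    logger: str
    message: str


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed layout."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # blank lines and stack-trace lines are skipped
            yield LogEntry(
                match["ts"], match["level"], match["logger"], match["message"]
            )


def find_cancelled_registrations(lines: Iterable[str]) -> List[LogEntry]:
    """Return the entries whose message contains the cancellation text."""
    return [e for e in parse_entries(lines) if CANCEL_MARKER in e.message]
```

A file object can be passed directly, for example `find_cancelled_registrations(open(path))`, where `path` is wherever the job manager log is written on the flink2-0 node.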




On Mon, Oct 22, 2018 at 1:52 PM Kumar Bolar, Harshith <[hidden email]> wrote:
