Cannot connect to the JobManager - Flink 1.1.3 cluster mode

Dominik Safaric
Hi all,

While setting up a cluster consisting of three worker nodes and a master node, I’ve run into the problem that the JobManager, although running, is unreachable.

The master instance has SSH access to all worker nodes. The worker nodes, however, do not have SSH access to the master node. Could this be the reason for the exception being thrown? Interestingly, I keep getting the same exception even when running as a local cluster. If I connect to the JobManager manually, for example by executing bin/flink list, the connection succeeds.

As for other services, such as the state backend configured via ZooKeeper, the master is able to connect to, e.g., ZooKeeper running on a different node of the cluster; I verified this by examining the ZNode created.

Also, Flink requires SSH when running in cluster mode. Since the cluster I am running has a VNET configured, can SSH be bypassed, or is it a must?

Thanks in advance,
Dominik

Re: Cannot connect to the JobManager - Flink 1.1.3 cluster mode

Stefan Richter
Hi,

I think sharing the logs would be helpful in figuring out the problem.
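
While collecting the logs, it may also be worth checking from a worker node whether the JobManager's RPC port can be reached at all. Below is a minimal sketch in Python, not anything Flink ships: the host name "master" is a hypothetical placeholder for whatever jobmanager.rpc.address is set to, and 6123 is assumed because it is the default jobmanager.rpc.port. The helper can_connect is made up for this illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumptions: "master" is a placeholder host name; 6123 is the
    # default value of jobmanager.rpc.port in flink-conf.yaml.
    host, port = "master", 6123
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

This only tests TCP reachability, not whether the JobManager accepts the connection, so the logs are still needed either way.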

Best,
Stefan

On 23.11.2016 at 22:05, Dominik Safaric <[hidden email]> wrote:
