Re: configuration of standalone cluster

Posted by Abhishek Jain on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/configuration-of-standalone-cluster-tp27662p27684.html

Java version: "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b15, mixed mode)
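
The "module java.base" wording in the quoted ClassCastException is only produced by JVMs that have the module system (Java 9 or later), so it may be worth checking which JVM each daemon actually runs under. Below is a minimal sketch for reducing the reported version string to a major version on a host. The helper name java_major is hypothetical and not part of Flink or the JDK, and the script assumes that `java` on the PATH is the same JVM the Flink daemons use, which may not hold if JAVA_HOME or env.java.home points elsewhere.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# "1.8.0_112" -> 8 (legacy scheme), "11.0.2" -> 11 (scheme used since Java 9).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;        # drop the legacy "1." prefix
  esac
  echo "${v%%[._-]*}"          # keep everything before the first ".", "_" or "-"
}

# Print the major version of the JVM found on this host's PATH, if any.
# Assumption: this is the JVM the Flink daemons are started with.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "$(hostname): java major version $(java_major "$raw")"
fi
```

Running this on the jobmanager host and on each taskmanager host would show whether all four nodes report the same major version.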


On Thu, 2 May 2019 at 17:18, Chesnay Schepler <[hidden email]> wrote:
Which Java version are you using?

On 01/05/2019 21:31, Günter Hipler wrote:
> Hi,
>
> I'm setting up a standalone cluster for the first time. My current
> configuration is 4 servers (1 jobmanager and 3 taskmanagers).
>
> a) starting the cluster
> swissbib@sb-ust1:/swissbib_index/apps/flink/bin$ ./start-cluster.sh
> Starting cluster.
> Starting standalonesession daemon on host sb-ust1.
> Starting taskexecutor daemon on host sb-ust2.
> Starting taskexecutor daemon on host sb-ust3.
> Starting taskexecutor daemon on host sb-ust4.
>
>
> On the taskmanager side I get this error:
> 2019-05-01 21:16:32,794 WARN
> akka.remote.ReliableDeliverySupervisor                        -
> Association with remote system [akka.ssl.tcp://flink@sb-ust1:6123] has
> failed, address is now gated for [50] ms. Reason: [class [B cannot be
> cast to class [C ([B and [C are in module java.base of loader
> 'bootstrap')]
> 2019-05-01 21:16:41,932 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could
> not resolve ResourceManager address
> akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
> 10000 ms: Ask timed out on
> [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
> Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
> message of type "akka.actor.Identify"..
> 2019-05-01 21:17:01,960 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could
> not resolve ResourceManager address
> akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
> 10000 ms: Ask timed out on
> [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
> Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
> message of type "akka.actor.Identify"..
>
>
> Port 6123 is allowed on the jobmanager, but I haven't created a
> dedicated flink user.
>
> - Is this necessary? If so, is it possible to define another user for
> communication purposes?
>
> I followed the documentation to set up SSL-based internal communication
> (https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/security-ssl.html#example-ssl-setup-standalone-and-kubernetes)
> and created a keystore as described:
>
> keytool -genkeypair -alias swissbib.internal -keystore
> internal.keystore -dname "CN=flink.internal" -storepass verysecret
> -keypass verysecret -keyalg RSA -keysize 4096
>
> and deployed the flink-conf.yaml on the whole cluster
>
> (part of flink-conf.yaml)
> security.ssl.internal.enabled: true
> security.ssl.internal.keystore: /swissbib_index/apps/flink/conf/internal.keystore
> security.ssl.internal.truststore: /swissbib_index/apps/flink/conf/internal.keystore
> security.ssl.internal.keystore-password: verysecret
> security.ssl.internal.truststore-password: verysecret
> security.ssl.internal.key-password: verysecret
>
> But this doesn't solve the problem: there is still no connection
> between the taskmanagers and the jobmanager.
>
> - Another question: which ports have to be opened in the firewall for
> a standalone cluster?
>
> Thanks for any hints!
>
> Günter
>
>



--
Warm Regards,
Abhishek Jain