Re: configuration of standalone cluster

Posted by Chesnay Schepler on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/configuration-of-standalone-cluster-tp27662p27686.html

Flink still only works with Java 8 at the moment. It will be a while
until we properly support Java 11.
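
To check which major version a node is actually running, one option is to parse the first line of the `java -version` output. The Python sketch below is only an illustration: the function name `java_major_version` is hypothetical (it is not part of Flink), and the parsing assumes the usual `version "..."` format seen in the quoted output. Releases up to Java 8 report themselves as `1.x` (for example `1.8.0_201`), while newer ones report the major number directly.

```python
import re


def java_major_version(version_output: str) -> int:
    """Return the major Java version found in `java -version` output."""
    # Capture the first one or two numeric components inside the quotes.
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.8.0_201" means Java 8.
    if first == "1" and second is not None:
        return int(second)
    # Current scheme: "11.0.2" means Java 11.
    return int(first)


# First line of the output quoted in this thread.
sample = 'openjdk version "11.0.2" 2019-01-15'
print(java_major_version(sample))
```

Note that `java -version` writes to stderr, so the text has to be captured from there. Any result other than 8 on one of the nodes would be consistent with the problem discussed in this thread.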

On 02/05/2019 13:58, Günter Hipler wrote:

> swissbib@sb-ust1:~$ java -version
> openjdk version "11.0.2" 2019-01-15
> OpenJDK Runtime Environment (build 11.0.2+9-Ubuntu-3ubuntu118.04.3)
> OpenJDK 64-Bit Server VM (build 11.0.2+9-Ubuntu-3ubuntu118.04.3, mixed mode, sharing)
> swissbib@sb-ust1:~$
>
> Is version 8 more appropriate?
>
> Günter
>
>
> On 02.05.19 13:48, Chesnay Schepler wrote:
>> Which Java version are you using?
>>
>> On 01/05/2019 21:31, Günter Hipler wrote:
>>> Hi,
>>>
>>> For the first time I'm trying to set up a standalone cluster. My
>>> current configuration:
>>> 4 servers (1 jobmanager and 3 taskmanagers)
>>>
>>> a) starting the cluster
>>> swissbib@sb-ust1:/swissbib_index/apps/flink/bin$ ./start-cluster.sh
>>> Starting cluster.
>>> Starting standalonesession daemon on host sb-ust1.
>>> Starting taskexecutor daemon on host sb-ust2.
>>> Starting taskexecutor daemon on host sb-ust3.
>>> Starting taskexecutor daemon on host sb-ust4.
>>>
>>>
>>> On the taskmanager side I get the following error:
>>> 2019-05-01 21:16:32,794 WARN akka.remote.ReliableDeliverySupervisor
>>> - Association with remote system [akka.ssl.tcp://flink@sb-ust1:6123]
>>> has failed, address is now gated for [50] ms. Reason: [class [B
>>> cannot be cast to class [C ([B and [C are in module java.base of
>>> loader 'bootstrap')]
>>> 2019-05-01 21:16:41,932 INFO
>>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not
>>> resolve ResourceManager address
>>> akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
>>> 10000 ms: Ask timed out on
>>> [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
>>> Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
>>> message of type "akka.actor.Identify"..
>>> 2019-05-01 21:17:01,960 INFO
>>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not
>>> resolve ResourceManager address
>>> akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
>>> 10000 ms: Ask timed out on
>>> [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
>>> Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
>>> message of type "akka.actor.Identify"..
>>>
>>>
>>> Port 6123 is allowed on the jobmanager, but I haven't created a
>>> dedicated flink user.
>>>
>>> - Is this necessary? If yes, is it possible to define another user
>>> for communication purposes?
>>>
>>> I followed the documentation to set up SSL-based communication
>>> (https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/security-ssl.html#example-ssl-setup-standalone-and-kubernetes)
>>> and created a keystore as described:
>>>
>>> keytool -genkeypair -alias swissbib.internal -keystore
>>> internal.keystore -dname "CN=flink.internal" -storepass verysecret
>>> -keypass verysecret -keyalg RSA -keysize 4096
>>>
>>> and deployed flink-conf.yaml to the whole cluster:
>>>
>>> (part of flink-conf.yaml)
>>> security.ssl.internal.enabled: true
>>> security.ssl.internal.keystore: /swissbib_index/apps/flink/conf/internal.keystore
>>> security.ssl.internal.truststore: /swissbib_index/apps/flink/conf/internal.keystore
>>> security.ssl.internal.keystore-password: verysecret
>>> security.ssl.internal.truststore-password: verysecret
>>> security.ssl.internal.key-password: verysecret
>>>
>>> But this doesn't solve the problem: there is still no connection
>>> between the taskmanagers and the jobmanager.
>>>
>>> - Another question: which ports have to be opened in the firewall
>>> for a standalone cluster?
>>>
>>> Thanks for any hints!
>>>
>>> Günter
>>>
>>>
>>
>>
>
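
Regarding the firewall part of the quoted question: whether a given port is reachable between two nodes can be tested independently of Flink. The Python sketch below tries a plain TCP connection; the helper name `can_connect` is hypothetical, and the host and port in the commented example are simply the ones that appear in the quoted log. A successful connect only shows that the port is reachable at the TCP level; it says nothing about the SSL handshake or about Flink itself.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects;
        # the `with` block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (run from a taskmanager host; values taken from the log above):
# can_connect("sb-ust1", 6123)
```

A result of False from a taskmanager host, together with True when run on the jobmanager itself, would point to a firewall or routing issue rather than to the Flink configuration.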