Flink 1.7.0 HA based on ZooKeeper

min.tan

Hi,

I have a simple HA setup with Flink 1.7.0:

Node1 (active master, active slave)

Node2 (standby master, active slave)

Step 1: run start-cluster.sh from Node1. No problem.

Step 2: manually kill the active master on Node1. No problem; the standby master on Node2 becomes active.

Step 3: run bin/jobmanager.sh start cluster on Node1.

Step 4: manually kill the active master on Node2. No problem; the Node1 master becomes active again.

But after Step 4 the overview page no longer shows the active task managers.

Is this expected, or do I have errors in my HA settings?
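
For reference, a two-master ZooKeeper HA setup of the kind described above is usually configured along the following lines. This is only a minimal sketch, not the configuration actually used on Node1 and Node2: the ZooKeeper address (zk-host:2181), the storage directory, the cluster id, and the web UI port 8081 are placeholder assumptions.

```yaml
# conf/flink-conf.yaml (identical on Node1 and Node2)
high-availability: zookeeper
# Placeholder quorum address; replace with the real ZooKeeper host(s).
high-availability.zookeeper.quorum: zk-host:2181
# Placeholder path; must be storage that both masters can reach.
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /default_ns

# conf/masters (plain text, one master per line; the port is an assumption):
#   Node1:8081
#   Node2:8081
#
# conf/slaves (plain text, one task manager host per line):
#   Node1
#   Node2
```

With both hosts listed in conf/masters, start-cluster.sh starts a JobManager on each node, and ZooKeeper elects which one is the active leader.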

 

There are no errors in the Node2 logs, but there are java.net.ConnectExceptions in the Node1 logs.

Regards,

Min

--------------------------Exception logs in Node1----------------------------

2019-01-11 11:32:33,222 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /Node2:39075

2019-01-11 11:32:33,225 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@Node2:39075] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@Node2:39075]] Caused by: [Connection refused: /Node2:39075]

2019-01-11 11:32:43,233 ERROR org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler  - Could not retrieve the redirect address.

java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@Node2:39075/user/dispatcher#655281870]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.RemoteFencedMessage".

        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)

        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)

        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)

        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)

        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)

        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)

        at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:772)

        at akka.dispatch.OnComplete.internal(Future.scala:258)

        at akka.dispatch.OnComplete.internal(Future.scala:256)

        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)

        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)

        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)

        at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)

        at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)

        at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)

        at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)

        at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)

        at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:604)

        at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)

        at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)

        at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)

        at scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)

        at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)

        at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)

        at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)

        at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)

        at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)

        at java.lang.Thread.run(Thread.java:748)

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

 

 


