I have the same problem as you when running Flink 1.7.2 on Kubernetes in HA mode. May I ask whether you have solved it, and if so, how? After both JobManagers started normally, I killed one of them, and it could not restart normally. Both JobManagers then reported this error. The log is as follows:
2019-06-28 09:57:57.253 [flink-akka.actor.default-dispatcher-4] WARN akka.remote.transport.netty.NettyTransport New I/O boss #3 - Remote connection to [null] failed with java.net.ConnectException: Connection refused: tdh2/192.168.208.55:56529
2019-06-28 09:57:57.253 [flink-akka.actor.default-dispatcher-4] WARN akka.remote.ReliableDeliverySupervisor flink-akka.remote.default-remote-dispatcher-14 - Association with remote system [akka.tcp://flink@tdh2:56529] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@tdh2:56529]] Caused by: [Connection refused: tdh2/192.168.208.55:56529]
2019-06-28 09:57:57.253 [flink-akka.actor.default-dispatcher-4] WARN akka.remote.ReliableDeliverySupervisor flink-akka.remote.default-remote-dispatcher-14 - Association with remote system [akka.tcp://flink@tdh2:56529] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@tdh2:56529]] Caused by: [Connection refused: tdh2/192.168.208.55:56529]
2019-06-28 09:57:57.260 [flink-rest-server-netty-worker-thread-7] ERROR o.a.f.r.rest.handler.legacy.files.StaticFileServerHandler - Could not retrieve the redirect address.
java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@tdh2:56529/user/dispatcher#299521377]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.RemoteFencedMessage".
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:772)
at akka.dispatch.OnComplete.internal(Future.scala:258)
at akka.dispatch.OnComplete.internal(Future.scala:256)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:603)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
at java.lang.Thread.run(Thread.java:748)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@tdh2:56529/user/dispatcher#299521377]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.RemoteFencedMessage".
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
... 9 common frames omitted