|
Scenario
* A savepoint with cancel, followed by a restore of the job. This brings down the JM and relaunches it on a different IP, so the DNS name now resolves to a new IP.
* The TM deployment is not rolled (recreated).
* Note that `metrics.internal.query-service.port` in `flink-conf.yaml` is hardcoded.
Remote connection to [null] failed with org.apache.flink.shaded.akka.org.jboss.netty.channel.ConnectTimeoutException: connection timed out: [dns]/ 172.17.6.135:6666
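For reference, the hardcoded setting from the note above would look roughly like this in `flink-conf.yaml`. This is only a sketch: it assumes the port is the 6666 seen in the error, since the actual configured value is not stated here.

```yaml
# flink-conf.yaml (fragment)
# Assumption: 6666 is taken from the address in the error above,
# not from the actual configuration of this cluster.
metrics.internal.query-service.port: 6666
```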
Solution: Restart the TM deployment (though that should not be necessary, and it will cause latency issues on a shared resource manager such as k8s).
PS: I am sure that a cancel/restart, or a restart of the JM because of any issue, will cause the same problem (not tested).
Regards
|