Hi Cam,
The Flink master (JobManager) should not die when it gets disconnected from task managers.
It may exit in the following cases:
1. The job terminates (FINISHED/FAILED/CANCELED). If your job is configured with no restart retries, a TM failure can cause the job to become FAILED.
2. The JM loses HA leadership, e.g. it loses its connection to ZK.
3. The JM encounters some other unexpected fatal error. In this case we need to check the log to see what happened.
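For case 1, here is a minimal sketch of a restart strategy in flink-conf.yaml, so that a single TM failure does not immediately fail the job. It assumes a fixed-delay strategy suits your job; the attempt count and delay are illustrative values only, not a recommendation:

```
# flink-conf.yaml (illustrative values, adjust for your job)
restart-strategy: fixed-delay
# number of restart attempts before the job is declared FAILED
restart-strategy.fixed-delay.attempts: 3
# wait time between two consecutive restart attempts
restart-strategy.fixed-delay.delay: 10 s
```

With a setup like this, the job should go FAILED only after the configured attempts are used up.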
Thanks,
Zhu Zhu
Hello Flink experts,
We are running Flink on Kubernetes and see that the Job Manager dies/restarts whenever a Task Manager dies/restarts or the two cannot connect to each other. Are there any specific configurations/parameters that we need to turn on to stop this? Or is this expected?
Thanks,
Cam