Hi,
We are currently starting to test Flink running on YARN. Until now, we've been testing on a standalone cluster. One thing lacking in standalone mode is that we have to manually restart a Task Manager if it dies. I looked at
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#yarn-cluster-high-availability and see that YARN handles HA for the Job Manager. How does it handle a Task Manager that dies? I would like a failed Task Manager to be handled the same way as a failed Job Manager. For example, let's say I have a cluster with two Task Managers, and one of them dies. Will YARN restart the dead Task Manager, or would that need to be a manual restart?
What would actually happen in the above case?
Thanks,
Hayden