keep-alive job strategy
Posted by Rob on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/keep-alive-job-strategy-tp16062.html
Hello,
I have set up a cluster and added TaskManagers manually with bin/taskmanager.sh start.
I noticed that if I have 5 TaskManagers with one slot each and start a job with -p5, the job fails as soon as I stop one TaskManager, even though 4 TaskManagers are still running.
Is this expected? (I turned off the restart strategy.)
So is the way to ensure continuous operation of a single "job" to have, e.g., 10 TMs and deploy 10 job instances to fill each of the 10 slots?
Or, if I have a job that requires -p3, for example, should I always keep at least 3 TMs alive?
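To make the slot arithmetic behind my question concrete, here is a minimal sketch in plain Python. It is not Flink API; the function name can_run is made up for illustration. It assumes a job started with parallelism p needs at least p free slots in total, which is my understanding. It only counts slots and says nothing about whether an already running job survives the loss of a TaskManager.

```python
def can_run(parallelism: int, task_managers: int, slots_per_tm: int = 1) -> bool:
    """Hypothetical helper: True if the cluster offers at least `parallelism` slots.

    Assumption: a job with parallelism p needs p slots in total.
    """
    total_slots = task_managers * slots_per_tm
    return total_slots >= parallelism


# My setup: 5 TMs with one slot each, job started with -p5.
print(can_run(parallelism=5, task_managers=5))  # enough slots

# After stopping one TM, only 4 slots remain for a -p5 job.
print(can_run(parallelism=5, task_managers=4))  # not enough slots

# A -p3 job with 4 TMs alive would still have one spare slot.
print(can_run(parallelism=3, task_managers=4))
```

Under that assumption, a -p5 job no longer fits on 4 single-slot TMs, while a -p3 job would.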
Many thanks!
-Rob