(DEPRECATED) Apache Flink User Mailing List archive.

Job manager high availability job restarting

Giselle van Dongen

Job manager high availability job restarting

Hi!

We are running a highly available Flink cluster in standalone mode with ZooKeeper, using 2 JobManagers and 5 TaskManagers.

When the active JobManager is killed, the standby JobManager takes over, but the job is also restarted.

Is this the default behavior, and can we avoid job restarts on JobManager failure in some way?

Thank you,

Giselle

Yang Wang

Re: Job manager high availability job restarting

I think this is the expected behavior. When the active JobManager loses leadership, the standby one will try to take over and recover the job from the latest successful checkpoint.

High availability only helps with leader election/retrieval and HA metadata storage (e.g. job graphs and checkpoints).

It cannot avoid job restarts on JobManager failure.
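
For illustration, a minimal flink-conf.yaml sketch of the pieces mentioned above: ZooKeeper-based leader election, HA metadata storage, and periodic checkpoints for the new leader to recover from. The option keys are standard Flink settings, but the ZooKeeper hosts, the cluster id, the HDFS paths, and the 60s interval are placeholder assumptions, not values from this thread.

```yaml
# flink-conf.yaml (sketch; hosts, paths, cluster id and interval are placeholders)

# Leader election/retrieval via ZooKeeper
high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.cluster-id: /my_cluster

# Durable storage for HA metadata (job graphs, checkpoint pointers)
high-availability.storageDir: hdfs:///flink/ha/

# Periodic checkpoints, so a restarted job resumes from the latest
# completed checkpoint instead of starting with empty state
execution.checkpointing.interval: 60s
state.backend: filesystem
state.checkpoints.dir: hdfs:///flink/checkpoints
```

With a setup along these lines, the job is still restarted after a JobManager failover, but it should resume from the last completed checkpoint, so a shorter checkpoint interval mainly reduces how much work is replayed.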

Best,

Yang

On Wed, Jan 6, 2021 at 6:23 AM, Giselle van Dongen <[hidden email]> wrote:

Hi!

We are running a highly available Flink cluster in standalone mode with ZooKeeper, using 2 JobManagers and 5 TaskManagers.

When the active JobManager is killed, the standby JobManager takes over, but the job is also restarted.

Is this the default behavior, and can we avoid job restarts on JobManager failure in some way?

Thank you,

Giselle