(DEPRECATED) Apache Flink User Mailing List archive.

Flink HA for Job Cluster

KristoffSC

Flink HA for Job Cluster

Hi,
In [1] we can find the setup for Standalone and YARN clusters to achieve
JobManager HA.

Is Standalone Cluster High Availability with ZooKeeper the same approach
as for a Docker Job Cluster running on Kubernetes?

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html

Thanks,
Krzysztof

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

tison

Re: Flink HA for Job Cluster

Hi Krzysztof,

Flink doesn't provide JM HA by itself yet.

For a YARN deployment, you can rely on the yarn.application-attempts configuration [1];
for a Kubernetes deployment, Flink relies on a Kubernetes Deployment to restart a failed JM.

Note, though, that such a standalone mode doesn't tolerate JM failure; the strategies above
just restart the application, which means all tasks will be killed and restarted.

Best,
tison.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html#configuration-1
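
For readers following along, the YARN setting mentioned above lives in flink-conf.yaml. The fragment below is a minimal sketch; the value 10 is only illustrative and should be tuned per cluster.

```yaml
# flink-conf.yaml (sketch; the value is illustrative)
# Maximum number of times YARN may restart the application master
# that hosts the JobManager.
yarn.application-attempts: 10
```

YARN applies its own upper bound through yarn.resourcemanager.am.max-attempts in yarn-site.xml, so the effective number of attempts cannot exceed that value.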

In reply to the post by KristoffSC <[hidden email]> of Fri, Feb 7, 2020, 11:34 PM.

Yang Wang

Re: Flink HA for Job Cluster

As tison said, you can use a Kubernetes Deployment to restart the JobManager pod. However,
if you want all jobs to be able to recover from their checkpoints, you also need ZooKeeper
and HDFS/S3 to store the high-availability data.

Kubernetes-native HA support is also planned [1]. Once that is available, you will no longer
depend on ZooKeeper.

[1] https://issues.apache.org/jira/browse/FLINK-12884
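
The "Deployment restarts the JobManager pod" idea can be sketched as a Kubernetes manifest. Everything specific here is an assumption for illustration: the resource name, labels, image, and job class are hypothetical, and the container arguments follow the job-cluster entrypoint convention of the Flink 1.9 container image as far as I recall it, so check them against the image you actually build.

```yaml
# Sketch only: names, labels, image, and job class are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-job-cluster
spec:
  replicas: 1                      # one JobManager; Kubernetes recreates the pod if it dies
  selector:
    matchLabels:
      app: flink
      component: job-cluster
  template:
    metadata:
      labels:
        app: flink
        component: job-cluster
    spec:
      containers:
        - name: flink-job-cluster
          image: example/flink-job:latest   # hypothetical image with the job jar baked in
          args:
            - "job-cluster"                 # assumed entrypoint mode of the job-cluster image
            - "--job-classname"
            - "com.example.MyJob"           # hypothetical entry class
          ports:
            - containerPort: 6123           # JobManager RPC (Flink default)
            - containerPort: 8081           # web UI / REST (Flink default)
```

On its own this only restarts the process. Recovering the job from its latest checkpoint additionally needs the ZooKeeper and HDFS/S3 settings described in this message.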

In reply to the post by tison <[hidden email]> of Mon, Feb 10, 2020, 8:59 AM.

KristoffSC

Re: Flink HA for Job Cluster

Thank you both for the answers.

I just want to make sure I have this right.

Can I achieve HA for the Job Cluster Docker setup by configuring the ZooKeeper quorum
as described in [1] (with S3 and ZooKeeper)?

I assume I need to modify the default Job Cluster config to match the setup in [1].

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html
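
A minimal sketch of the ZooKeeper-plus-S3 setup asked about here, written as a flink-conf.yaml fragment. The keys are the ones from the HA documentation in [1]; the host names, bucket, and path values are hypothetical placeholders, and the S3 path assumes an S3 filesystem implementation is available in the Flink image.

```yaml
# flink-conf.yaml (sketch; hosts, bucket, and ids are placeholders)
high-availability: zookeeper

# ZooKeeper quorum used for leader election and for pointers to job metadata.
high-availability.zookeeper.quorum: zk-0.example:2181,zk-1.example:2181,zk-2.example:2181

# Root znode for Flink and a per-cluster id, so several clusters can share one quorum.
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /my-job-cluster

# Durable storage for JobManager metadata; ZooKeeper only stores references to it.
high-availability.storageDir: s3://example-bucket/flink/ha
```

With these settings a restarted JobManager can look up the latest completed checkpoint instead of starting the job from scratch, provided checkpointing is enabled for the job.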


KristoffSC

Re: Flink HA for Job Cluster

In reply to this post by Yang Wang