Flink HA for Job Cluster


Flink HA for Job Cluster

KristoffSC
Hi,
In [1] we can find the setup for Standalone and YARN clusters to achieve
JobManager HA.

Is Standalone Cluster High Availability with ZooKeeper the same approach
to use for a Docker Job Cluster on Kubernetes?

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html

Thanks,
Krzysztof



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink HA for Job Cluster

tison
Hi Krzysztof,

Flink doesn't provide JM HA itself yet.

For a YARN deployment, you can rely on the yarn.application-attempts configuration [1];
for a Kubernetes deployment, Flink relies on a Kubernetes Deployment to restart a failed JM.

However, standalone mode by itself doesn't tolerate JM failure, and the strategies above
just restart the application, which means all tasks will be killed and restarted.
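
For the YARN case, a minimal sketch of how the attempt count works out, assuming the value requested in flink-conf.yaml is capped by the cluster-wide yarn.resourcemanager.am.max-attempts setting in yarn-site.xml. The function name and the numbers are made up for illustration; only the two setting names come from Flink and YARN.

```python
# Sketch: how many application-master attempts YARN will actually allow.
# Both key names are real settings; effective_attempts() and all values
# used with it are hypothetical and only illustrate the capping rule.

def effective_attempts(flink_conf: dict, yarn_site: dict) -> int:
    """Return the attempt count after applying the YARN cluster-wide cap."""
    requested = int(flink_conf["yarn.application-attempts"])
    cluster_cap = int(yarn_site["yarn.resourcemanager.am.max-attempts"])
    # YARN never grants more attempts than its own upper limit.
    return min(requested, cluster_cap)


if __name__ == "__main__":
    flink_conf = {"yarn.application-attempts": "10"}            # illustrative
    yarn_site = {"yarn.resourcemanager.am.max-attempts": "4"}   # illustrative
    print(effective_attempts(flink_conf, yarn_site))
```

Each attempt still restarts the whole application, so this only controls how often YARN retries, not whether tasks survive a JM failure.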



(In reply to KristoffSC <[hidden email]>, Fri, Feb 7, 2020 at 11:34 PM)

Re: Flink HA for Job Cluster

Yang Wang
As tison said, you can use a Kubernetes Deployment to restart the JobManager pod. However,
if you want all jobs to recover from their latest checkpoint, you also need ZooKeeper and
HDFS/S3 to store the high-availability data.

Also, Kubernetes-native HA support is planned [1]. Once that is available, you will no
longer depend on ZooKeeper.
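
As a rough sketch of what the ZooKeeper plus HDFS/S3 setup means in terms of configuration, the snippet below renders the high-availability entries for flink-conf.yaml. The key names are the ones from the Flink 1.9 HA documentation; the ZooKeeper hosts, the S3 bucket, the cluster id, and the helper function are placeholders, not values from this thread.

```python
# Sketch: render ZooKeeper HA entries for flink-conf.yaml.
# Key names follow the Flink 1.9 HA docs; every value is a placeholder
# and render_flink_conf() is a hypothetical helper.

HA_SETTINGS = {
    "high-availability": "zookeeper",
    "high-availability.zookeeper.quorum": "zk-0:2181,zk-1:2181,zk-2:2181",
    "high-availability.storageDir": "s3://my-bucket/flink/ha",
    "high-availability.cluster-id": "/my-job-cluster",
}


def render_flink_conf(settings: dict) -> str:
    """Return one 'key: value' line per setting, in insertion order."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    print(render_flink_conf(HA_SETTINGS), end="")
```

The rendered lines would be appended to the flink-conf.yaml baked into (or mounted into) the job cluster image, so that a restarted JobManager pod can find the checkpoint metadata again.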


(In reply to tison <[hidden email]>, Mon, Feb 10, 2020 at 8:59 AM)

Re: Flink HA for Job Cluster

KristoffSC
Thank you both for the answers.

So I just want to get this right.

Can I achieve HA for the Docker Job Cluster setup by configuring the ZooKeeper quorum
as described in [1] (with S3 and ZooKeeper)?

I assume I need to modify the default Job Cluster config to match the setup in [1].

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html




