Re: Re: how to setup a ha flink cluster on k8s?


Re: Re: how to setup a ha flink cluster on k8s?

Rock
Hi Yang Wang,

Thanks for your reply. I MAY HAVE set up an HA cluster successfully. The reason I couldn't set it up before may be a bug around S3 in Flink; after switching to HDFS, I can run it successfully.

But after about one day of running, the job-manager crashes and can't recover automatically; I must re-apply the job-manager deployment manually (and that fixes the problem, my jobs auto-start), so strange...
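One thing that might help with the "crashes and never comes back" symptom (a sketch of my own, not something from Flink's docs): give the job-manager Deployment a liveness probe so Kubernetes restarts the container when the process dies or hangs. The image tag, labels, and port below are assumptions based on Flink's default REST port (8081); adjust to your setup.

```yaml
# Hypothetical excerpt of a JobManager Deployment spec.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: flink:1.9        # assumed image; use your own
        args: ["jobmanager"]
        livenessProbe:
          tcpSocket:
            port: 8081          # Flink's default REST / web UI port
          initialDelaySeconds: 30
          periodSeconds: 60
```

With a probe like this, kubelet kills and restarts the container on repeated failures instead of leaving a dead JobManager pod in place.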

Since I changed too much from the YAML in Flink's docs, I really don't know where my conf is wrong. But I have added Logback to Flink and made it send logs to my Elasticsearch cluster; maybe the logs can tell more...

------------------ Original message ------------------
From: "Yang Wang" <[hidden email]>
Sent: Tuesday, November 19, 2019, 12:05 PM
To: "vino yang" <[hidden email]>
Subject: Re: how to setup a ha flink cluster on k8s?

Hi Rock,

If you want to start an HA Flink cluster on k8s, the simplest way is to use ZK + HDFS/S3, just as with the HA configuration on Yarn. The zookeeper-operator can help start a ZK cluster. [1] Please share more information about why it did not work.
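The ZK + HDFS setup above boils down to a few flink-conf.yaml keys (a sketch; the quorum hosts, storage path, and cluster id below are placeholders for your environment):

```yaml
# Hypothetical flink-conf.yaml fragment for ZooKeeper HA (Flink 1.9-era keys).
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181
high-availability.storageDir: hdfs:///flink/ha/     # or an s3:// path
high-availability.cluster-id: /my-flink-cluster     # one id per cluster
```

ZooKeeper stores only pointers for leader election; the actual job metadata goes to the storageDir, which is why it must be a durable shared filesystem.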

If you are using a Kubernetes per-job cluster, the job can be recovered when the JM pod crashes and is restarted. [2] Savepoints can also be used for better recovery.


vino yang <[hidden email]> wrote on Sat, Nov 16, 2019 at 5:00 PM:
Hi Rock,

I searched on Google and found a blog [1] about how to configure JM HA for Flink on k8s. I don't know whether it suits your case or not; please feel free to refer to it.

Best,
Vino


Rock <[hidden email]> wrote on Sat, Nov 16, 2019 at 11:02 AM:

I'm trying to set up a Flink cluster on k8s for production use. But with the setup at https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html, the cluster is not HA: when the job-manager goes down and is rescheduled, the metadata for running jobs is lost.

I tried the ZK HA setup from https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html on k8s, but I can't get it right.

Storing jobs' metadata on k8s using a PVC or another external file system should be very easy. Is there a way to achieve it?


Re: Re: how to setup a ha flink cluster on k8s?

Yang Wang
Hi Rock,

If you correctly set the restart strategy, I think the jobmanager will fail over and be relaunched. Also, the job will be recovered; please share more jobmanager logs if you want.
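For reference, a restart strategy is set in flink-conf.yaml (a sketch; the attempt count and delay below are example values, not recommendations):

```yaml
# Hypothetical flink-conf.yaml fragment enabling a fixed-delay restart strategy.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 10   # give up after 10 failed restarts
restart-strategy.fixed-delay.delay: 10 s    # wait between restart attempts
```

Without an explicit strategy (and without checkpointing enabled, which switches the default to fixed-delay), a failed job is simply declared failed rather than restarted.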


Best,
Yang

Rock <[hidden email]> wrote on Wed, Nov 20, 2019 at 2:57 PM: