(DEPRECATED) Apache Flink User Mailing List archive.

Job cluster and HA


Boris Lublinsky

Job cluster and HA

Hi,

I have been experimenting with HA lately and see that it successfully recovers the job when the JobManager restarts.

Now my question is whether it will also work for a job cluster.

Based on the instructions in https://github.com/apache/flink/blob/release-1.8/flink-container/docker/README.md

I can see in https://github.com/apache/flink/blob/release-1.8/flink-container/docker/docker-entrypoint.sh that, in this case, the following command is invoked:

exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"

Which means that if the JobManager restarts, the following is going to happen:

1. It will use HA to restore the job that was running.

2. A new job will be submitted, overwriting the restored job and bypassing checkpoint restore.
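
To make the concern concrete, here is a minimal shell sketch of how the entrypoint line forwards its arguments. The function fake_standalone_job is a hypothetical stand-in for standalone-job.sh, and the --job-classname com.example.MyJob arguments are illustrative assumptions rather than values from this thread. It only shows that every container restart re-runs the same command with the same arguments; it does not show what Flink does with them afterwards.

```shell
# Hypothetical stand-in for standalone-job.sh: prints each argument in brackets.
fake_standalone_job() {
  printf '[%s]' "$@"
  echo
}

# Mirrors the shape of the entrypoint line: a fixed sub-command followed by
# whatever arguments the container was started with ("$@" keeps them intact).
entrypoint() {
  fake_standalone_job start-foreground "$@"
}

# Assumed example arguments; a restarted container would pass the same ones.
entrypoint --job-classname com.example.MyJob
# prints: [start-foreground][--job-classname][com.example.MyJob]
```

Because the arguments are identical on every start, the job class is named again after a restart, which is what the question above is about.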

Am I missing something here?

Boris Lublinsky
FDP Architect
[hidden email]
https://www.lightbend.com/

bastien dine

Re: Job cluster and HA

Hello Boris,

I think you are confused by the name of the shell script "standalone-job.sh".

It basically means that we start a "standalone job manager", as stated in the first comment of

https://github.com/apache/flink/blob/release-1.8/flink-dist/src/main/flink-bin/bin/standalone-job.sh

This is another version of flink-dist/src/main/flink-bin/bin/jobmanager.sh

It is not related to a job.

When you configure HA on a Flink cluster and submit a job, Flink (i.e. the JobManager) stores the state of the job in ZooKeeper / HDFS.

So when it crashes and comes back (with this entrypoint), it will read from ZK / HDFS and restore the previous execution.
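
For reference, a minimal sketch of the kind of HA settings this refers to. The keys are the ZooKeeper HA options from the Flink configuration; the quorum hosts, storage path and cluster id are placeholder assumptions, not values from this thread. The snippet writes the fragment to a scratch file so it can be reviewed before being merged into flink-conf.yaml.

```shell
# Scratch file for the HA fragment (review it, then merge into flink-conf.yaml).
HA_FRAGMENT="$(mktemp)"

# Placeholder values: replace the hosts, storage path and cluster id with your own.
cat > "$HA_FRAGMENT" <<'EOF'
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.storageDir: hdfs:///flink/ha/
high-availability.cluster-id: /my-job-cluster
EOF

# Show what was written.
cat "$HA_FRAGMENT"
```

With settings along these lines, ZooKeeper holds the pointers and the storage directory holds the actual recovery metadata, which is the "ZK / HDFS" pairing mentioned above.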

Regards,

Bastien

------------------

Bastien DINE
Data Architect / Software Engineer / Sysadmin

bastiendine.io


RAMALINGESWARA RAO THOTTEMPUDI

How to run Graph algorithms

Respected All,

I am new to Apache Flink. I want to run the existing graph algorithms (the examples shipped with the Flink download) on my own data, but I cannot work out how to run those example algorithms on my input data. Kindly suggest a solution.


Boris Lublinsky

Re: Job cluster and HA

In reply to this post by bastien dine

I understand this.

What I am not sure about is the sequence:

It will come back and restore from ZK/HDFS.

And then it will try to start the job specified in the class, right?

Which will (potentially) overwrite the restored job.

How does it know not to start the job defined in the class once the previous one has been restored?

Boris Lublinsky
FDP Architect
[hidden email]
https://www.lightbend.com/


rmetzger0

Re: How to run Graph algorithms

In reply to this post by RAMALINGESWARA RAO THOTTEMPUDI

Hey,

This page explains how to run a Flink job: https://ci.apache.org/projects/flink/flink-docs-master/getting-started/tutorials/local_setup.html
