HA and zookeeper

HA and zookeeper

Boris Lublinsky
For the HA implementation, is ZooKeeper used only for leader election, or does it also store data relevant for switching to the backup server?
Boris Lublinsky
FDP Architect
[hidden email]
https://www.lightbend.com/

Re: HA and zookeeper

Fabian Hueske-2
Hi Boris,

ZooKeeper is also used by the JobManager (JM) to store metadata about the running job.
The JM writes information such as the JobGraph, the JAR file, and checkpoint metadata to persistent storage (HDFS, S3, ...) and stores a pointer to this information in ZooKeeper.
During recovery, the new JM looks up the pointer in ZooKeeper and fetches the job metadata from the persistent storage.
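
The pattern described above (large payload in persistent storage, small pointer in ZooKeeper) can be illustrated with a minimal sketch. This is not Flink code: `PointerStore` is an in-memory stand-in for ZooKeeper, a temporary directory stands in for HDFS/S3, and all class names and the path `/flink/jobgraphs/job-1` are hypothetical, chosen only for illustration.

```python
import pickle
import tempfile
import uuid
from pathlib import Path


class PointerStore:
    """In-memory stand-in for ZooKeeper: maps a path to a short string."""

    def __init__(self):
        self._nodes = {}

    def set(self, path, value):
        self._nodes[path] = value

    def get(self, path):
        return self._nodes[path]


class HaStateStore:
    """Hypothetical helper: payload goes to storage, pointer goes to the pointer store."""

    def __init__(self, storage_dir, pointers):
        self.storage_dir = Path(storage_dir)
        self.pointers = pointers

    def store(self, key, state):
        # Write the (potentially large) payload to persistent storage first.
        blob = self.storage_dir / f"{uuid.uuid4().hex}.bin"
        blob.write_bytes(pickle.dumps(state))
        # Publish only the small pointer; in this sketch it is set after the
        # payload exists, so a reader never sees a dangling pointer.
        self.pointers.set(key, str(blob))
        return blob

    def recover(self, key):
        # A new leader looks up the pointer, then fetches the payload.
        blob = Path(self.pointers.get(key))
        return pickle.loads(blob.read_bytes())


def demo():
    with tempfile.TemporaryDirectory() as storage_dir:
        pointers = PointerStore()
        key = "/flink/jobgraphs/job-1"  # hypothetical path

        old_jm = HaStateStore(storage_dir, pointers)
        old_jm.store(key, {"job": "job-1", "vertices": ["source", "map", "sink"]})

        # Simulate failover: a fresh instance shares only storage and pointers.
        new_jm = HaStateStore(storage_dir, pointers)
        return new_jm.recover(key)


if __name__ == "__main__":
    print(demo())
```

The point of the split is that the coordination service only ever holds a short string, while the bulky job metadata lives in storage built for large objects.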

Best, Fabian

On Sat, Apr 6, 2019 at 01:28, Boris Lublinsky <[hidden email]> wrote:
For HA implementation, is zookeeper is used only for leader selection, or it also stores some data relevant for switching to backup server

Re: HA and zookeeper

Boris Lublinsky
Thanks.
Is there:
1. Documentation describing this?
2. Any proposals or work on storing it elsewhere?

The reason for this question is Kubernetes deployment, where using ZooKeeper seems like overkill, but Flink HA will not work without it; see https://jobs.zalando.com/tech/blog/running-apache-flink-on-kubernetes/?gh_src=4n3gxh1

On Apr 8, 2019, at 10:29 AM, Fabian Hueske <[hidden email]> wrote:

Hi Boris,

ZooKeeper is also used by the JobManager to store metadata about the running job.
The JM writes information like the JobGraph, JAR file, checkpoint metadata to a persistent storage (like HDFS, S3, ...) and a pointer to this information to ZooKeeper.
In case of a recovery, the new JM looks up the pointer from ZooKeeper and fetches the job metadata from the persistent storage.

Best, Fabian

Re: HA and zookeeper

Konstantin Knauf-2
Hi Boris,

I am not aware of documentation describing this in detail. There is an open JIRA for a High Availability Service based on etcd [1].

Cheers,

Konstantin


On Mon, Apr 8, 2019 at 3:20 PM Boris Lublinsky <[hidden email]> wrote:
Thanks.
Is there:
1.  Documentation, describing this?
2. Any proposals/work trying to store it elsewhere?

The reason for this question is kubernetes deployment, where the use of zookeeper seems an overkill, but it will not work without zookeeper, see https://jobs.zalando.com/tech/blog/running-apache-flink-on-kubernetes/?gh_src=4n3gxh1

--

Konstantin Knauf | Solutions Architect

Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
