ZooKeeper connection SUSPENDING

Posted by Kenzyme on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/ZooKeeper-connection-SUSPENDING-tp38779.html

Hi,

This is related to https://mail-archives.apache.org/mod_mbox/flink-dev/201709.mbox/%3CCA+faj9yvPyzmmLoEWAMPgXDP6kx+0oed1Z5k4s3K9sgiCFyb=w@...%3E and https://issues.apache.org/jira/browse/FLINK-10052. Is there a way to prevent Flink instances from failing during a rolling restart of the ZK followers, given that the quorum is maintained the whole time?

This is what the Flink logs showed while ZK was restarting:
ZooKeeper connection SUSPENDING. Changes to the submitted job graphs are not monitored (temporarily).

I was able to reproduce this twice with a 5-node ZK ensemble while doing some ZK maintenance.

Thanks!

Kenzyme Le