HA HDFS

Posted by Steven Nelson on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/HA-HDFS-tp25947.html

I am working on a proof-of-concept High Availability installation of Flink on top of Kubernetes, with HDFS as the data storage location. I am not finding much documentation on doing this; what I do find is in pieces, and I am not sure I am putting it together correctly. I think the problem falls somewhere between being an HDFS thing and a Flink thing.

I am deploying to Kubernetes using the flink:1.7.0-hadoop27-scala_2.11 container from Docker Hub.

I think these are the things I need to do:
2) Set the HADOOP_CONF_DIR environment variable to the location of that file per https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#hdfs
3) Create a flink-conf.yaml file that looks something like
        fs.default-scheme: hdfs://
        state.backend: rocksdb
        state.savepoints.dir: hdfs://flink/savepoints
        state.checkpoints.dir: hdfs://flink/checkpoints
4) Dance a little jig when it works.
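
A quick way to sanity-check the paths in step 3 is to look at how they split as URIs. The sketch below is only an illustration: it uses Python's standard-library URI parser, on the assumption that it splits these simple URIs the same way Hadoop does, and split_hdfs_uri is a hypothetical helper name, not part of Flink or Hadoop. It suggests that in hdfs://flink/savepoints the word "flink" is read as the authority (the NameNode host or nameservice) rather than as a directory, whereas hdfs:///flink/savepoints leaves the authority empty so that the default from the Hadoop configuration would apply.

```python
from urllib.parse import urlsplit


def split_hdfs_uri(uri):
    """Split a URI into (scheme, authority, path).

    Hypothetical helper for illustration only; it relies on generic
    URI syntax, not on any Flink or Hadoop API.
    """
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    # Two slashes: the first segment becomes the authority.
    # Three slashes: the authority is empty and the whole thing is a path.
    for uri in ("hdfs://flink/savepoints", "hdfs:///flink/savepoints"):
        scheme, authority, path = split_hdfs_uri(uri)
        print(f"{uri}: scheme={scheme!r} authority={authority!r} path={path!r}")
```

If the intent is a /flink directory on the default file system, the three-slash form (or an explicit NameNode or nameservice name after hdfs://) may be what is needed, but that is worth confirming against the cluster's own configuration.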

Has anyone set this up? If so, am I missing anything?

-Steve