Howdy folks,
I'm attempting to get Flink running in a Kubernetes cluster, with the ultimate goal of using GCS for checkpoints and savepoints. I've deployed with the Helm chart and followed this guide, modified for 1.6.0: https://data-artisans.com/blog/getting-started-with-da-platform-on-google-kubernetes-engine

I've built a container image that adds these two jars:

  flink-shaded-hadoop2-uber-1.6.0.jar
  gcs-connector-hadoop2-latest.jar

to /opt/flink/lib. To test, I've been running WordCount.jar with the input and output flags pointing at a GCS bucket. I've verified that both jars show up in the classpath in the logs, but when I run the job it throws the following:

  flink-flink-jobmanager-9766f9b4c-kfkk5: Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'gs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.api.common.io.FileOutputFormat.initializeGlobal(FileOutputFormat.java:275)
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:89)
  flink-flink-jobmanager-9766f9b4c-kfkk5:   at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:216)

I've been wrestling with this a fair bit. Eventually I built an image with core-site.xml and the GCS service-account key in /opt/flink/etc-hadoop and set HADOOP_CONF_DIR to point there. With that image, running in standalone mode via the start-cluster.sh script works just fine, and I can reproduce that both in Kubernetes and locally with Docker. But if I start the JobManager and the TaskManager individually using their respective scripts, I get the error above.

Oddly, I see the same split when running locally outside a container: with start-cluster.sh my WordCount test works fine, but if I start the JobManager and TaskManager processes via their own scripts, I can read the input file from GCS yet get a 403 when trying to write the output.

I've no idea how to proceed with troubleshooting this, as I'm a newbie to Flink; some direction would be helpful.
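In case it's useful, here's roughly how the image is built (the base image tag and the key filename are placeholders, not necessarily my exact values):

  FROM flink:1.6.0

  # Hadoop uber jar and the GCS connector, so they land on Flink's classpath
  COPY flink-shaded-hadoop2-uber-1.6.0.jar /opt/flink/lib/
  COPY gcs-connector-hadoop2-latest.jar /opt/flink/lib/

  # Hadoop config and the service-account key the connector authenticates with
  COPY core-site.xml /opt/flink/etc-hadoop/
  COPY key.json /opt/flink/etc-hadoop/

  ENV HADOOP_CONF_DIR /opt/flink/etc-hadoop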
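The core-site.xml follows the GCS connector docs; the project ID and key path below are placeholders:

  <configuration>
    <property>
      <name>fs.gs.impl</name>
      <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
    </property>
    <property>
      <name>fs.AbstractFileSystem.gs.impl</name>
      <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
    </property>
    <property>
      <name>fs.gs.project.id</name>
      <value>my-gcp-project</value>
    </property>
    <property>
      <name>google.cloud.auth.service.account.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>google.cloud.auth.service.account.json.keyfile</name>
      <value>/opt/flink/etc-hadoop/key.json</value>
    </property>
  </configuration>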
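Concretely, the working and failing invocations look like this (bucket name is a placeholder):

  # same Hadoop config either way
  export HADOOP_CONF_DIR=/opt/flink/etc-hadoop

  # works, both in Docker/Kubernetes and locally:
  ./bin/start-cluster.sh

  # fails (the 'gs' scheme error, or the 403 on write when local):
  ./bin/jobmanager.sh start
  ./bin/taskmanager.sh start

  # the test job in both cases:
  ./bin/flink run ./examples/batch/WordCount.jar \
      --input gs://my-bucket/input.txt \
      --output gs://my-bucket/wordcount-out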
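And for completeness, the checkpoint/savepoint setup I'm ultimately aiming for would be something along these lines in flink-conf.yaml (again, bucket is a placeholder):

  state.backend: filesystem
  state.checkpoints.dir: gs://my-bucket/checkpoints
  state.savepoints.dir: gs://my-bucket/savepoints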
Cheers,

Heath Albritton