Hello!
I am working on setting up a new Flink cluster that stores its checkpoints in an HDFS cluster deployed to the same Kubernetes cluster.
I am running into problems with the dependencies required to use the hdfs:// storage location.
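For context, checkpointing is configured in the job roughly like this (the namenode address, port, and path below are placeholders, not my exact values):

import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment
// Checkpoint every 60 seconds into the in-cluster HDFS
env.enableCheckpointing(60000)
env.setStateBackend(new FsStateBackend("hdfs://hdfs-namenode:8020/flink/checkpoints"))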
The exception I am getting is:
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
at org.apache.flink.runtime.state.filesystem.FsCheckpointStorage.<init>(FsCheckpointStorage.java:58)
at org.apache.flink.runtime.state.filesystem.FsStateBackend.createCheckpointStorage(FsStateBackend.java:444)
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:249)
... 17 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:64)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
... 22 more
Applicable portions of my build.sbt:
val flinkVersion = "1.7.0"

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % flinkVersion % Provided,
  "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % Provided,
  "org.apache.flink" % "flink-hadoop-fs" % flinkVersion,
  "org.apache.flink" %% "flink-test-utils" % flinkVersion % Test,
  ("org.apache.flink" %% "flink-connector-kinesis" % flinkVersion)
    .exclude("org.apache.flink", "force-shading")
  // ... other dependencies omitted
)
I have attempted to add:
val hadoopVersion = "2.7.2"
"org.apache.hadoop" % "hadoop-hdfs" % hadoopVersion,
"org.apache.hadoop" % "hadoop-common" % hadoopVersion,
But I am getting a lot of version conflicts when I try to package the jar.
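Is the right direction to exclude the conflicting transitive dependencies by hand, something like the sketch below? (This is what I was considering trying next; the javax.servlet exclusion is just an example, not a confirmed culprit.)

val hadoopVersion = "2.7.2"

libraryDependencies ++= Seq(
  ("org.apache.hadoop" % "hadoop-hdfs" % hadoopVersion)
    .excludeAll(ExclusionRule(organization = "javax.servlet")),
  ("org.apache.hadoop" % "hadoop-common" % hadoopVersion)
    .excludeAll(ExclusionRule(organization = "javax.servlet"))
)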
Thoughts?
-Steve