Hello, After following the instructions to set the S3 filesystem in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem) I encountered the following error:
The documentation goes on to say “If your job submission fails with an Exception message noting that No file system found with scheme s3 this means that no FileSystem has been configured for S3. Please check out the FileSystem Configuration section for details on how to configure this properly."After checking over the configuration the error persisted. My configuration is as follows. I am using the docker image flink:1.3.1, with command: local # flink --version Version: 1.3.1, Commit ID: 1ca6e5b # cat flink/config/flink-conf.yaml | head -n1 fs.hdfs.hadoopconf: /root/hadoop-config The rest of the content of flink-conf.yaml is identical to the release version. The following was added to /root/hadoop-config/core-site.xml, I understand this is used internally by flink as configuration for “org.apache.hadoop.fs.s3a.S3AFileSystem” I’ve removed my AWS access key and secret for obvious reasons, they are present in the actual file ;-) # cat /root/hadoop-config/core-site.xml <configuration> <property> <name>fs.s3.impl</name> <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value> </property> <property> <name>fs.s3a.buffer.dir</name> <value>/tmp</value> </property> <property> <name>fs.s3a.access.key</name> <value>MY_ACCESS_KEY</value> </property> <property> <name>fs.s3a.secret.key</name> <value>MY_SECRET_KEY</value> </property> </configuration> The JAR’s aws-java-sdk-1.7.4.jar, hadoop-aws-2.7.4.jar, httpclient-4.2.5.jar, httpcore-4.2.5.jar where added to flink/lib/ from http://apache.mirror.anlx.net/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz # ls flink/lib/ aws-java-sdk-1.7.4.jar flink-dist_2.11-1.3.1.jar flink-python_2.11-1.3.1.jar flink-shaded-hadoop2-uber-1.3.1.jar hadoop-aws-2.7.4.jar httpclient-4.2.5.jar httpcore-4.2.5.jar log4j-1.2.17.jar slf4j-log4j12-1.7.7.jar I’m using the streaming api, with the following example: // Set StreamExecutionEnvironment final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); // Set checkpoints in ms env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); // Add source (input stream) DataStream<String> dataStream = StreamUtil.getDataStream(env, params); // Sink to S3 Bucket dataStream.writeAsText("<a href="s3://test-flink/test.txt" class="">s3://test-flink/test.txt").setParallelism(1); pom.xml has the following build dependencies. <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-filesystem_2.10</artifactId> <version>1.3.1</version> </dependency> <dependency> <groupId>org.apache.hadoop</groupId> <artifactId>hadoop-aws</artifactId> <version>2.7.2</version> </dependency> Would anybody be able to spare some time to help me resolve my problem? I'm sure I’m missing something simple here. Thanks :-)
|
Shouldn't the config key be : org.apache.hadoop.fs.s3.S3FileSystem Cheers On Fri, Aug 11, 2017 at 5:38 PM, ant burton <[hidden email]> wrote:
|
In reply to this post by ant burton
Hi, The config should be fs.s3a.impl instead of fs.s3.impl Also when you are providing the S3 write path in config file or directly in code start with s3a://<path_to_write_data> Regards, Vinay Patil On Sat, Aug 12, 2017 at 6:07 AM, ant burton [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |