I'm trying to use readTextFile() to access files in S3. I have verified that the S3 key and secret are clean and that the S3 path is similar to com.somepath/somefile (names changed to protect the guilty). Any idea what I'm missing?

2021-01-13 12:12:43,836 DEBUG org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction [] - Opened ContinuousFileMonitoringFunction (taskIdx= 0) for path: s3://com.somepath/somefile
2021-01-13 12:12:43,843 DEBUG org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory [] - Creating S3 file system backed by Hadoop s3a file system
2021-01-13 12:12:43,844 DEBUG org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory [] - Loading Hadoop configuration for Hadoop s3a file system
2021-01-13 12:12:43,926 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.access-key as fs.s3a.access-key to Hadoop config
2021-01-13 12:12:43,926 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.secret-key as fs.s3a.secret-key to Hadoop config
2021-01-13 12:12:43,944 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask [] - Invoking Split Reader: Custom File Source -> (Timestamps/Watermarks, Map -> Filter -> Sink: Unnamed) (1/1)#0
2021-01-13 12:12:43,944 DEBUG org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Creating operator state backend for TimestampsAndWatermarksOperator_1cf40e099136da16c66c61032de62905_(1/1) with empty state.
2021-01-13 12:12:43,946 DEBUG org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Creating operator state backend for StreamSink_d91236bbbed306c2379eac4982246f1f_(1/1) with empty state.
2021-01-13 12:12:43,955 DEBUG org.apache.hadoop.conf.Configuration [] - Reloading 1 existing configurations
2021-01-13 12:12:43,961 DEBUG org.apache.flink.fs.s3hadoop.S3FileSystemFactory [] - Using scheme s3://com.somepath/somefile for s3a file system backing the S3 File System
2021-01-13 12:12:43,965 DEBUG org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction [] - Closed File Monitoring Source for path: s3://com.somepath/somefile.
2021-01-13 12:12:43,967 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: Custom File Source (1/1)#0 (1d75ae07abbd65f296c55a61a400c59f) switched from RUNNING to FAILED.
    at org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory.create(AbstractS3FileSystemFactory.java:163) ~[blob_p-e297dae3da73ba51c20f14193b5ae6e09694422a-293a7d95166eee9a9b2329b71764cf67:?]
    at org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:61) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:468) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:389) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction.run(ContinuousFileMonitoringFunction.java:196) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:215) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
Caused by: java.lang.NullPointerException: null uri host. This can be caused by unencoded / in the password string
    at java.util.Objects.requireNonNull(Objects.java:246) ~[?:?]
    at org.apache.hadoop.fs.s3native.S3xLoginHelper.buildFSURI(S3xLoginHelper.java:69) ~[blob_p-e297dae3da73ba51c20f14193b5ae6e09694422a-293a7d95166eee9a9b2329b71764cf67:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.setUri(S3AFileSystem.java:467) ~[blob_p-e297dae3da73ba51c20f14193b5ae6e09694422a-293a7d95166eee9a9b2329b71764cf67:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:234) ~[blob_p-e297dae3da73ba51c20f14193b5ae6e09694422a-293a7d95166eee9a9b2329b71764cf67:?]
    at org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory.create(AbstractS3FileSystemFactory.java:126) ~[blob_p-e297dae3da73ba51c20f14193b5ae6e09694422a-293a7d95166eee9a9b2329b71764cf67:?]
    ... 7 more
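For reference, the relevant part of the job looks roughly like the sketch below. The bucket and path are the redacted placeholders from above, the map step stands in for the real Map -> Filter pipeline, and the credentials are set in flink-conf.yaml via s3.access-key / s3.secret-key, which is what the HadoopConfigLoader DEBUG lines show being mirrored to fs.s3a.*.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3ReadJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder path -- the real bucket and key are redacted above.
        // Credentials are not set here; they come from flink-conf.yaml
        // (s3.access-key / s3.secret-key) and are picked up by the s3 plugin.
        env.readTextFile("s3://com.somepath/somefile")
           .map(String::length)   // stand-in for the real Map -> Filter -> Sink pipeline
           .print();

        env.execute("read-from-s3");
    }
}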
Hi Billy,

I think you might be hitting the same problem as described in this thread [1]. Does your bucket meet all of the naming requirements described here [2] (e.g. does it contain an underscore)?

Best,
Dawid

[2] https://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
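PS: A rough sanity check against the rules in [2] could look like the sketch below. It only covers the basic character and length restrictions, not the full documented list, so treat it as a quick filter rather than a complete validator.

public class BucketNameCheck {
    // Rough subset of the S3 bucket naming rules from [2]:
    // 3-63 characters, lowercase letters, digits, dots and hyphens,
    // no consecutive dots, must start and end with a letter or digit.
    static boolean looksValid(String name) {
        return name.matches("(?!.*\\.\\.)[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");
    }

    public static void main(String[] args) {
        System.out.println(looksValid("com.somepath")); // true -- dots are allowed by the rules
        System.out.println(looksValid("my_bucket"));    // false -- underscores are not allowed
    }
}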
Dawid,

We found the issue. Our bucket had periods in the name (com.this.bucket.fails). Recreating the bucket with dashes instead of periods (com-this-bucket-succeeds) solved it.

This seems crazy, but the bucket naming guidelines are clear:

"For best compatibility, we recommend that you avoid using dots (.) in bucket names, except for buckets that are used only for static website hosting. If you include dots in a bucket's name, you can't use virtual-host-style addressing over HTTPS, unless you perform your own certificate validation. This is because the security certificates used for virtual hosting of buckets don't work for buckets with dots in their names."
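PS: For anyone who hits this and cannot rename the bucket, forcing path-style access for the s3a filesystem might be worth a try, since the certificate limitation quoted above only applies to virtual-host-style addressing. I have not verified that it avoids the exact URI error in the log; it assumes the Flink S3 plugin mirrors s3.* keys to fs.s3a.* the same way it does for the credentials, as shown in the DEBUG lines.

  # flink-conf.yaml (assumption: mirrored to fs.s3a.path.style.access by the s3a plugin)
  s3.path.style.access: true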