Hi,

New to both AWS and Flink, but I currently need to write incoming data into an S3 bucket managed via AWS temporary credentials. I am unable to get this to work, and I am not entirely sure of the steps needed. I can write fine to S3 buckets that are not 'remote' and managed by STS temporary credentials. I am using Flink 1.9.1, as this will live in EMR when deployed.

My flink-conf.yaml contains the following entries:

fs.s3a.bucket.sky-rdk-telemetry.aws.credentials.provider: org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
fs.s3a.bucket.sky-rdk-telemetry.assumed.role.credentials.provider: org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
fs.s3a.bucket.sky-rdk-telemetry.access-key: xxxxx
fs.s3a.bucket.sky-rdk-telemetry.secret-key: xxxx
fs.s3a.bucket.sky-rdk-telemetry.assumed.role.arn: xxxx
fs.s3a.bucket.sky-rdk-telemetry.assumed.role.session.name: xxxx

And my POM contains:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-bom</artifactId>
      <version>1.11.700</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-sts</artifactId>
  <version>1.11.700</version>
</dependency>

I have put the jar flink-s3-fs-hadoop-1.9.1.jar into the plugins directory. Running my test jar, I get class-not-found exceptions for org/apache/flink/fs/s3base/shaded/com/amazonaws/services/securitytoken/model/AWSSecurityTokenServiceException, and poking around I see this class is shaded into a package in the Kinesis connector. I have added some rules to the Maven Shade plugin to rewrite the package as needed, but this still doesn't help.

Am I heading in the correct direction? Searching has not turned up much information that I have been able to make use of.

Thanks for your time,

J
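For reference, a shade-plugin relocation matching the package seen in that exception would look roughly like the following. This is only a sketch: the pattern is inferred from the class-not-found message above, and (as the rest of the thread shows) this attempt alone did not resolve the problem.

```xml
<!-- Sketch: maven-shade-plugin relocation inferred from the exception above.
     The shadedPattern mirrors org/apache/flink/fs/s3base/shaded/... -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.amazonaws</pattern>
            <shadedPattern>org.apache.flink.fs.s3base.shaded.com.amazonaws</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```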
Hey, sorry for the late response. Can you provide the full exception(s), including the stacktrace, that you are seeing? On Mon, Apr 20, 2020 at 3:39 PM orionemail <[hidden email]> wrote:
Oh shoot, I replied privately. Let me forward the responses to the list including the solution.
---------- Forwarded message ---------
From: orionemail <[hidden email]>
Date: Thu, Apr 23, 2020 at 3:38 PM
Subject: Re: StreamingFileSink to a S3 Bucket on a remote account using STS
To: Arvid Heise <[hidden email]>

Hi,

Thanks for your assistance. With your pointers I have been able to confirm that I can write to a remote bucket managed with the temporary/STS credentials. It did indeed take a lot of fiddling to get working, and I am not quite sure what I did is safe for production. However, I'll briefly describe what I did here in case it helps anyone else.

1. As suggested, I moved flink-s3-fs-hadoop-1.9.1.jar into the lib directory rather than inside plugins.
2. I then removed the dependency on aws-java-sdk-sts from my pom.xml, along with the shading and class relocations I originally tried. This was not enough to get things working, so I decided it would be cleaner to modify the original aws-sdk pom.xml files instead.
3. I downloaded the source for the AWS Java SDK and modified the aws-sdk-core and aws-sdk-sts pom.xml files to apply the shading.
aws-sdk-sts/pom.xml:

<relocations>
  <relocation>
    <pattern>com.amazonaws</pattern>
    <shadedPattern>org.apache.flink.fs.s3base.shaded.com.amazonaws</shadedPattern>
    <includes>
      <include>com.amazonaws.**</include>
    </includes>
  </relocation>
</relocations>

aws-sdk-core/pom.xml:

<relocations>
  <relocation>
    <pattern>com</pattern>
    <shadedPattern>org.apache.flink.fs.s3base.shaded.com</shadedPattern>
  </relocation>
  <relocation>
    <pattern>org</pattern>
    <shadedPattern>org.apache.flink.fs.shaded.hadoop3.org</shadedPattern>
    <excludes>
      <exclude>org.apache.http.**</exclude>
    </excludes>
  </relocation>
  <relocation>
    <pattern>org.apache.http</pattern>
    <shadedPattern>org.apache.flink.fs.s3base.shaded.org.apache.http</shadedPattern>
  </relocation>
</relocations>

This was enough to get my test to work. Does this look sensible, or could I have done it in a simpler way?

Thanks again for your help.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, 22 April 2020 16:02, Arvid Heise <[hidden email]> wrote:
--
Arvid Heise | Senior Java Developer

Follow us @VervericaData

--
Join Flink Forward - The Apache Flink Conference
Stream Processing | Event Driven | Real Time

--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng
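Putting the thread together, the per-bucket configuration that ended up working looked roughly like the following. This is a sketch: the bucket name, keys, ARN, and session name are placeholders, the key names are taken from the first message, and per the solution above the flink-s3-fs-hadoop-1.9.1.jar sits in lib/ (not plugins/) alongside the rebuilt, relocated aws-sdk-core and aws-sdk-sts jars.

```yaml
# flink-conf.yaml (Flink 1.9.1): per-bucket assumed-role setup, as described in this thread.
# Replace <bucket> and the placeholder values with your own.
fs.s3a.bucket.<bucket>.aws.credentials.provider: org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
fs.s3a.bucket.<bucket>.assumed.role.credentials.provider: org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
fs.s3a.bucket.<bucket>.access-key: <long-lived access key>
fs.s3a.bucket.<bucket>.secret-key: <long-lived secret key>
fs.s3a.bucket.<bucket>.assumed.role.arn: <ARN of the role to assume>
fs.s3a.bucket.<bucket>.assumed.role.session.name: <session name>
```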