StreamingFileSink cannot get AWS S3 credentials

14 messages

Taher Koitawala
Hi All,
We have implemented the S3 sink in the following way:

StreamingFileSink<GenericRecord> sink = StreamingFileSink
        .forBulkFormat(new Path("s3a://mybucket/myfolder/output/"), ParquetAvroWriters.forGenericRecord(schema))
        .withBucketCheckInterval(50L)
        .withBucketAssigner(new CustomBucketAssigner()).build();

The problem we are facing is that StreamingFileSink initializes the S3AFileSystem class to write to S3 but is not able to find the S3 credentials to write data. However, other Flink applications on the same cluster that use "s3://" paths are able to write to the same S3 bucket and folders; we are facing this issue only with StreamingFileSink.

Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
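
For context, CustomBucketAssigner is the poster's own class and its body never appears in the thread. A minimal sketch of what such an assigner might look like against the Flink 1.6/1.7-era BucketAssigner interface (the "eventDate" field is a hypothetical example):

import org.apache.avro.generic.GenericRecord;
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

// Hypothetical sketch: buckets records by an assumed "eventDate" field in the Avro schema.
public class CustomBucketAssigner implements BucketAssigner<GenericRecord, String> {

    @Override
    public String getBucketId(GenericRecord element, Context context) {
        // The returned bucket id becomes a sub-directory under the sink's base path.
        return String.valueOf(element.get("eventDate"));
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        // Bucket ids are checkpointed along with in-progress files, so they need a versioned serializer.
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}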

Re: StreamingFileSink cannot get AWS S3 credentials

Vinay Patil
Hi,

Can someone please help with this issue? We have even tried setting fs.s3a.impl in core-site.xml, but it still isn't working.

Regards,
Vinay Patil


Re: StreamingFileSink cannot get AWS S3 credentials

Dawid Wysakowicz-2

Hi,

I'm cc'ing Kostas, who should be able to help you.

Best,

Dawid


Re: StreamingFileSink cannot get AWS S3 credentials

Dawid Wysakowicz-2

Forgot to cc ;)


Re: StreamingFileSink cannot get AWS S3 credentials

Kostas Kloudas-3
Hi Taher,

So you are using the same configuration files and everything, and the only thing you changed is "s3://" to "s3a://", and the sink cannot find the credentials?
Could you please provide the logs of the Task Managers?

Cheers,
Kostas


Re: StreamingFileSink cannot get AWS S3 credentials

Till Rohrmann
Hi Vinay,

Flink's file systems are self-contained and won't respect core-site.xml, if I'm not mistaken. Instead you have to set the credentials in the Flink configuration file flink-conf.yaml via `fs.s3a.access.key: access_key`, `fs.s3a.secret.key: secret_key`, and so on [1]. Have you tried this out?

This has been fixed with Flink 1.6.2 and 1.7.0 [2].


Cheers,
Till
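
As a concrete sketch, the setup Till describes would be two entries in flink-conf.yaml (the key names come from his message; the values here are placeholders):

fs.s3a.access.key: <your-access-key-id>
fs.s3a.secret.key: <your-secret-access-key>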


Re: StreamingFileSink cannot get AWS S3 credentials

Kostas Kloudas-3
Actually Till is right.

Sorry, my fault, I did not read the second email, where Vinay mentions core-site.xml.

Cheers,
Kostas


Re: StreamingFileSink cannot get AWS S3 credentials

Vinay Patil
Hi Till,

We are not providing `fs.s3a.access.key: access_key`, `fs.s3a.secret.key: secret_key` in flink-conf.yaml, as we are using a profile-based credentials provider. The older BucketingSink code is able to get the credentials and write to S3; we are facing this issue only with StreamingFileSink. We tried adding fs.s3a.impl to core-site.xml when the default configuration was not working.

Regards,
Vinay Patil



Re: StreamingFileSink cannot get AWS S3 credentials

Till Rohrmann
The old BucketingSink was using Hadoop's S3 filesystem directly, whereas the new StreamingFileSink uses Flink's own FileSystem abstraction, which needs to be configured via flink-conf.yaml.

Cheers,
Till


Re: StreamingFileSink cannot get AWS S3 credentials

Vinay Patil
Hi Till,

Can you please let us know the configuration that we need to set in flink-conf.yaml for the profile-based credentials provider?

Exporting the AWS_PROFILE environment variable on EMR did not work.

Regards,
Vinay Patil



Re: StreamingFileSink cannot get AWS S3 credentials

Till Rohrmann
I haven't configured this myself, but I would guess that you need to set the parameters defined under "S3A Authentication methods" [1]. If the environment variables don't work, then I would try to set the authentication properties.


Cheers,
Till


RE: EXT :Re: StreamingFileSink cannot get AWS S3 credentials

Martin, Nick-2

Does that mean that the full set of fs.s3a.<…> configs in core-site.xml will be picked up from flink-conf.yaml by Flink? Or only certain configs involved with authentication?

Re: EXT :Re: StreamingFileSink cannot get AWS S3 credentials

Stephan Ewen
Regarding configurations: According to the code [1], all config keys starting with "s3", "s3a", and "fs.s3a" are forwarded from flink-conf.yaml to the Hadoop file systems.

Regarding profile-based authentication: Have you tried to set the credentials provider explicitly, by setting "fs.s3a.aws.credentials.provider"?
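
Concretely, that is a single entry in flink-conf.yaml. The class below is the AWS SDK v1 instance-profile provider; the exact class name is an assumption here, though it matches the provider the next reply reports trying:

fs.s3a.aws.credentials.provider: com.amazonaws.auth.InstanceProfileCredentialsProvider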

Re: EXT :Re: StreamingFileSink cannot get AWS S3 credentials

Vinay Patil
Hi Stephan,

Yes, we tried setting fs.s3a.aws.credentials.provider, but we are getting a ClassNotFoundException for InstanceProfileCredentialsProvider because of a shading issue.


Regards,
Vinay Patil

