Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Hao Sun
I am trying to figure out how to use S3 as state storage.

It seems I only have to do two things:
1. Put the flink-s3-fs-presto jar in the lib directory
2. Configure
s3.access-key: your-access-key
s3.secret-key: your-secret-key

But I see this exception: ClassNotFoundException: NativeS3FileSystem/S3AFileSystem Not Found


And it is suggested that I add more libs.
So I am confused here: is there a step 3 needed? Isn't the presto jar fully self-contained?

Thanks

Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Aljoscha Krettek
Hi,

what are you using as the FileSystem scheme? s3 or s3a?

Could you also post the full stack trace, please?

Best,
Aljoscha

On 2. Jun 2018, at 07:34, Hao Sun <[hidden email]> wrote:

I am trying to figure out how to use S3 as state storage.

It seems I only have to do two things:
1. Put the flink-s3-fs-presto jar in the lib directory
2. Configure
s3.access-key: your-access-key
s3.secret-key: your-secret-key

But I see this exception: ClassNotFoundException: NativeS3FileSystem/S3AFileSystem Not Found


And it is suggested that I add more libs.
So I am confused here: is there a step 3 needed? Isn't the presto jar fully self-contained?

Thanks

Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Hao Sun
Thanks for picking up my question. I had s3a in the config; I have now removed it.
I will post a full trace soon, but I would like to get some questions answered first to help me understand this better.

1. Can I use the presto lib with Flink 1.5 without bundled Hadoop? Can I use this? http://www.apache.org/dyn/closer.lua/flink/flink-1.5.0/flink-1.5.0-bin-scala_2.11.tgz
2. How do I configure presto for endpoints and encryption? The S3A file system needed core-site.xml to configure such things, as well as the S3 V4 signature. Do I have to do the same for presto?
3. If yes, how do I do it? Just add s3.xxx to the Flink config?
    like s3.server-side-encryption-algorithm: AES256
    s3.endpoint: 's3.amazonaws.com' (or other values for France regions, etc.)

I will post more logs when I get some. Thanks

On Tue, Jun 5, 2018 at 9:09 AM Aljoscha Krettek <[hidden email]> wrote:
Hi,

what are you using as the FileSystem scheme? s3 or s3a?

Could you also post the full stack trace, please?

Best,
Aljoscha


Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Aljoscha Krettek
Hi,

sorry, yes, you don't have to add any of the Hadoop dependencies. Everything that's needed comes in the presto s3 jar.

You should use "s3:" as the prefix; the Presto S3 filesystem will not be used if you use s3a. And yes, you add config values to the Flink config as s3.xxx.

Best,
Aljoscha
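
For reference, a minimal sketch of how the setup described above might look in flink-conf.yaml. The bucket name and checkpoint path are placeholders, and the `state.backend` and `state.checkpoints.dir` lines are assumptions about how state storage is configured; they are not taken from this thread.

```yaml
# Credentials, as in the original question
s3.access-key: your-access-key
s3.secret-key: your-secret-key

# Use the s3:// scheme (not s3a://) so that the Presto S3 filesystem is used.
# "my-bucket" and the path below are placeholders.
state.backend: filesystem
state.checkpoints.dir: s3://my-bucket/flink/checkpoints
```

No Hadoop jars appear anywhere in this sketch, which matches the point that the presto s3 jar carries everything it needs.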

On 5. Jun 2018, at 18:23, Hao Sun <[hidden email]> wrote:

Thanks for picking up my question. I had s3a in the config; I have now removed it.
I will post a full trace soon, but I would like to get some questions answered first to help me understand this better.

1. Can I use the presto lib with Flink 1.5 without bundled Hadoop? Can I use this? http://www.apache.org/dyn/closer.lua/flink/flink-1.5.0/flink-1.5.0-bin-scala_2.11.tgz
2. How do I configure presto for endpoints and encryption? The S3A file system needed core-site.xml to configure such things, as well as the S3 V4 signature. Do I have to do the same for presto?
3. If yes, how do I do it? Just add s3.xxx to the Flink config?
    like s3.server-side-encryption-algorithm: AES256
    s3.endpoint: 's3.amazonaws.com' (or other values for France regions, etc.)

I will post more logs when I get some. Thanks

Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Hao Sun
I do not have the S3A lib requirement anymore, but I got a new error.

org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied

Here are more logs:

Thanks

On Tue, Jun 5, 2018 at 9:39 AM Aljoscha Krettek <[hidden email]> wrote:
Hi,

sorry, yes, you don't have to add any of the Hadoop dependencies. Everything that's needed comes in the presto s3 jar.

You should use "s3:" as the prefix; the Presto S3 filesystem will not be used if you use s3a. And yes, you add config values to the Flink config as s3.xxx.

Best,
Aljoscha


Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Hao Sun
Also, a follow-up question: can I use all the properties listed here? Should I remove `hive.` from all the keys?
https://prestodb.io/docs/current/connector/hive.html#hive-configuration-properties

More specifically, how do I configure SSE for S3?

On Tue, Jun 5, 2018 at 11:33 AM Hao Sun <[hidden email]> wrote:
I do not have the S3A lib requirement anymore, but I got a new error.

org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied

Here are more logs:

Thanks

Re: Do I still need hadoop-aws libs when using Flink 1.5 and Presto?

Hao Sun
After I added these to my flink-conf.yaml, everything works.

s3.sse.enabled: true
s3.sse.type: S3

Thanks for the help!
In general, I would also like to know which presto-s3 config keys I can use.
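
A hedged sketch of the pattern that worked here: the two keys above correspond to Presto's `hive.s3.sse.enabled` and `hive.s3.sse.type` with the `hive.` prefix dropped. Assuming other `hive.s3.*` properties from the Presto page linked earlier follow the same pattern (this is not confirmed in the thread), an endpoint might be set as shown below. Each key should be checked against the documentation for the Flink version in use.

```yaml
# Confirmed to work in this thread
s3.sse.enabled: true
s3.sse.type: S3

# Assumption: hive.s3.endpoint maps to s3.endpoint in the same way.
# The value is a placeholder; other regions use other endpoints.
s3.endpoint: s3.amazonaws.com
```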


On Tue, Jun 5, 2018 at 11:43 AM Hao Sun <[hidden email]> wrote:
Also, a follow-up question: can I use all the properties listed here? Should I remove `hive.` from all the keys?
https://prestodb.io/docs/current/connector/hive.html#hive-configuration-properties

More specifically, how do I configure SSE for S3?
