Access to S3 from YARN on EC2

Ashutosh Kumar
I have set up a 3-node YARN-based cluster on EC2. I am running Flink in cluster mode. I added these lines to core-site.xml:

<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>accesskey</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>secret key</value>
  </property>
  <property>
    <name>fs.s3n.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
</configuration>
I also added this line to conf/flink-conf.yaml:
   fs.hdfs.hadoopconf: /usr/local/hadoop/etc/hadoop


However, I get a class-not-found error when accessing S3 through s3n. I am using Flink 1.0.0.

Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:172)
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:696)
        at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1023)
        ... 25 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
        at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getHadoopWrapperClassNameForFileSystem(HadoopFileSystem.java:460)
        at org.apache.flink.core.fs.FileSystem.getHadoopWrapperClassNameForFileSystem(FileSystem.java:352)
        at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:280)
        at org.apache.flink.core.fs.Path.getFileSystem(Path.java:311)
        at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:450)
        at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:57)
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:156)
        ... 27 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
        ... 34 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
        ... 35 more
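
The innermost frames of the trace show Hadoop's Configuration.getClass failing to load the class named by fs.s3n.impl. A minimal JDK-only sketch of that lookup is below. It does not use Hadoop's own Configuration class; the class name S3nImplCheck and both helper methods are hypothetical, and it only tests the classpath of the JVM it runs in, which may differ from the one the Flink JobManager sees.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Flink or Hadoop.
class S3nImplCheck {

    /** Parses Hadoop-style property XML into a name -> value map.
     *  Assumes every <property> has exactly one <name> and one <value>. */
    static Map<String, String> parseProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    /** Returns true if the class can be loaded by the loader that loaded this helper. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, S3nImplCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration><property><name>fs.s3n.impl</name>"
                + "<value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>"
                + "</property></configuration>";
        String impl = parseProperties(xml).get("fs.s3n.impl");
        System.out.println(impl + (isLoadable(impl) ? " is loadable" : " is NOT on this classpath"));
    }
}
```

The point of the sketch is that a correct fs.s3n.impl entry is not enough on its own: the class it names must also be loadable by the process that reads the configuration.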


Thanks
Ashutosh

Re: Access to S3 from YARN on EC2

rmetzger0
Hi,

Did you check whether the "org.apache.hadoop.fs.s3native.NativeS3FileSystem" class is in the flink-dist.jar in the lib/ folder?
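
One way to run this check is sketched below in plain Java, using only the JDK's JarFile API. The class name JarClassCheck is hypothetical, and the default path lib/flink-dist.jar is a placeholder; the actual file name in a given Flink distribution may differ, so pass the real path as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Flink.
class JarClassCheck {

    /** Returns true if the jar has a .class entry for the fully qualified class name. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder default; the real jar name is an assumption to be replaced.
        String jarPath = args.length > 0 ? args[0] : "lib/flink-dist.jar";
        String cls = "org.apache.hadoop.fs.s3native.NativeS3FileSystem";
        if (!Files.isRegularFile(Paths.get(jarPath))) {
            System.out.println("No jar at " + jarPath + "; pass the jar path as the first argument.");
            return;
        }
        System.out.println(cls + (containsClass(jarPath, cls) ? " found in " : " not found in ") + jarPath);
    }
}
```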


On Sun, Mar 20, 2016 at 10:19 AM, Ashutosh Kumar <[hidden email]> wrote:
I have set up a 3-node YARN-based cluster on EC2. I am running Flink in cluster mode. I added these lines to core-site.xml


Re: Access to S3 from YARN on EC2

Ashutosh Kumar
It is not there.

Thanks
Ashutosh

On Sun, Mar 20, 2016 at 2:58 PM, Robert Metzger <[hidden email]> wrote:
Hi,

Did you check whether the "org.apache.hadoop.fs.s3native.NativeS3FileSystem" class is in the flink-dist.jar in the lib/ folder?



Re: Access to S3 from YARN on EC2

Ashutosh Kumar
Do I need to add some jars to lib/?

Thanks
Ashutosh

On Sun, Mar 20, 2016 at 4:30 PM, Ashutosh Kumar <[hidden email]> wrote:
It is not there.

Thanks
Ashutosh


Re: Access to S3 from YARN on EC2

Timothy Farkas
Hi Ashutosh,

I believe you need to add the hadoop-aws jar to your project.

http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.6.0


Thanks,
Tim

On Sun, Mar 20, 2016 at 9:39 AM, Ashutosh Kumar <[hidden email]> wrote:
Do I need to add some jars to lib/?

Thanks
Ashutosh


Re: Access to S3 from YARN on EC2

Ashutosh Kumar
Hi Tim,
I have this dependency in my pom file. The jar is included in my jar with dependencies. I exploded that jar and checked it: NativeS3FileSystem.class is present there.
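
A class can be present in the job jar and still be reported as not found if the lookup goes through a class loader that cannot see that jar. Whether that is what happened here is an assumption; the thread does not confirm it. The sketch below only illustrates the general JVM rule, using a marker file rather than a real class: a child loader sees what its parent sees, but not the other way around. The class name LoaderVisibilityDemo and the marker file name are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo of parent/child class loader visibility.
class LoaderVisibilityDemo {

    /** Returns {visibleFromParent, visibleFromChild} for a file only the child loader knows. */
    static boolean[] visibility() throws Exception {
        // Stand-in for a user jar: a temp directory holding one marker file.
        Path dir = Files.createTempDirectory("user-jar");
        String marker = "loader-visibility-marker.txt";
        Files.write(dir.resolve(marker), "only in the child".getBytes());

        ClassLoader parent = LoaderVisibilityDemo.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[] {dir.toUri().toURL()}, parent)) {
            return new boolean[] {
                parent.getResource(marker) != null, // parent cannot look into the child
                child.getResource(marker) != null   // child sees its own URLs
            };
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] v = visibility();
        System.out.println("visible from parent loader: " + v[0]);
        System.out.println("visible from child loader:  " + v[1]);
    }
}
```

If the same rule applies to the Hadoop configuration lookup in the trace, a class that exists only in the user jar would be invisible to it, which would be consistent with the error despite the class being packaged.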

Thanks
Ashutosh
 

On Mon, Mar 21, 2016 at 7:20 AM, Timothy Farkas <[hidden email]> wrote:
Hi Ashutosh,

I believe you need to add the hadoop-aws jar to your project.

http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.6.0


Thanks,
Tim


Re: Access to S3 from YARN on EC2

Balaji Rajagopalan
This kind of class-not-found exception can be a little misleading. Often the real problem is not that the class is missing, but that the libraries in use have a version compatibility mismatch, so you will have to go back and check for one. Are you using Scala, or is this a Java project?
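
When checking for a version mismatch, it can help to see which jar a class was actually loaded from. A minimal sketch using the JDK's ProtectionDomain and CodeSource API is below; the class name ClassOrigin is hypothetical, and the two classes printed in main are stand-ins for whichever Hadoop or Flink classes are under suspicion.

```java
import java.security.CodeSource;

// Hypothetical helper for locating where a loaded class came from.
class ClassOrigin {

    /** Describes the jar or directory a class was loaded from, if the JVM records one. */
    static String describe(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        String where = (src == null || src.getLocation() == null)
                ? "bootstrap/JDK (no code source)"
                : src.getLocation().toString();
        return cls.getName() + " <- " + where;
    }

    public static void main(String[] args) {
        // Stand-ins; replace with the classes you want to trace.
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOrigin.class));
    }
}
```

Two classes from the same library that report different jar locations or versions would be a hint that a mismatch is worth investigating.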

On Mon, Mar 21, 2016 at 10:26 AM, Ashutosh Kumar <[hidden email]> wrote:
Hi Tim,
I have this dependency in my pom file. The jar is included in my jar with dependencies. I exploded that jar and checked it: NativeS3FileSystem.class is present there.

Thanks
Ashutosh
 


Re: Access to S3 from YARN on EC2

Ashutosh Kumar

Adding the required jar files to flink/lib resolves the issue.

On Mar 21, 2016 12:57 PM, "Balaji Rajagopalan" <[hidden email]> wrote:
This kind of class-not-found exception can be a little misleading. Often the real problem is not that the class is missing, but that the libraries in use have a version compatibility mismatch, so you will have to go back and check for one. Are you using Scala, or is this a Java project?
