Using S3 as state backend

Brian Chhun
Hello,

I'm trying to set up an HA cluster and I'm running into issues using S3 as the state backend. This exception is raised during startup:

2015-12-09T19:23:36.430724+00:00 i-1ec317c4 docker/jobmanager01-d3174d6[1207]: java.io.IOException: No file system found with scheme s3, referenced in file URI 's3:///flink/recovery/blob'.
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:242)
    at org.apache.flink.runtime.blob.FileSystemBlobStore.<init>(FileSystemBlobStore.java:67)
    at org.apache.flink.runtime.blob.BlobServer.<init>(BlobServer.java:105)
    at org.apache.flink.runtime.jobmanager.JobManager$.createJobManagerComponents(JobManager.scala:1814)
    at org.apache.flink.runtime.jobmanager.JobManager$.startJobManagerActors(JobManager.scala:1944)
    at org.apache.flink.runtime.jobmanager.JobManager$.startJobManagerActors(JobManager.scala:1898)
    at org.apache.flink.runtime.jobmanager.JobManager$.startActorSystemAndJobManagerActors(JobManager.scala:1584)
    at org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:1486)
    at org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1447)
    at org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)

Is it possible to use S3 as the backend store, or are only hdfs/maprfs supported?


Thanks,
Brian

Re: Using S3 as state backend

Ufuk Celebi
Hey Brian,

Did you follow the S3 setup guide? https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html

You have to set the fs.hdfs.hadoopconf property in your Flink configuration to the directory that contains your Hadoop configuration, and add

  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>

to the core-site.xml in that directory.
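
For readers who want to try this, here is a minimal sketch of those two changes as shell commands. DEMO_CONF_DIR and FLINK_CONF_FILE are hypothetical names, not anything Flink or Hadoop reads; both default to a scratch directory so nothing real is touched. Note that the first command overwrites any core-site.xml already in that directory, so merge the property by hand if you have an existing file.

```shell
# Hypothetical locations (assumptions for this sketch); they default to a scratch directory.
DEMO_CONF_DIR="${DEMO_CONF_DIR:-$(mktemp -d)}"
FLINK_CONF_FILE="${FLINK_CONF_FILE:-${DEMO_CONF_DIR}/flink-conf.yaml}"

# Write a core-site.xml that maps the s3 scheme to Hadoop's native S3 file system.
cat > "${DEMO_CONF_DIR}/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
</configuration>
EOF

# Point Flink at the directory that holds that core-site.xml.
echo "fs.hdfs.hadoopconf: ${DEMO_CONF_DIR}" >> "${FLINK_CONF_FILE}"
```

In a real setup you would point the two variables at your actual Hadoop configuration directory and your conf/flink-conf.yaml.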

– Ufuk

Re: Using S3 as state backend

Brian Chhun
Thanks Ufuk, this did the trick.

Thanks,
Brian

Re: Using S3 as state backend

Brian Chhun
For anyone else looking: I was able to use the s3a filesystem, which can use IAM role-based authentication as provided by the underlying AWS client library.

Thanks,
Brian

Re: Using S3 as state backend

Thomas Götzinger

Hi Brian

Can you give me a short summary of how to achieve this?

Re: Using S3 as state backend

Brian Chhun
Sure. Excuse me if anything is obvious or wrong; I know next to nothing about Hadoop.

1. Get the Hadoop 2.7 distribution (I set HADOOP_HOME to its path to make things easier for myself).
2. Set HADOOP_CLASSPATH to include ${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/tools/lib/* (you may not need all of those paths).
3. Put this into $HADOOP_HOME/etc/hadoop/core-site.xml:
   <configuration>
     <property>
       <name>fs.defaultFS</name>
       <value>s3a://YOUR-BUCKET</value>
     </property>
     <property>
       <name>fs.s3a.impl</name>
       <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
     </property>
   </configuration>
4. Put this into your flink-conf:
   fs.hdfs.hadoopconf: $HADOOP_HOME/etc/hadoop
   recovery.mode: zookeeper
   recovery.zookeeper.quorum: whatever01.local:2181
   recovery.zookeeper.path.root: /whatever
   state.backend: filesystem
   state.backend.fs.checkpointdir: s3a:///YOUR-BUCKET/checkpoints
   recovery.zookeeper.storageDir: s3a:///YOUR-BUCKET/recovery

That's all I had to do on the Flink side. On the AWS side, I had my IAM role set up with read/write access to the bucket.
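
Steps 1 and 2 can be sketched as shell commands, roughly as follows. The /opt/hadoop-2.7 default is only a placeholder for wherever the distribution was unpacked, and the final check is an addition for convenience: it reports whether a hadoop-aws jar, which is where the S3AFileSystem class lives in Hadoop 2.7, sits under tools/lib. The exports presumably need to be in the environment that starts the Flink processes.

```shell
# Placeholder install location (an assumption); adjust to where Hadoop 2.7 was unpacked.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-2.7}"
export HADOOP_HOME

# Step 2: put the Hadoop common jars and the tools jars on the classpath.
# The quotes keep the * wildcards literal so the JVM expands them, not the shell.
export HADOOP_CLASSPATH="${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/tools/lib/*"

# Convenience check: report whether a hadoop-aws jar is where step 2 expects it.
if ls "${HADOOP_HOME}"/share/hadoop/tools/lib/hadoop-aws-*.jar >/dev/null 2>&1; then
  echo "hadoop-aws jar found"
else
  echo "hadoop-aws jar not found under ${HADOOP_HOME}/share/hadoop/tools/lib" >&2
fi
```

If the check reports the jar as missing, the s3a scheme most likely cannot be resolved, so it is worth fixing the path before starting the JobManager.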

Thanks,
Brian

Re: Using S3 as state backend

Thomas Götzinger
Hi Brian,

Thanks, that helped me a lot.



--
Best regards,
Thomas Götzinger
Freelance computer scientist