S3 checkpointing exception

Vishwas Siravara
I am using an ECS S3 instance for checkpointing, with the following configuration:


s3.access-key: vdna_np_user
s3.endpoint: https://SU73ECSG******COM:9021
s3.secret-key: ******
I set the checkpoint location in the code like this:
 env.setStateBackend(new FsStateBackend("s3://vishwas.test1/checkpoints"))

I have a bucket called vishwas.test1. Should I first create a directory called checkpoints in S3?

I see this error in the log. What does it mean? Thank you so much for your help.

org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3://vishwas.test1/checkpoint/35abe5cadda5ff77fd3347a956b6f1e2: org.apache.flink.fs.s3base.shaded.com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader: Couldn't initialize a SAX driver to create an XMLReader
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2251)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2037)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:2007)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2326)
at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.mkdirs(HadoopFileSystem.java:170)
at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.mkdirs(SafetyNetWrapperFileSystem.java:112)
at org.apache.flink.runtime.state.filesystem.FsCheckpointStorage.<init>(FsCheckpointStorage.java:83)
at org.apache.flink.runtime.state.filesystem.FsCheckpointStorage.<init>(FsCheckpointStorage.java:58)
at org.apache.flink.runtime.state.filesystem.FsStateBackend.createCheckpointStorage(FsStateBackend.java:444)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:257)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
at java.lang.Thread.run(Thread.java:748)

Re: S3 checkpointing exception

Vishwas Siravara
I found the solution to this problem: it was a dependency issue. I had to exclude "xml-apis" from my dependencies to fix it. Also, the s3-presto jar provides better error messages, which was helpful.
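
For anyone hitting the same "Couldn't initialize a SAX driver" message, here is a rough sketch of how one might check which jar the SAX classes come from. It is only a diagnostic idea, not something from the Flink docs: the class name SaxCheck and the describe helper are made up for illustration. It also assumes you run it with the same classpath as the job (for example, by calling it from the job's own main method).

```java
import java.security.CodeSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical helper (not part of Flink): reports where the SAX classes are loaded from.
final class SaxCheck {

    /** Returns "<class name> loaded from <jar or directory>". */
    static String describe(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK have no code source location.
        String origin = (src == null || src.getLocation() == null)
                ? "the JDK itself"
                : src.getLocation().toString();
        return cls.getName() + " loaded from " + origin;
    }

    public static void main(String[] args) {
        // Where does the SAX API class itself come from?
        System.out.println(describe(XMLReaderFactory.class));
        try {
            // Ask the factory for a reader and report which implementation was picked.
            XMLReader reader = XMLReaderFactory.createXMLReader();
            System.out.println("OK: " + describe(reader.getClass()));
        } catch (Exception | LinkageError e) {
            System.out.println("FAILED: " + e);
        }
    }
}
```

If the first line names a jar such as xml-apis rather than the JDK, that jar is the likely candidate to exclude in the build.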

Thanks,
Vishwas 

On Thu, Jul 18, 2019 at 8:14 PM Vishwas Siravara <[hidden email]> wrote: