Cannot download Jars from S3 due to resource timestamp changed

Yan Yan
Hi, 

I am running into issues when trying to move from HDFS to S3 using Flink 1.6. 

I am getting an exception from Hadoop code: 
throw new IOException("Resource " + sCopy +
    " changed on src filesystem (expected " + resource.getTimestamp() +
    ", was " + sStat.getModificationTime());
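
For readers following along, the comparison behind this message can be modeled in a few lines of plain Java. This is only a simplified sketch and not the Hadoop source: the class name, method name, path, and timestamp values are made up for illustration. It assumes the check compares the timestamp recorded when the resource was registered with the modification time the source filesystem reports at download time.

```java
import java.io.IOException;

/** Simplified model of the timestamp check; hypothetical names, not the Hadoop source. */
public class TimestampCheck {

    /**
     * Throws if the modification time reported by the source filesystem
     * differs from the timestamp recorded when the resource was registered.
     */
    public static void verify(String path, long expectedTimestamp, long actualModificationTime)
            throws IOException {
        if (actualModificationTime != expectedTimestamp) {
            throw new IOException("Resource " + path
                    + " changed on src filesystem (expected " + expectedTimestamp
                    + ", was " + actualModificationTime);
        }
    }

    public static void main(String[] args) {
        long registered = 1554412920000L;      // recorded at registration (made-up value)
        long reportedByStore = 1554412925000L; // assigned by the store on upload (made-up value)
        try {
            verify("s3a://bucket/job.jar", registered, reportedByStore);
        } catch (IOException e) {
            // The two values differ, so the model raises the same kind of error.
            System.out.println(e.getMessage());
        }
    }
}
```

Any difference between the two values, even a few milliseconds, is enough to fail the download in this model.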

Digging into this, I found a commit by Nico from 2018 that tried to fix this issue. However, the fix does not work in my case, because fs.setTimes() is not implemented in the hadoop-aws S3AFileSystem I am using, and S3 does not seem to allow overriding the last-modified time of an object.

I was able to work around it the other way round: reading the timestamp from S3 and overriding the one on the local resource. I just wonder whether anyone has seen similar issues, or has actually made it work with a different implementation of S3AFileSystem. Thanks!
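
A toy model of the two directions may help make the workaround concrete. This is a sketch under stated assumptions, not the actual patch: FakeObjectStore and both register methods are hypothetical, and the store is assumed to stamp objects with its own clock and to ignore setTimes requests.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the workaround; every name here is hypothetical. */
public class TimestampWorkaround {

    /** Stand-in for an object store that assigns its own modification time. */
    public static class FakeObjectStore {
        private final Map<String, Long> modificationTimes = new HashMap<>();
        private long clock;

        public FakeObjectStore(long startTime) {
            this.clock = startTime;
        }

        /** Uploading stamps the object with the store's clock, not the local file time. */
        public void upload(String key) {
            modificationTimes.put(key, clock++);
        }

        /** Models a filesystem that does not apply setTimes: the request is ignored. */
        public void setTimes(String key, long mtime) {
            // intentionally a no-op
        }

        public long getModificationTime(String key) {
            return modificationTimes.get(key);
        }
    }

    /** Direction that breaks: push the local timestamp to the store and register it. */
    public static long registerWithLocalTime(FakeObjectStore store, String key, long localMtime) {
        store.upload(key);
        store.setTimes(key, localMtime); // has no effect on this store
        return localMtime;
    }

    /** Workaround: after uploading, read the store's timestamp and register that instead. */
    public static long registerWithRemoteTime(FakeObjectStore store, String key) {
        store.upload(key);
        return store.getModificationTime(key);
    }

    public static void main(String[] args) {
        FakeObjectStore store = new FakeObjectStore(2_000L);
        long local = registerWithLocalTime(store, "a.jar", 1_000L);
        System.out.println(local == store.getModificationTime("a.jar"));   // false
        long remote = registerWithRemoteTime(store, "b.jar");
        System.out.println(remote == store.getModificationTime("b.jar")); // true
    }
}
```

Registering the value the store reports keeps the later comparison consistent without needing the store to accept a timestamp from the client.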

--
Best,
Yan
Re: Cannot download Jars from S3 due to resource timestamp changed

yangtao.yt
Hi, Yan.
We ran into this problem too when using aliyun-pangu and commented on FLINK-8801, but have had no response yet. 
I think most file systems, including s3/s3a/s3n/azure/aliyun-oss, can hit this problem, since they don't implement FileSystem#setTimes, but the PR in FLINK-8801 assumes they do.
We have made a similar workaround for this problem.
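
One way to find out whether a given file system applies a requested modification time is a write-then-read-back probe. The sketch below uses a local temp file through java.nio.file purely as a stand-in; the class and method names are hypothetical. Against a Hadoop file system the equivalent calls would presumably be FileSystem#setTimes followed by reading getModificationTime() from the file status.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/** Probe sketch: does a written modification time read back unchanged? Hypothetical names. */
public class SetTimesProbe {

    /** Returns true if the modification time set on the file is what is read back. */
    public static boolean honorsSetTimes(Path file, long mtimeMillis) throws IOException {
        Files.setLastModifiedTime(file, FileTime.fromMillis(mtimeMillis));
        return Files.getLastModifiedTime(file).toMillis() == mtimeMillis;
    }

    public static void main(String[] args) throws IOException {
        // Local temp file used only as a stand-in for a remote resource.
        Path tmp = Files.createTempFile("probe", ".jar");
        try {
            // A whole-second value avoids false negatives from coarse timestamp granularity.
            System.out.println(honorsSetTimes(tmp, 1_554_412_920_000L));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A file system that ignores the request would return false here, which is the case where the read-from-remote workaround is needed.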


Best, 
Tao Yang

On April 5, 2019, at 5:22 AM, Yan Yan <[hidden email]> wrote:

Re: Cannot download Jars from S3 due to resource timestamp changed

Yan Yan
Hi Yantao,


@Nico @Till Would you mind reviewing whether an alternative fix is needed? If so, I can create a new JIRA.

Thanks,
Yan

On Thu, Apr 4, 2019 at 5:45 PM yangtao.yt <[hidden email]> wrote:

--
Best,
Yan
Re: Cannot download Jars from S3 due to resource timestamp changed

Till Rohrmann
Hi Yan and Tao Yang,

Thanks for raising this issue. Let's continue the discussion on the ticket in order to figure out a proper solution.

Cheers,
Till

On Fri, Apr 5, 2019 at 11:23 PM Yan Yan <[hidden email]> wrote: