Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk-<id> directories

anaray
Hi,

I am using Flink 1.7.1. We store checkpoints in Ceph and use
flink-s3-fs-hadoop-1.7.1 to connect to it. Only 1 checkpoint is
retained. The issue I see is that previous/old chk-<id> directories are still
around. I verified that those older directories don't contain any checkpoint
data, but the directories keep accumulating.
Why aren't these old stale directories deleted as part of the checkpoint
deletion step? Please let me know.

I see:
metacheckpoints/00000000000000000000000000000000/chk-175
metacheckpoints/00000000000000000000000000000000/chk-176
metacheckpoints/00000000000000000000000000000000/chk-177  <- latest


Thanks,




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk-<id> directories

Fabian Hueske-2
Hi,

I found a few issues in Jira related to checkpoint directories not being deleted, but only FLINK-10855 [1] seems to be a possible cause in your case.
Is it possible that the checkpoints in the remaining directories failed?

If that's not the case, would you mind creating a Jira issue describing the bug?

Thank you,
Fabian


(In reply to the message from anaray <[hidden email]> of Thursday, 6 June 2019, 21:04.)

Re: Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk-<id> directories

anaray
Hi Fabian,

Thank you. Your observation is correct: the stale directories belong to the
failed checkpoints, so this is related to FLINK-10855. I will closely follow
FLINK-10855 and test when a fix is available.


Thank You,
anaray
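
For anyone hitting the same accumulation before a fix lands, the stale directories can at least be identified from a listing like the one in the first post. The following is only a rough sketch, not something from Flink itself: the function name stale_checkpoint_dirs and its retained parameter are hypothetical, and it assumes the highest chk-<id> values are the retained, completed checkpoints. That assumption may not hold if the most recent checkpoint is itself a failed one, so the result should be checked by hand before anything is deleted.

```python
import re
from typing import Iterable, List

# Matches a trailing checkpoint directory name such as ".../chk-177".
CHK_PATTERN = re.compile(r"(?:^|/)chk-(\d+)/?$")


def stale_checkpoint_dirs(paths: Iterable[str], retained: int = 1) -> List[str]:
    """Return chk-<id> paths older than the `retained` newest ones, oldest first.

    Hypothetical helper: it only inspects path names and never touches storage.
    """
    if retained < 0:
        raise ValueError("retained must be non-negative")

    numbered = []
    for path in paths:
        match = CHK_PATTERN.search(path)
        if match:  # ignore anything that is not a chk-<id> directory
            numbered.append((int(match.group(1)), path))

    numbered.sort()  # ascending by checkpoint id
    cutoff = max(len(numbered) - retained, 0)
    return [path for _, path in numbered[:cutoff]]


# The listing from the first post in this thread.
listing = [
    "metacheckpoints/00000000000000000000000000000000/chk-175",
    "metacheckpoints/00000000000000000000000000000000/chk-176",
    "metacheckpoints/00000000000000000000000000000000/chk-177",
]

for path in stale_checkpoint_dirs(listing):
    print(path)
```

With the listing above and one retained checkpoint, this reports chk-175 and chk-176 as candidates. The helper deliberately does no deletion; how the reported paths are removed from Ceph is left to whatever S3 tooling is already in use.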


