Empty directories left over from checkpointing

Hao Sun
Hi, I am using RocksDB and S3 as the storage backend for my checkpoints.
Can Flink delete these empty directories automatically, or do I need a background job to do the deletion?

I know this has been discussed before, but I have not been able to get a concrete answer yet. Thanks!
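
For context, a minimal sketch of the kind of setup in question, RocksDB as the state backend checkpointing to S3 (the class name, bucket, and path are illustrative placeholders, not the actual job):

    // Minimal sketch: RocksDB state backend writing checkpoints to S3.
    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointToS3 {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // RocksDB keeps working state on local disk and uploads
            // checkpoint data to the given URI (placeholder bucket/path).
            env.setStateBackend(new RocksDBStateBackend("s3://my-bucket/flink-checkpoints"));

            // Each periodic checkpoint creates a chk-<n> directory under
            // that URI; these are the directories that can be left behind.
            env.enableCheckpointing(60_000);
        }
    }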

Re: Empty directories left over from checkpointing

Elias Levy

On Tue, Sep 19, 2017 at 12:20 PM, Hao Sun <[hidden email]> wrote:

Hi, I am using RocksDB and S3 as the storage backend for my checkpoints. Can Flink delete these empty directories automatically, or do I need a background job to do the deletion?

Re: Empty directories left over from checkpointing

Hao Sun
Thanks Elias! It seems there is no better answer than "do not care about them for now" or deleting them with a background job.

Re: Empty directories left over from checkpointing

Stefan Richter
Hi,

We recently removed some cleanup code because it relied on querying store metadata to determine when a directory could be deleted. For certain stores (like S3), requesting this metadata on every file delete was so expensive that it could bring down the job, because state removal could not be processed fast enough. We now have a temporary fix in place so that jobs at large scale can still run reliably on stores like S3. Currently this comes at the cost of not cleaning up directories, but we are clearly planning to introduce a different mechanism for directory cleanup in the future, one that is not as fine-grained as a metadata query per file delete. In the meantime, unfortunately, the best option is to clean up empty directories with some external tool.

Best,
Stefan
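
For reference, a rough sketch of what such an external cleanup tool could look like (an assumption, not an official tool), using the AWS SDK for Java. It deletes the zero-byte "directory marker" objects that Hadoop's s3a (keys ending in "/") and s3n (keys ending in "_$folder$") leave behind. The bucket and prefix are placeholders, and it should only be run against checkpoint paths that are no longer in use:

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    public class EmptyCheckpointDirCleanup {
        public static void main(String[] args) {
            final String bucket = "my-bucket";          // placeholder
            final String prefix = "flink-checkpoints/"; // placeholder

            AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
            ObjectListing listing = s3.listObjects(bucket, prefix);

            while (true) {
                for (S3ObjectSummary summary : listing.getObjectSummaries()) {
                    String key = summary.getKey();
                    // Zero-byte objects with these key shapes are the
                    // "empty directory" markers, not real checkpoint data.
                    boolean isDirMarker = summary.getSize() == 0
                            && (key.endsWith("/") || key.endsWith("_$folder$"));
                    if (isDirMarker) {
                        s3.deleteObject(bucket, key);
                        System.out.println("Deleted directory marker: " + key);
                    }
                }
                if (!listing.isTruncated()) {
                    break; // all pages of the listing processed
                }
                listing = s3.listNextBatchOfObjects(listing);
            }
        }
    }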

Re: Empty directories left over from checkpointing

Stephan Ewen
Some updates on this:

Aside from reworking how the S3 directory handling is done, we also looked into supporting S3 differently than we currently do. Support currently goes strictly through Hadoop's S3 file systems, which we need to change, because we want it to be possible to use Flink without Hadoop dependencies.

In the next release, we will have S3 file systems without a Hadoop dependency:

  - One implementation wraps and shades a newer version of s3a, for compatibility with the current behavior.

  - The second is interesting for this directory problem: it uses Presto's S3 support, which is a bit different from Hadoop's s3n and s3a. It does not create empty directory marker files; it does not try to make S3 look as much like a file system as s3a and s3n do, which is actually an advantage for checkpointing. With that implementation, the issue mentioned here should not exist.

Caveat: The new file systems and their aggressive shading still need to be tested at scale, but we are happy to take any feedback on this.


You can use them by simply dropping the respective JARs from "/opt" into "/lib" and using the file system scheme "s3://".
The configuration is the same as in Hadoop/Presto, but you can put the config keys into the Flink configuration; they will be forwarded to the Hadoop configuration.
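
As an illustration, a sketch assuming 1.4-era artifact names and configuration keys (the JAR name, bucket, and credential values are all placeholders):

    # Make the Presto-based S3 file system available to Flink,
    # matching the artifact name to your release:
    #   cp opt/flink-s3-fs-presto-<version>.jar lib/

    # flink-conf.yaml: keys under "s3." are forwarded to the
    # underlying Hadoop/Presto configuration.
    s3.access-key: YOUR_ACCESS_KEY
    s3.secret-key: YOUR_SECRET_KEY

    # Checkpoints can then use the plain "s3://" scheme.
    state.backend: rocksdb
    state.backend.fs.checkpointdir: s3://my-bucket/flink-checkpoints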

Hope this makes using S3 a lot easier and more fun...


Re: Empty directories left over from checkpointing

Elias Levy
Stephan,

Thanks for taking care of this. We'll give it a try once 1.4 drops.
