Automatically Clearing Temporary Directories

David Maddison
Hi,

When a TaskManager is restarted, it can leave behind unreferenced BlobServer cache directories in the temporary storage that never get cleaned up. Would it be safe to automatically clear the temporary storage every time a TaskManager starts?

(Note: the temporary volumes in use are dedicated to the TaskManager and not shared :-)

Thanks in advance,

David.

Re: Automatically Clearing Temporary Directories

Yang Wang
Hi David,

Currently, the TaskManager can clean up unreferenced files in the blob cache. The cleanup
interval can be configured via `blob.service.cleanup.interval` [1].
Also, when the TaskManager is shut down gracefully, the storage directory is deleted.
So are you stopping your TaskManager forcibly (i.e., `kill -9`)?



Best,
Yang

On Wed, Mar 11, 2020 at 1:39 AM David Maddison <[hidden email]> wrote:
Hi,

When a TaskManager is restarted it can leave behind unreferenced BlobServer cache directories in the temporary storage that never get cleaned up.  Would it be safe to automatically clear the temporary storage every time when a TaskManager is started?

(Note: the temporary volumes in use are dedicated to the TaskManager and not shared :-)

Thanks in advance,

David.

Re: Automatically Clearing Temporary Directories

Gary Yao-5
In reply to this post by David Maddison
Hi David,

> Would it be safe to automatically clear the temporary storage every time when a TaskManager is started?
> (Note: the temporary volumes in use are dedicated to the TaskManager and not shared :-)
Yes, it is safe in your case.

Best,
Gary


Re: Automatically Clearing Temporary Directories

David Maddison
Thanks for the responses and thanks Gary for the confirmation.

Just to give some background, we deploy Flink inside Kubernetes, so there is a chance that TaskManagers COULD be shut down in a non-graceful way, leaving cache artifacts on the temporary volumes.

With Gary's confirmation, we'll add an init container to make sure the volumes are cleared before a TM starts.
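
For illustration only, here is a minimal sketch of the kind of cleanup such an init container might run before the TaskManager starts. It is an assumption, not something posted in this thread: the function name `clear_directory` is hypothetical, and the script presumes the volume is dedicated to a single TaskManager, as described above. It deletes everything inside the directory but keeps the directory itself, since that is typically the volume's mount point.

```python
import shutil
from pathlib import Path


def clear_directory(root: Path) -> int:
    """Delete every entry inside `root`, keeping `root` itself.

    Returns the number of top-level entries removed. Only safe when
    nothing else is using the directory (assumption: a dedicated volume).
    """
    removed = 0
    if not root.is_dir():
        # Nothing to clean, e.g. the volume has never been used.
        return removed
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real directory (such as a leftover cache directory): remove recursively.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove the entry only, never its target.
            entry.unlink()
        removed += 1
    return removed
```

The container entrypoint could then call something like `clear_directory(Path(tmp_dir))`, where `tmp_dir` is whatever path the temporary volume is mounted at in your deployment.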

/David/

On Thu, Mar 12, 2020 at 8:24 AM Gary Yao <[hidden email]> wrote:
Hi David,

> Would it be safe to automatically clear the temporary storage every time when a TaskManager is started?
> (Note: the temporary volumes in use are dedicated to the TaskManager and not shared :-)
Yes, it is safe in your case.

Best,
Gary
