What are blobstore files and why do they keep filling up /tmp directory?


HarshithBolar

Hi all,

 

We're running Flink on a standalone five-node cluster. The /tmp/ directory keeps filling up with directories whose names start with blobstore--*. These directories are very large (approx. 1 GB) and fill up the disk very quickly, after which jobs fail with a "No space left on device" error. The files in these directories appear to be some form of binary representation of the jobs running on the cluster.

What are these files, and how do I clean them up so that they don't fill up /tmp/ and cause jobs to fail?
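
To get a picture of how much space these directories occupy, something like the following Python sketch could be used. It assumes the directories sit directly under /tmp and that their names start with "blobstore" in some capitalization (the exact prefix may vary); the function names are illustrative and not part of Flink.

```python
import os
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total


def blobstore_usage(tmp_dir: str = "/tmp", prefix: str = "blobstore"):
    """Return (directory, size in bytes) pairs, largest first.

    The prefix match is case-insensitive; "blobstore" is an assumption
    based on the directory names reported in this thread.
    """
    found = [
        (p, dir_size_bytes(p))
        for p in Path(tmp_dir).iterdir()
        if p.is_dir() and p.name.lower().startswith(prefix.lower())
    ]
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in blobstore_usage():
        print(f"{size / 1e6:10.1f} MB  {path}")
```

Run on each node, this only reports sizes and deletes nothing.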

Flink version: 1.4.2

 

Thanks,

Harshith

Re: What are blobstore files and why do they keep filling up /tmp directory?

Till Rohrmann
Hi Harshith,

The blob store files are needed to distribute the Flink job across your cluster. After the job has completed, they should be cleaned up; only if the cluster crashes does the cleanup not happen. Since Flink 1.4.2 is no longer actively supported, I would suggest upgrading to the latest Flink version and checking whether the problem still occurs.

Cheers,
Till

On Tue, Feb 26, 2019 at 2:48 AM Kumar Bolar, Harshith <[hidden email]> wrote:

Hi all,

 

We're running Flink on a standalone five-node cluster. The /tmp/ directory keeps filling up with directories whose names start with blobstore--*. These directories are very large (approx. 1 GB) and fill up the disk very quickly, after which jobs fail with a "No space left on device" error. The files in these directories appear to be some form of binary representation of the jobs running on the cluster.

What are these files, and how do I clean them up so that they don't fill up /tmp/ and cause jobs to fail?

Flink version: 1.4.2

 

Thanks,

Harshith

Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

HarshithBolar

Thanks Till,

 

It appears to occur when a task manager crashes and restarts: a new blob store directory gets created while the old one remains as is, and these pile up over time. Should these *old* blob stores be cleared manually every time a task manager crashes and restarts?

 

Regards,

Harshith

 

From: Till Rohrmann <[hidden email]>
Date: Tuesday, 26 February 2019 at 4:12 PM
To: Harshith Kumar Bolar <[hidden email]>
Cc: user <[hidden email]>
Subject: [External] Re: What are blobstore files and why do they keep filling up /tmp directory?

 

Hi Harshith,

 

The blob store files are needed to distribute the Flink job across your cluster. After the job has completed, they should be cleaned up; only if the cluster crashes does the cleanup not happen. Since Flink 1.4.2 is no longer actively supported, I would suggest upgrading to the latest Flink version and checking whether the problem still occurs.

 

Cheers,

Till

 

Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

Till Rohrmann
Yes, at the moment this cleanup does not happen automatically. When deleting the directories, you have to be careful not to delete the directory of a running TaskManager.
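
A manual cleanup along these lines could be sketched in Python as follows. This is only an illustration under stated assumptions: the set of directories used by running TaskManagers has to be supplied by the caller, the "blobstore" prefix is taken from the names reported in this thread, and `remove_stale_blobstores` is a hypothetical helper, not a Flink tool. It defaults to a dry run so that nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def remove_stale_blobstores(tmp_dir, active_dirs, prefix="blobstore", dry_run=True):
    """Delete blob store directories that are not listed in `active_dirs`.

    `active_dirs` must contain the directories of all running TaskManagers;
    how that list is obtained is left to the caller.
    With dry_run=True nothing is deleted and the candidates are only returned.
    """
    active = {Path(p).resolve() for p in active_dirs}
    removed = []
    for entry in sorted(Path(tmp_dir).iterdir()):
        if entry.is_symlink() or not entry.is_dir():
            continue
        if not entry.name.lower().startswith(prefix.lower()):
            continue
        if entry.resolve() in active:
            continue  # belongs to a running TaskManager, keep it
        if not dry_run:
            shutil.rmtree(entry)
        removed.append(entry)
    return removed
```

Calling it first with the default `dry_run=True` and reviewing the returned list before passing `dry_run=False` keeps the risk of removing a live directory low.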

Cheers,
Till

On Wed, Feb 27, 2019 at 6:29 PM Kumar Bolar, Harshith <[hidden email]> wrote:

Thanks Till,

 

It appears to occur when a task manager crashes and restarts: a new blob store directory gets created while the old one remains as is, and these pile up over time. Should these *old* blob stores be cleared manually every time a task manager crashes and restarts?

 

Regards,

Harshith

 

Re: Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

HarshithBolar

Is there any way to figure out which directory is being used by the running TaskManager? Would it be safe to assume that it is the most recently created directory?
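
The "latest directory" heuristic could be sketched in Python as below. Two assumptions apply: the directory names start with "blobstore" in some capitalization, and the modification time stands in for the creation time, since Python's os.stat does not generally expose a creation time on Linux. A directory's modification time changes whenever entries are added to or removed from it, so this is only an approximation, and it identifies a single directory even if several TaskManagers run on the same host.

```python
from pathlib import Path


def newest_blobstore(tmp_dir="/tmp", prefix="blobstore"):
    """Return the most recently modified blob store directory, or None."""
    candidates = [
        p for p in Path(tmp_dir).iterdir()
        if p.is_dir() and p.name.lower().startswith(prefix.lower())
    ]
    # st_mtime is the last modification time, used here as a stand-in
    # for the creation time.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```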

 

Regards,

Harshith

 

From: Till Rohrmann <[hidden email]>
Date: Thursday, 28 February 2019 at 3:28 PM
To: Harshith Kumar Bolar <[hidden email]>
Cc: user <[hidden email]>
Subject: [External] Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

 

Yes, at the moment this cleanup does not happen automatically. When deleting the directories, you have to be careful not to delete the directory of a running TaskManager.

 

Cheers,

Till

 

Re: Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

Till Rohrmann
Yes, this is one way. Another way would be to look into the logs of the running TaskManagers; they should contain the path of the blob store directory.
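
Looking through the logs could be automated roughly as in the Python sketch below. The exact wording of the log message is not given in this thread, so the sketch does not rely on it; it only assumes that the log mentions the directory as an absolute path whose last component starts with "blobstore-" in some capitalization. The function name and the idea of taking the last mention per file are illustrative choices.

```python
import re

# Assumption: the path appears in the log as /some/dir/blobStore-<id>,
# with any capitalization of "blobstore".
BLOB_DIR_RE = re.compile(r"(/(?:[^\s/]+/)*blobstore-[\w-]+)", re.IGNORECASE)


def current_blob_dirs(log_files):
    """Map each log file to the last blob store path it mentions.

    Log files without any matching path are left out of the result.
    """
    result = {}
    for log_file in log_files:
        last = None
        with open(log_file, errors="replace") as handle:
            for line in handle:
                matches = BLOB_DIR_RE.findall(line)
                if matches:
                    last = matches[-1]
        if last is not None:
            result[str(log_file)] = last
    return result
```

The last mention is used because a log file that spans a restart may contain more than one path; whether that applies depends on how the logs are rotated in a given setup.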

Cheers,
Till

On Thu, Feb 28, 2019 at 12:04 PM Kumar Bolar, Harshith <[hidden email]> wrote:

Is there any way to figure out which directory is being used by the running TaskManager? Would it be safe to assume that it is the most recently created directory?

 

Regards,

Harshith

 

Re: Re: Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

HarshithBolar

Thanks a lot. Looking into the logs sounds like a much cleaner approach :-)

 

From: Till Rohrmann <[hidden email]>
Date: Thursday, 28 February 2019 at 8:14 PM
To: Harshith Kumar Bolar <[hidden email]>
Cc: user <[hidden email]>
Subject: [External] Re: Re: Re: What are blobstore files and why do they keep filling up /tmp directory?

 

Yes, this is one way. Another way would be to look into the logs of the running TaskManagers; they should contain the path of the blob store directory.

 

Cheers,

Till

 
