Flink does not cleanup some disk memory after submitting jar over rest

Great Info

I have deployed my own Flink setup on AWS ECS, with one service for the JobManager and one service for the TaskManagers. I am running one ECS task for the JobManager and three ECS tasks for the TaskManagers.

I have a batch-style job that I upload through the Flink REST API every day with new arguments. Each time I submit it, disk usage increases by about 600 MB. Checkpoints are stored in S3, and I have also set historyserver.archive.clean-expired-jobs to true.

Since I am running on ECS, I have not been able to find out why disk usage increases on every jar upload and execution.

Which Flink config parameters should I look at to keep disk usage from growing?
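
One way to narrow this down is to measure a few candidate directories inside the JobManager and TaskManager containers before and after a submission and see which one grows. Below is a minimal Python sketch of that idea. The helper names (`dir_size_bytes`, `snapshot`, `growth`) are made up for illustration, and which directories to pass in is an assumption: likely candidates are whatever `web.upload.dir`, `blob.storage.directory`, and `io.tmp.dirs` resolve to in your setup.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under `path`, skipping files that vanish mid-walk."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file was removed between listing and stat; ignore it.
                pass
    return total


def snapshot(paths):
    """Map each directory in `paths` to its current size in bytes."""
    return {p: dir_size_bytes(p) for p in paths}


def growth(before, after):
    """Per-directory growth in bytes between two snapshots, largest first."""
    deltas = {p: after[p] - before.get(p, 0) for p in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
```

For example, take `before = snapshot(dirs)` ahead of the daily submission and `after = snapshot(dirs)` once the job has finished, then print `growth(before, after)`. The directory at the top of the list is the one to look at first.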

Re: Flink does not cleanup some disk memory after submitting jar over rest

Maciek Próchniak

Hi,

I don't know if this is the problem you're facing, but some time ago we ran into two issues connected to the REST API that increased disk usage after each submission:

https://issues.apache.org/jira/browse/FLINK-21164

https://issues.apache.org/jira/browse/FLINK-9844

Both are closed at the moment, but only 1.12.2 contains the fixes.
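
Independently of those fixes, jars uploaded through the REST API stay, as far as I know, in the JobManager's upload directory until they are deleted, so it may be worth checking whether old uploads are piling up. A rough Python sketch using the `GET /jars` and `DELETE /jars/:jarid` endpoints follows. The base URL, the helper names, and the keep-the-newest policy are assumptions for illustration, and this may well not be the cause of the ~600 MB growth.

```python
import json
import urllib.request


def list_jars(base_url):
    """Call GET /jars and return the list of uploaded jar entries."""
    with urllib.request.urlopen(f"{base_url}/jars") as resp:
        return json.load(resp)["files"]


def stale_jar_ids(files, keep=1):
    """Return the ids of all but the `keep` most recently uploaded jars."""
    newest_first = sorted(files, key=lambda f: f["uploaded"], reverse=True)
    return [f["id"] for f in newest_first[keep:]]


def delete_jar(base_url, jar_id):
    """Call DELETE /jars/:jarid and return the HTTP status code."""
    req = urllib.request.Request(f"{base_url}/jars/{jar_id}", method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

With a hypothetical JobManager address such as `http://jobmanager:8081`, a cleanup pass would loop over `stale_jar_ids(list_jars(base_url))` and call `delete_jar(base_url, jar_id)` for each id, ideally only after the previous day's job has finished.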


maciek


On 08.04.2021 19:52, Great Info wrote:

Re: Flink does not cleanup some disk memory after submitting jar over rest

Till Rohrmann

Hi,

You could also create several heap dumps [1], one each time you submit a new job. That would let us analyze whether something is increasing the heap memory consumption. Additionally, you could try upgrading your cluster to Flink 1.12.2, since it contains fixes for the problems Maciek mentioned.
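
The link behind [1] is not preserved here. One common way to take a heap dump of a running JVM is the JDK's `jmap` tool, and the sketch below wraps it in Python so that a dump can be taken after each submission. It assumes `jmap` is available inside the container and that you know the PID of the JobManager or TaskManager process. The helper names and the file naming scheme are made up for illustration.

```python
import subprocess
import time


def dump_path(pid, out_dir, label):
    """Build the target file name for a heap dump (naming scheme is an assumption)."""
    return f"{out_dir}/{label}-{pid}.hprof"


def jmap_command(pid, out_dir, label):
    """Build the jmap invocation that dumps live objects in binary (hprof) format."""
    target = dump_path(pid, out_dir, label)
    return ["jmap", f"-dump:live,format=b,file={target}", str(pid)]


def take_heap_dump(pid, out_dir, label=None):
    """Run jmap against `pid` and return the path of the dump file."""
    label = label or time.strftime("%Y%m%dT%H%M%S")
    subprocess.run(jmap_command(pid, out_dir, label), check=True)
    return dump_path(pid, out_dir, label)
```

Heap dumps can be large, so it is probably best to write them to a volume with enough free space and to remove them after analysis. Otherwise they add to the very disk usage being investigated.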


Cheers,
Till

On Thu, Apr 8, 2021 at 9:15 PM Maciek Próchniak <[hidden email]> wrote:
