Flink Job Manager & Task Manager heap size

Daniel Peled
Hi,

We have a Flink cluster with 1 JM and 7 TMs running about 20 jobs.
We have noticed that both the JM and the TMs are consuming a huge amount of memory (several GB) even though the jobs are idle, i.e. no records are passing through the pipelines.
Checkpoints are enabled and the interval between checkpoints is 10 seconds (but again, no records are coming in).

Attached are screenshots of the metrics for the JM and one of the TMs.

Is that normal?
Any tips for debugging this issue?

BR,
Danny


Attachments: flink-job-manager.png (161K), flink-task-manager.png (194K)
Re: Flink Job Manager & Task Manager heap size

Chesnay Schepler

Generally, I see two possible explanations:

a) There's a memory leak somewhere. It would be good to know how the baseline heap usage during idleness evolves over time. Are the same 20 jobs running continuously or are they (or others) periodically re-submitted?

b) The JVM simply hasn't needed to run a full garbage collection. That is not unreasonable when there is plenty of free heap: unreachable objects stay counted as used memory until a collection actually runs.
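One way to tell (a) from (b) is to compare heap usage before and after an explicitly requested collection. The following is only a minimal sketch using the standard `java.lang.management` API; the class name `HeapProbe` is made up for illustration, and the code measures the JVM it runs in, not a remote JobManager or TaskManager process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper: compares heap usage before and after a requested GC. */
final class HeapProbe {

    /** Heap bytes currently in use, as reported by the JVM's memory MXBean. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /**
     * Requests a collection and returns the used heap afterwards.
     * System.gc() is only a hint; the JVM may ignore it
     * (for example when started with -XX:+DisableExplicitGC).
     */
    static long usedHeapAfterGc() {
        System.gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long before = usedHeapBytes();
        long after = usedHeapAfterGc();
        System.out.printf("used before GC: %d MiB%n", before / mib);
        System.out.printf("used after GC:  %d MiB%n", after / mib);
        // A large drop suggests garbage that had not been collected yet (case b);
        // a small drop means most of the heap is still reachable.
        System.out.printf("difference:     %d MiB%n", (before - after) / mib);
    }
}
```

For an already running JM or TM, the same comparison can usually be made from outside the process with the JDK tools, e.g. `jcmd <pid> GC.run` followed by another look at the heap metrics, or `jmap -histo:live <pid>` to see which classes still occupy the heap after a collection.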

Overall, unless you run into OutOfMemoryErrors or the heap usage during idleness keeps rising steadily, I wouldn't worry about it too much at this time.
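To judge whether the idle baseline "keeps steadily rising", one option is to record the heap usage right after each collection at regular intervals and fit a straight line through the samples. The sketch below is an illustration under stated assumptions: the class `BaselineTrend`, the sample values, and the 5 MiB threshold are all invented for the example and are not taken from the cluster discussed here.

```java
/** Hypothetical helper: fits a least-squares line through equally spaced heap samples. */
final class BaselineTrend {

    /** Least-squares slope of the samples, in sample units per sampling interval. */
    static double slopePerSample(double[] samples) {
        int n = samples.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double meanX = (n - 1) / 2.0; // mean of the indices 0..n-1
        double meanY = 0;
        for (double s : samples) {
            meanY += s;
        }
        meanY /= n;

        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (samples[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    /** True if the fitted baseline grows faster than minSlope per sampling interval. */
    static boolean isSteadilyRising(double[] samples, double minSlope) {
        return slopePerSample(samples) > minSlope;
    }

    public static void main(String[] args) {
        // Made-up post-GC heap usage in MiB, one sample per interval.
        double[] stable = {812, 790, 805, 798, 801, 795};
        double[] growing = {800, 850, 910, 955, 1010, 1060};
        double threshold = 5.0; // MiB per interval, an arbitrary illustrative cutoff

        System.out.println("stable rising?  " + isSteadilyRising(stable, threshold));
        System.out.println("growing rising? " + isSteadilyRising(growing, threshold));
    }
}
```

Sampling only the post-GC low points matters here: raw heap usage follows a sawtooth, so a line fitted through arbitrary points mostly reflects where in the GC cycle each sample happened to fall.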

In reply to Daniel Peled's message of 1/27/2021, 8:12 AM.