Memory use reported on dashboard

Memory use reported on dashboard

Emmanuel
Hello,

I believe at some point I was told the JM doesn't need much memory, so I had jobmanager.heap.mb set to 128M.

Looking at the TaskManager metrics on the dashboard, I see my G1 young-generation GC is taking forever, which is usually a sign of low memory,

and I read:
Memory.heap.used
Current: 188M, Avg: 188M
Memory.flink.used
Current: 95M, Avg: 95M
Memory.non-heap.used
Current: 42M, Avg: 42M

So I'm surprised: the JM has 128M but the TM has 6GB of RAM
but the side report of the TM says:
Flink Managed Memory: 95 mb

So I'm confused here: is the TM using the jobmanager.heap.mb value for some reason? Why does the TM report 95MB when I allocated much more with taskmanager.heap.mb?

Thanks

Re: Memory use reported on dashboard

Fabian Hueske-2
Hi Emmanuel,

It is true that the JobManager does not need a lot of memory; 128M is close to the lower bound that I would recommend.

TaskManager memory configuration has a few more options. This is because the TaskManager sets a certain portion of its heap aside and manages this memory by itself (we call this portion managed memory). This works as follows. The parameter taskmanager.heap.mb specifies the total heap size of a TaskManager JVM process. The size of the managed memory can be defined in two ways:
- either using the parameter taskmanager.memory.fraction, which defines the size of the managed memory relative to the size of the heap (0.7 means 70% of the heap available after initialization is managed by the TM itself)
- or using the parameter taskmanager.memory.size, which defines the absolute size of the managed memory.

By default, the taskmanager.memory.fraction parameter is set to 0.7.
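
The two ways of sizing managed memory can be sketched roughly as follows. This is only an illustration, not Flink's actual code: the function name managed_memory_mb is hypothetical, the sketch treats the whole configured heap as free after initialization (in practice some heap is already in use by then, so the real value is somewhat lower), and it assumes that an explicitly configured absolute size takes precedence over the fraction.

```python
from typing import Optional


def managed_memory_mb(heap_mb: float,
                      fraction: float = 0.7,
                      size_mb: Optional[float] = None) -> float:
    """Rough estimate of TaskManager managed memory in MB (hypothetical helper).

    heap_mb  -- value of taskmanager.heap.mb
    fraction -- value of taskmanager.memory.fraction (default 0.7)
    size_mb  -- value of taskmanager.memory.size, if configured
    """
    if size_mb is not None:
        # Assumption: an absolute size, when given, overrides the fraction.
        return float(size_mb)
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    # Simplification: uses the full heap instead of the heap that is
    # actually free after initialization.
    return heap_mb * fraction


if __name__ == "__main__":
    # 6 GB heap with the default fraction: upper estimate of managed memory.
    print(managed_memory_mb(6144))
    # Same heap with an explicit absolute size of 2 GB.
    print(managed_memory_mb(6144, size_mb=2048))
```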

The metrics that the web dashboard reports relate to this configuration as follows:
- Memory.heap.used: the amount of used heap memory which is controlled by the JVM.
- Memory.flink.used: the amount of heap memory which is managed by the TM (managed memory). Managed memory is always considered as "used".
- Memory.non-heap.used: the amount of used non-heap memory of the TM JVM.

Looking at the numbers you mentioned, something does indeed look wrong. If you did not touch the taskmanager.memory.fraction or taskmanager.memory.size parameters, the managed memory should be about 70% of the unused heap after initialization. If you set taskmanager.heap.mb to 6144 (6GB), the managed memory should be about 4300MB. Adding managed and used heap memory (and assuming all heap memory is used), the total size of the TM heap is 188MB + 95MB = 283MB, which is far off 6GB.
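
The arithmetic behind this comparison can be written out as a small check. It uses only the numbers quoted in this thread and assumes the default fraction of 0.7; the variable names are illustrative. The last line is a back-of-the-envelope inversion (which free-heap size would yield 95MB of managed memory at that fraction) and is not a diagnosis of the cause.

```python
# Values taken from the thread.
configured_heap_mb = 6144      # taskmanager.heap.mb (6GB)
fraction = 0.7                 # assumed default taskmanager.memory.fraction
observed_heap_used_mb = 188    # Memory.heap.used on the dashboard
observed_managed_mb = 95       # Memory.flink.used on the dashboard

# Managed memory expected for a 6GB heap (upper estimate, roughly 4300 MB).
expected_managed_mb = configured_heap_mb * fraction

# Total heap implied by the dashboard if all heap memory were in use.
observed_total_mb = observed_heap_used_mb + observed_managed_mb

# Free heap at initialization that would produce 95MB of managed memory
# at a fraction of 0.7 (roughly 136 MB).
implied_free_heap_mb = observed_managed_mb / fraction

if __name__ == "__main__":
    print(f"expected managed memory: {expected_managed_mb:.0f} MB")
    print(f"observed total heap:     {observed_total_mb} MB")
    print(f"implied free heap:       {implied_free_heap_mb:.0f} MB")
```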

What kind of setup are you running (local, cluster, YARN)?
In a YARN setup, the TM heap size is automatically configured to the size of the TaskManager’s YARN container, minus a certain tolerance value.
In a local setup, only a single JVM process is started for both JM and TM, and the two heap sizes are added together for it.
Can you verify that taskmanager.heap.mb is correctly set?
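
One way to double-check the configured values is to parse the configuration file and print the memory-related keys. The sketch below is a minimal example under assumptions: it treats the file as flat "key: value" lines with optional # comments, the helper name parse_flat_conf is hypothetical, and the sample text merely mirrors the values discussed in this thread rather than an actual configuration.

```python
from typing import Dict


def parse_flat_conf(text: str) -> Dict[str, str]:
    """Parse flat 'key: value' lines, ignoring blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


# Sample content mirroring the values mentioned in the thread (illustrative).
sample = """
# memory settings
jobmanager.heap.mb: 128
taskmanager.heap.mb: 6144
taskmanager.memory.fraction: 0.7
"""

if __name__ == "__main__":
    conf = parse_flat_conf(sample)
    for key in ("jobmanager.heap.mb", "taskmanager.heap.mb",
                "taskmanager.memory.fraction", "taskmanager.memory.size"):
        print(key, "=", conf.get(key, "<not set>"))
```

To check a real installation, the text would be read from the configuration file on the machine that runs the TaskManager (for example with pathlib.Path(...).read_text(); the exact path depends on the installation and is not given in this thread).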

Thanks, Fabian


2015-09-11 19:31 GMT+02:00 Emmanuel <[hidden email]>: