How to estimate the memory size of flink state


How to estimate the memory size of flink state

liujiangang
      We are using Flink 1.6.2 with the filesystem state backend, and we want to monitor the size of the state held in memory. If the state grows too large, we want to be notified so that we can take measures such as rescaling the job; otherwise the job may fail because it runs out of memory.
      We have tried JVM-level memory indicators such as GC throughput. In our case, however, the state size can vary greatly at peak load, so I would like to look at the state's memory size directly.
      I checked the metrics and the code but did not find anything that reports the state's memory size. I can get the checkpoint size, but that is the size of the serialized result, which does not reflect the size of the running state in memory. Can anyone give me some suggestions? Thank you very much.

Re: How to estimate the memory size of flink state

liujiangang
      Thank you. Your suggestion is good and it helped me a lot. In my case, I want to know the state memory size for another reason.
      When GC pressure is high, I need to either limit the source or discard some data from the source to keep the job running. If the state is large, I need to discard data; if the state is not large, I need to limit the source. The state size reflects the resident memory. With event time, discarding data can reduce memory usage.
      Could you please give me some suggestions?
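
      To make the rule above concrete, here is a rough sketch of the decision in plain Java. It is only an illustration: the class name, both thresholds, and the `estimatedStateBytes` input are hypothetical, and Flink 1.6 does not provide the state size estimate, which is exactly the missing piece. GC pressure is approximated as the fraction of wall-clock time spent in garbage collection, using the standard `GarbageCollectorMXBean` counters.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class BackpressurePolicy {
    enum Action { NONE, LIMIT_SOURCE, DISCARD_DATA }

    // Hypothetical thresholds; they would need tuning per job.
    static final double GC_PRESSURE_THRESHOLD = 0.2;          // 20% of time in GC
    static final long STATE_SIZE_THRESHOLD_BYTES = 1L << 30;  // 1 GiB

    // Cumulative GC time in milliseconds across all collectors of this JVM.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Fraction of an interval spent in GC, from two samples of totalGcTimeMillis().
    static double gcPressure(long gcMillisDelta, long wallMillisDelta) {
        if (wallMillisDelta <= 0) {
            return 0.0;
        }
        return (double) gcMillisDelta / wallMillisDelta;
    }

    // Low GC pressure: do nothing. High pressure with large state: discard data.
    // High pressure with small state: limit the source.
    static Action decide(double gcPressure, long estimatedStateBytes) {
        if (gcPressure < GC_PRESSURE_THRESHOLD) {
            return Action.NONE;
        }
        return estimatedStateBytes >= STATE_SIZE_THRESHOLD_BYTES
                ? Action.DISCARD_DATA
                : Action.LIMIT_SOURCE;
    }
}
```

      A caller would sample `totalGcTimeMillis()` periodically, pass the difference between two samples to `gcPressure`, and then call `decide` with whatever state size estimate is available.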

On November 20, 2019, at 3:16 PM, sysukelee <[hidden email]> wrote:

Hi Liu,
We monitor the JVM used/max heap memory to decide whether to rescale the job.
To avoid problems caused by OOM, you don't need to know exactly how much memory is used by state.
Focusing on overall JVM memory use is more reasonable.
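
For illustration, the used/max heap check can be sketched with the standard `java.lang.management` API. This is a minimal sketch, not our actual monitoring setup: the class name and the 0.8 threshold are assumptions. Flink also reports heap usage through its system metrics (for example `Status.JVM.Memory.Heap.Used` and `Status.JVM.Memory.Heap.Max`), so the same ratio could be computed in an external monitoring system instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapUsageCheck {
    // Hypothetical threshold: consider rescaling above 80% heap usage.
    static final double RESCALE_THRESHOLD = 0.8;

    // Ratio of used to max heap; returns -1.0 when the max is undefined
    // (MemoryUsage.getMax() returns -1 in that case).
    static double usedRatio(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / maxBytes;
    }

    static boolean shouldRescale(double ratio) {
        return ratio >= RESCALE_THRESHOLD;
    }

    public static void main(String[] args) {
        // Reads the heap usage of the JVM this code runs in.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        double ratio = usedRatio(heap.getUsed(), heap.getMax());
        System.out.printf("heap used/max = %.2f, rescale = %b%n", ratio, shouldRescale(ratio));
    }
}
```

A single sample right before a garbage collection can look alarming, so in practice the ratio is better judged over a window of samples.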
On 11/20/2019 15:08, [hidden email] wrote:
We are using Flink 1.6.2. For the filesystem backend, we want to monitor
the state size in memory. Once the state grows too large, we can be
notified and take measures such as rescaling the job; otherwise the job may
fail because of memory.
We have tried to get the memory usage of the JVM, such as GC throughput.
In our case, state can vary greatly at the peak, so maybe I can refer to
the state memory size.
I checked the metrics and code, but didn't find any information about
the state memory size. I can get the checkpoint size, but that is the
serialized result, which does not reflect the running state in memory. Can
anyone give me some suggestions? Thank you very much.