Optimizing Heap usage for Streaming Jobs

Posted by Andrea Spina on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Optimizing-Heap-usage-for-Streaming-Jobs-tp28257.html

Dear community,
I'm currently debugging my job with flink-1.6.4, which manages big states so that I use rocksdb to host the data. I know that for streaming purpose Flink does not use the managed memory features, but anyway I'm asking then how can I optimize its usage anyway [2]. I reserve to each TM (three TM plus one JM) 40G heap memory.


then I configure each TM with the following:
taskmanager.heap.size: 40960m
taskmanager.memory.fraction: 0.7
taskmanager.memory.preallocate: false
taskmanager.numberOfTaskSlots: 16

which is more or less using the defaults, preallocate and fraction as well.
In the picture, I report the heap usage by profiling the job and it does not look really nice though.

Screenshot 2019-06-14 at 16.32.49.png

I'm still avoiding to pimp rocksdb configuration as nicely explained here [1] but I want to optimize the usage of the heap first. How can I let the GC start later than 10-15GB level? I'd expect the usage grow to the upper limit and then GC'ed close to the upper bound.

The second question I'd to ask is: should I decrease the reserved heap in order to let rocksdb use more native memory?

thank you

[1] - https://www.ververica.com/blog/manage-rocksdb-memory-size-apache-flink
[2] - https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#taskmanager-memory-fraction

Andrea Spina
Head of R&D @ Radicalbit Srl
Via Giovanni Battista Pirelli 11, 20124, Milano - IT