http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Optimizing-Heap-usage-for-Streaming-Jobs-tp28257.html
Dear community,
I'm currently debugging my job with flink-1.6.4, which manages big states so that I use rocksdb to host the data. I know that for streaming purpose Flink does not use the managed memory features, but anyway I'm asking then how can I optimize its usage anyway [2]. I reserve to each TM (three TM plus one JM) 40G heap memory.
then I configure each TM with the following:
taskmanager.heap.size: 40960m
taskmanager.memory.fraction: 0.7
taskmanager.memory.preallocate: false
taskmanager.numberOfTaskSlots: 16
which is more or less using the defaults, preallocate and fraction as well.
In the picture, I report the heap usage by profiling the job and it does not look really nice though.
Andrea Spina
Head of R&D @ Radicalbit Srl
Via Giovanni Battista Pirelli 11, 20124, Milano - IT