Hi everyone,
I'm running the Flink Streaming WordCount example with roughly 100M words (about 1 GB of input data) on Flink 1.7.1. I wanted to investigate memory usage, so I connected JVisualVM via JMX and ran the job. While doing that, I noticed that during RecordWriter.emit(), a long time is spent waiting inside LocalBufferPool.requestMemorySegment() (see attached screenshot). I also have the JVisualVM snapshot, but I'm not sure I'm comfortable sharing it on a public mailing list.
I'm running on three nodes in standalone mode, with one TM per node and one slot per TM. Furthermore, I set the taskmanager.network.memory.max property to 5gb, which is also reflected in the logs:
2019-01-09 11:50:15,975 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool - Allocated 4906 MB for network buffer pool (number of memory segments: 157013, bytes per segment: 32768).
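As a quick sanity check on that log line, the reported pool size matches the segment count multiplied by the segment size. This is only a minimal sketch of the arithmetic: the class and method names are made up for illustration, and the two constants are copied from the log line, not read from Flink.

```java
public class NetworkBufferMath {
    // Values copied from the NetworkBufferPool log line; not queried from Flink.
    static final long SEGMENTS = 157_013L;
    static final long BYTES_PER_SEGMENT = 32_768L;

    // Total size of the network buffer pool in bytes.
    static long totalBytes() {
        return SEGMENTS * BYTES_PER_SEGMENT;
    }

    // Same size in whole megabytes (1 MB = 1024 * 1024 bytes), rounded down.
    static long totalMegabytes() {
        return totalBytes() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Prints: 5145001984 bytes = 4906 MB
        System.out.println(totalBytes() + " bytes = " + totalMegabytes() + " MB");
    }
}
```

So the 4906 MB in the log is exactly 157013 segments of 32768 bytes each, rounded down to whole megabytes.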
As the log line shows, I left the segment size at 32K. Now I'm really wondering why so much time is spent waiting for new buffer builders when plenty of segments should be available and only a few should be needed at any one time. I browsed the source a bit and it looked fine to me, since the buffers and builders appear to be recycled properly (I think).
What I'm not sure of is whether this wait actually affects overall execution time, since it occurs in a separate thread.
Is this something I need to worry about?
Thanks in advance and cheers
Robert