An addition to Netty's memory footprint

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

An addition to Netty's memory footprint

Kurt Young-2
Hi,

Ufuk had write up an excellent document about Netty's memory allocation [1] inside Flink, and i want to add one more note after running some large scale jobs.

The only inaccurate thing about [1] is how much memory will LengthFieldBasedFrameDecoder use. From our observations, it will cost at most 4M for each physical connection. 

Why(tl;dr): the reason is ByteToMessageDecoder which is the base class of LengthFieldBasedFrameDecoder used a Cumulator to save the bytes for further decoding. The Cumulator will try to discard some read bytes to make room in the buffer when channelReadComplete is triggered. In most cases, channelReadComplete will only be triggered by AbstractNioByteChannel after which has read "maxMessagesPerRead" times. The default value for maxMessagesPerRead is 16. So in worst case, the Cumulator will write up to 1M (64K * 16) data. And due to the logic of ByteBuf's discardSomeReadBytes, the Cumulator will expand to 4M.

We add an option to tune the maxMessagesPerRead, set it to 2 and everything works fine. I know Stephan is working on network improvements, it will be a good choice to replace the whole netty pipeline with Flink's own implementation. But I think we will face some similar logics when implementing, careful about this.

BTW, should we open a jira to add this config?