Re: Crash in a simple "mapper style" streaming app likely due to a memory leak ?

Posted by Stephan Ewen on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Crash-in-a-simple-mapper-style-streaming-app-likely-due-to-a-memory-leak-tp3476p3538.html

Hi Arnaud!

Java direct memory is tricky to debug. You can turn on the memory logging or check the TaskManager tab in the web dashboard; both report direct memory consumption.
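
If you want to cross-check those numbers from inside the JVM, here is a minimal sketch that reads the "direct" buffer pool through the standard java.lang.management API. The class name DirectMemoryProbe and the 16 MB test allocation are only illustrative assumptions; this is not what Flink's memory logger does internally, just one way to look at the same JVM counter.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper class, not part of Flink.
public class DirectMemoryProbe {

    /** Returns the bytes used by the JVM's "direct" buffer pool, or -1 if the pool is not exposed. */
    public static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();

        // Allocate an off-heap buffer so the counter visibly moves.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);

        long after = directMemoryUsed();
        System.out.println("direct memory before: " + before + " bytes");
        System.out.println("direct memory after:  " + after + " bytes");

        // Touch the buffer so it stays reachable until after the measurement.
        buffer.put(0, (byte) 1);
    }
}
```

Calling directMemoryUsed() periodically (for example from a logging thread) should show whether direct memory keeps growing over the lifetime of the job.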

One thing to look for is streams that are never closed. An unclosed stream holds on to its native resources until the Java object is garbage collected, which may be quite a bit later.
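
To illustrate the point about closing streams, here is a small sketch using try-with-resources, which closes the stream as soon as the block exits, even when an exception is thrown. The class CloseStreams and the method countLines are made-up names for the example, and it reads a local temp file rather than HDFS; the same pattern applies to any Closeable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical example class, not taken from the application discussed in this thread.
public class CloseStreams {

    /** Counts the lines in a file; the reader is closed when the try block exits. */
    public static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } // reader.close() runs here, on normal return and on exception alike
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("close-streams", ".txt");
        try {
            Files.write(tmp, Arrays.asList("a", "b", "c"), StandardCharsets.UTF_8);
            System.out.println("lines: " + countLines(tmp));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

For streams that live longer than one method call (for example, ones opened when a sink starts), the equivalent would be to close them explicitly in the function's close() method rather than relying on the garbage collector.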

Greetings,
Stephan


On Fri, Nov 13, 2015 at 3:59 PM, Ufuk Celebi <[hidden email]> wrote:

> On 13 Nov 2015, at 15:49, LINZ, Arnaud <[hidden email]> wrote:
>
> Hi Robert,
>
> Thanks, it works with 50% -- at least way past the previous crash point.
>
> In my opinion (I lack real metrics), the part that uses the most memory is the M2 mapper, instantiated once per slot.
> The most complex part is the sink (it uses a lot of HDFS files, flushing threads, etc.), but I expect the “RichSinkFunction” to be instantiated only once per slot? I’m really surprised by that memory usage; I will try using a monitoring app on the YARN JVM to understand it.

In general, it’s instantiated once per subtask. In your current deployment, that means one instance per slot.

– Ufuk