Hello,
I am running a streaming job on a small cluster, and after a few hours I noticed that my TaskManager processes were being killed by the OOM killer because they were using too much memory. After some monitoring, this is the current status:
So about 10GB of memory has been allocated by the process but is not accounted for by the JVM itself.
Some more info:
Another thing I noticed is that the job sometimes fails (due to external DB connectivity issues) and is restarted automatically as expected. But in some cases the failures also cause one or more of the following error logs:
java.lang.NullPointerException: null
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointThread.run(StreamTask.java:953) ~[flink-dist_2.10-1.1.4.jar:1.1.4]
I have 2 theories, and I hope to hear any ideas from you:
Thank you,
Avihai
Attached is the Native Memory Tracking output (jcmd <PID> VM.native_memory summary):
Total: reserved=44399603KB, committed=44180459KB
- Java Heap (reserved=40960000KB, committed=40960000KB)
    (mmap: reserved=40960000KB, committed=40960000KB)
- Class (reserved=134031KB, committed=132751KB)
    (classes #22310)
    (malloc=2959KB #43612)
    (mmap: reserved=131072KB, committed=129792KB)
- Thread (reserved=716331KB, committed=716331KB)
    (thread #694)
    (stack: reserved=712404KB, committed=712404KB)
    (malloc=2283KB #3483)
    (arena=1644KB #1387)
- Code (reserved=273273KB, committed=135409KB)
    (malloc=23673KB #30410)
    (mmap: reserved=249600KB, committed=111736KB)
- GC (reserved=1635902KB, committed=1635902KB)
    (malloc=83134KB #70605)
    (mmap: reserved=1552768KB, committed=1552768KB)
- Compiler (reserved=1634KB, committed=1634KB)
    (malloc=1504KB #2062)
    (arena=131KB #3)
- Internal (reserved=575283KB, committed=575283KB)
    (malloc=575251KB #106644)
    (mmap: reserved=32KB, committed=32KB)
- Symbol (reserved=16394KB, committed=16394KB)
    (malloc=14468KB #132075)
    (arena=1926KB #1)
- Native Memory Tracking (reserved=6516KB, committed=6516KB)
    (malloc=338KB #5024)
    (tracking overhead=6178KB)
- Arena Chunk (reserved=237KB, committed=237KB)
    (malloc=237KB)
- Unknown (reserved=80000KB, committed=0KB)
    (mmap: reserved=80000KB, committed=0KB)
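
For anyone who wants to repeat the comparison, here is a minimal sketch of how the gap between the process's resident memory and what the JVM tracks could be estimated from a summary like the one above. It assumes the summary text has been saved from jcmd and that the resident set size (RSS) is read separately from the OS. The function names and the RSS value in the usage lines are hypothetical, not taken from my measurements; only the three category lines are copied from the output above.

```python
import re

# Matches top-level category lines such as
# "- Java Heap (reserved=40960000KB, committed=40960000KB)".
# Nested lines ("(mmap: ...)") and the "Total:" line do not start with "-"
# and are therefore skipped.
CATEGORY = re.compile(
    r"^-\s*(?P<name>[^(]+?)\s*"
    r"\(reserved=(?P<reserved>\d+)KB, committed=(?P<committed>\d+)KB\)"
)


def committed_by_category(nmt_text):
    """Return {category name: committed KB} parsed from NMT summary text."""
    result = {}
    for line in nmt_text.splitlines():
        match = CATEGORY.match(line.strip())
        if match:
            result[match.group("name")] = int(match.group("committed"))
    return result


def untracked_kb(rss_kb, nmt_text):
    """Estimate memory (KB) held by the process but not tracked by NMT."""
    return rss_kb - sum(committed_by_category(nmt_text).values())


if __name__ == "__main__":
    # Three category lines copied from the summary above.
    sample = """\
- Java Heap (reserved=40960000KB, committed=40960000KB)
    (mmap: reserved=40960000KB, committed=40960000KB)
- Thread (reserved=716331KB, committed=716331KB)
- GC (reserved=1635902KB, committed=1635902KB)
"""
    hypothetical_rss_kb = 54_000_000  # placeholder, not a measured value
    print(committed_by_category(sample))
    print(untracked_kb(hypothetical_rss_kb, sample), "KB untracked")
```

Summing the committed values of all eleven categories in the full output gives 44180457KB, which agrees with the reported total of 44180459KB to within rounding, so whatever the process holds beyond that total is memory NMT does not see.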