java.lang.OutOfMemoryError: GC overhead limit exceeded

java.lang.OutOfMemoryError: GC overhead limit exceeded

bat man
Hi,

I am getting the OOM below. The job failed 4-5 times and then recovered.

java.lang.Exception: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.checkThrowSourceExecutionException(SourceStreamTask.java:212)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.performDefaultAction(SourceStreamTask.java:132)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded


Is there any way I can debug this, given that the job started running fine after a few restarts? What could be the reason behind this?

Thanks,
Hemant

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Xintong Song
Hi Hemant,

This exception generally suggests that the JVM is running out of heap memory. Per the official documentation [1], it means the amount of live data barely fits into the Java heap, leaving little free space for new allocations.

You can try to increase the heap size following these guides [2].

If a memory leak is suspected, you may need to dump the heap on OOMs and look for unexpected memory usage with profiling tools to understand where the memory is being consumed.
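
To get a rough feel for how close a JVM is to this condition, one can compare cumulative GC time with uptime. By default HotSpot raises this error when more than 98% of the time goes to GC while less than 2% of the heap is recovered. The sketch below only computes a coarse lifetime average from the standard `java.lang.management` beans of the JVM it runs in; the names `GcOverheadCheck`, `gcTimeFraction` and `totalGcMillis` are made up for illustration, and this is not how the JVM itself evaluates the limit.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports what share of this JVM's uptime was spent in GC.
class GcOverheadCheck {

    /** Fraction of elapsed time spent in GC, given cumulative GC millis and uptime millis. */
    static double gcTimeFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) Math.max(gcMillis, 0) / uptimeMillis;
    }

    /** Sums the cumulative collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.1f%%)%n",
                gc, uptime, 100.0 * gcTimeFraction(gc, uptime));
    }
}
```

A value that keeps climbing towards 100% across successive readings would be consistent with the heap being too small for the live data.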


Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

bat man
Hi Xintong Song,
I tried using the Java options to generate a heap dump, referring to the docs [1], in flink-conf.yaml. However, after adding this the task manager containers are not coming up. Note that I am using EMR. Am I doing anything wrong here?

env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump.hprof"

Thanks,
Hemant






Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Tamir Sagi
Hey Bruce Wayne,

I can suggest three options for getting the heap dump:

  1. Don't define a file name in the dump path; just set the folder. The JVM will automatically create the hprof file in case of an OOM.
    env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"

  2. If #1 still does not work for some reason, or you don't want to wait until the error occurs, you can dump the heap at any time using a CLI tool.
    Install openjdk-11-jdk and use the jcmd tool to dump the heap: jcmd <process id> GC.heap_dump /tmp/heap-dump.hprof

  3. If you don't want to deal with CLI tools, connect VisualVM to the Java process:
    - Download the application.
    - Run the Java process with the following JVM args (the port number can be any available port):
      -Dcom.sun.management.jmxremote
      -Dcom.sun.management.jmxremote.port=9010
      -Dcom.sun.management.jmxremote.local.only=false
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false
      -Dcom.sun.management.jmxremote.rmi.port=9010
      -Djava.rmi.server.hostname=localhost
    Note: place these flags before -jar <name>.jar, i.e. java <args> -jar <jar-name>.jar
    - Connect with VisualVM: just add a JMX connection, where the address in your case is localhost:9010.

    You can watch the heap in real time and create a heap dump from VisualVM.
    Note: If you are running the Flink application on top of Docker/Kubernetes, you need port forwarding.

Tamir
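
If a GUI is not convenient, the JMX port opened in option 3 can also be used from a few lines of Java to trigger the dump. This is only a sketch under the assumptions above (unauthenticated JMX on localhost:9010, a HotSpot JVM on the other end); the class name `RemoteHeapDump` and its helper methods are hypothetical, and the dump file is written on the machine where the target JVM runs, not where this client runs.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical client: asks a JVM reachable over JMX to write a heap dump.
class RemoteHeapDump {

    /** Builds the standard RMI-based JMX service URL for a host and port. */
    static String jmxUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Asks the JVM behind conn to dump its heap to path on its own filesystem. */
    static void dumpHeap(MBeanServerConnection conn, String path, boolean liveOnly)
            throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                conn, "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // The target file must not already exist; recent JDKs also expect a .hprof suffix.
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9010;
        String path = args.length > 2 ? args[2] : "/tmp/heap-dump.hprof";
        JMXServiceURL url = new JMXServiceURL(jmxUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            dumpHeap(connector.getMBeanServerConnection(), path, true);
        }
    }
}
```

Passing `true` for `liveOnly` dumps only reachable objects, which usually keeps the file smaller and is what you want when hunting a leak.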


Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Xintong Song
In reply to this post by bat man
Hi Hemant,
I don't see any problem in your settings. Are there any exceptions suggesting why the TM containers are not coming up?

Thank you~

Xintong Song




Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

bat man
The Java options should not be wrapped in double quotes. That was the issue. After removing them I was able to generate the heap dump, and based on the dump I have made some changes in the code to fix this issue.

This worked:

env.java.opts: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump.hprof

Thanks.
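
One plausible explanation for the quoting problem, assuming the configuration file is read by a simple line-based "key: value" splitter rather than a full YAML parser (an assumption, not something confirmed in this thread): the quote characters are not stripped and become part of the option string. The sketch below illustrates that effect; `ConfLineDemo` and `parseLine` are hypothetical names and are not Flink's actual parsing code.

```java
// Hypothetical illustration of a naive "key: value" config line parser.
class ConfLineDemo {

    /** Splits a line at the first ": " and trims both parts; no YAML unquoting is applied. */
    static String[] parseLine(String line) {
        String[] kv = line.split(": ", 2);
        if (kv.length != 2) {
            throw new IllegalArgumentException("Not a key: value line: " + line);
        }
        return new String[] { kv[0].trim(), kv[1].trim() };
    }

    public static void main(String[] args) {
        String quoted = "env.java.opts: \"-XX:+HeapDumpOnOutOfMemoryError "
                + "-XX:HeapDumpPath=/tmp/dump.hprof\"";
        String unquoted = "env.java.opts: -XX:+HeapDumpOnOutOfMemoryError "
                + "-XX:HeapDumpPath=/tmp/dump.hprof";
        // With this kind of parser the first value still carries the literal quotes.
        System.out.println(parseLine(quoted)[1]);
        System.out.println(parseLine(unquoted)[1]);
    }
}
```

Under that assumption the quoted value would reach the JVM launch command with the quotes still attached, which could prevent the containers from starting.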
