GC overhead limit exceeded when using Prometheus exporter

bat man
Hi there,

I am facing java.lang.OutOfMemoryError: GC overhead limit exceeded when using the Prometheus exporter with Flink 1.9 on AWS EMR emr-5.28.1. My other jobs run fine; only this specific job fails, with the stack trace below.

Exception in thread "pool-3-thread-2" java.lang.OutOfMemoryError: GC overhead limit exceeded
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:133)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
at java.io.Writer.write(Writer.java:157)
at org.apache.flink.shaded.io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:40)
at org.apache.flink.shaded.io.prometheus.client.exporter.HTTPServer$HTTPMetricHandler.handle(HTTPServer.java:59)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Thanks,
Hemant

Re: GC overhead limit exceeded when using Prometheus exporter

Till Rohrmann
Hi Hemant,

Have you tried running a newer Flink version? Can you create a heap dump when the process fails? This could help us dig into whether there is a memory leak.
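
One possible way to capture such a dump automatically, given here as a sketch under assumptions rather than a verified setup for this cluster: pass the standard HotSpot heap-dump flags to the Flink JVMs through the env.java.opts option in flink-conf.yaml. The dump directory below is a hypothetical placeholder; it would need to exist on every node and have enough free space to hold a file roughly the size of the heap.

```yaml
# flink-conf.yaml (sketch; /tmp/flink-heap-dumps is a hypothetical path)
# HeapDumpOnOutOfMemoryError makes the JVM write an .hprof file when an
# OutOfMemoryError is thrown; HeapDumpPath selects where that file goes.
env.java.opts: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/flink-heap-dumps
```

The resulting .hprof file can then be opened in a heap analyzer such as Eclipse MAT to see which objects dominate the heap.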

Cheers,
Till

Re: GC overhead limit exceeded when using Prometheus exporter

bat man
Hi Till,

I tried increasing the task manager memory to 4 GB, but unfortunately the EMR nodes are going down, so I am investigating that for now. I will share the results if this works out; if not, I will get the heap dump.

Thanks,
Hemant
