Flink 1.8.3 Kubernetes POD OOM


Flink 1.8.3 Kubernetes POD OOM

Josson Paul
Cluster type: Standalone cluster
Job Type: Streaming
JVM memory: 26.2 GB
POD memory: 33 GB
CPU: 10 Cores
GC: G1GC
Flink Version: 1.8.3
State backend: File based
NETWORK_BUFFERS_MEMORY_FRACTION: 0.02f of the heap
We are not accessing direct memory from the application; only Flink uses direct memory.

We notice that in Flink 1.8.3, over a period of 30 minutes, the POD is killed with OOM, while the JVM heap stays within its limit.
We read from Kafka and use windows in the application. Our sink is either Kafka or Elasticsearch.
The same application/job worked perfectly in Flink 1.4.1 with the same input and output rates.
There is no back pressure.
I have attached a few Grafana charts as a PDF.
Any idea why the off-heap / outside-JVM memory keeps growing and eventually reaches the limit?
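
For reference, the job has roughly the following shape (a minimal sketch only; topic names, the window size, the key selector, and the broker address below are illustrative, not our actual values):

// Minimal sketch of the job shape on the Flink 1.8 DataStream API.
// Topic names, window size, key selector and broker address are illustrative only.
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class StreamingJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // checkpoints go to the file-based backend

        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers", "kafka:9092");
        kafkaProps.setProperty("group.id", "example-group");

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), kafkaProps));

        DataStream<String> aggregated = events
                .keyBy(value -> value)           // key extraction is application specific
                .timeWindow(Time.minutes(1))     // windowed aggregation held in heap state
                .reduce((a, b) -> a + b);

        // Sink is Kafka here; some outputs go to an Elasticsearch sink instead.
        aggregated.addSink(
                new FlinkKafkaProducer<>("output-topic", new SimpleStringSchema(), kafkaProps));

        env.execute("streaming-job-sketch");
    }
}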

- Java Heap (reserved=26845184KB, committed=26845184KB)
(mmap: reserved=26845184KB, committed=26845184KB)

- Class (reserved=1241866KB, committed=219686KB)
(classes #36599)
(malloc=4874KB #74568)
(mmap: reserved=1236992KB, committed=214812KB)

- Thread (reserved=394394KB, committed=394394KB)
(thread #383)
(stack: reserved=392696KB, committed=392696KB)
(malloc=1250KB #1920)
(arena=448KB #764)

- Code (reserved=272178KB, committed=137954KB)
(malloc=22578KB #33442)
(mmap: reserved=249600KB, committed=115376KB)

- GC (reserved=1365088KB, committed=1365088KB)
(malloc=336112KB #1130298)
(mmap: reserved=1028976KB, committed=1028976KB)



--
Thanks
Josson

Attachment: memory_issue.pdf (1M)

Re: Flink 1.8.3 Kubernetes POD OOM

Fabian Hueske-2
Hi Josson,

I don't have much experience setting memory bounds in Kubernetes myself, but my colleague Andrey (in CC) reworked Flink's memory configuration for the last release to ease configuration in container environments.
He might be able to help.

Best, Fabian


Re: Flink 1.8.3 Kubernetes POD OOM

Andrey Zagrebin-5
Hi Josson,

Do you use a state backend? Is it RocksDB?

Best,
Andrey


Re: Flink 1.8.3 Kubernetes POD OOM

Josson Paul
Hi Andrey,
  We don't use RocksDB. As I said in the original email, I am using the file-based backend. Even though our cluster runs on Kubernetes, our Flink cluster uses Flink's standalone resource manager; we have not yet integrated our Flink deployment with Kubernetes.

Thanks,
Josson


Re: Flink 1.8.3 Kubernetes POD OOM

Josson Paul
Hi Andrey,
  To clarify the above email: I am using heap-based state, not RocksDB.

Thanks,
Josson
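
In other words, the state backend is configured roughly like the sketch below (the checkpoint path is illustrative); state objects stay on the JVM heap and only snapshots are written to files:

// Sketch of the heap-based state backend setup (checkpoint path is illustrative).
// FsStateBackend keeps working state as objects on the JVM heap and writes
// checkpoints/savepoints to files; RocksDBStateBackend (not used here) would
// keep working state in native, off-heap memory instead.
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HeapStateBackendSketch {
    public static StreamExecutionEnvironment configure() {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // The second argument enables asynchronous snapshots of the heap state.
        env.setStateBackend(new FsStateBackend("file:///opt/flink/checkpoints", true));
        return env;
    }
}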


Re: Flink 1.8.3 Kubernetes POD OOM

Andrey Zagrebin-4
Hi Josson,

Thanks for the details. Sorry, I overlooked that; you indeed mentioned the file-based backend.

Looking at the Flink memory model [1], I do not see any problems related to the types of memory consumption we model in Flink.
Direct memory consumption by the network stack corresponds to your configured fraction (0.02f), and the JVM heap cannot cause the problem.
I am not aware of any other types of memory consumption in Flink 1.8.

Nonetheless, there is no way to control all types of memory consumption, especially native memory allocated either by user code or by the JVM (if you do not use RocksDB, Flink barely uses native memory explicitly).
Examples (not exhaustive):
- native libraries in user code or its dependencies which allocate off-heap memory, e.g. via malloc (detecting this would require an OS-level process dump)
- JVM metaspace, thread stacks, GC overhead, etc. (none of which Flink 1.8 limits via JVM args)

Recently, we discovered some class-loading leaks (JVM metaspace), e.g. [2] or [3].
Since 1.10, Flink limits JVM metaspace and direct memory, so you would get a concrete OOM exception before the container dies.
Maybe the Kafka or Elasticsearch connector clients were updated along with Flink 1.8 and caused some leaks.
I cc'ed Gordon and Piotr in case they have an idea.

I suggest decreasing the POD memory, noting the consumed memory of each type at the moment the container dies
(as I suppose you already did), and then increasing the POD memory several times until you see which type of memory
consumption keeps growing until the OOM while the other types hopefully stabilise at some level.
Then you could take a dump of that ever-growing type of memory consumption to analyse whether there is a memory leak.
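
One way to watch those memory types from inside the JVM while you vary the POD memory (just a sketch, not something Flink provides out of the box) is to periodically log the JVM's buffer-pool and non-heap memory-pool MXBeans:

// Sketch: periodically log direct/mapped buffer pools and non-heap memory pools
// (e.g. Metaspace, Code Cache) so the ever-growing consumer can be identified.
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NativeMemoryLogger {
    public static void start() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "native-memory-logger");
                    t.setDaemon(true);
                    return t;
                });
        scheduler.scheduleAtFixedRate(() -> {
            for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                // "direct" covers ByteBuffer.allocateDirect (network stack, Netty, connectors)
                System.out.printf("buffer pool %s: used=%d bytes, capacity=%d bytes%n",
                        pool.getName(), pool.getMemoryUsed(), pool.getTotalCapacity());
            }
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                if (pool.getType() != MemoryType.HEAP && pool.getUsage() != null) {
                    System.out.printf("non-heap pool %s: used=%d bytes%n",
                            pool.getName(), pool.getUsage().getUsed());
                }
            }
        }, 0, 60, TimeUnit.SECONDS);
    }
}

Note that this only covers memory the JVM tracks itself; native allocations made by libraries via malloc would still require native memory tracking or an OS-level process dump.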

Best,
Andrey

[3] https://issues.apache.org/jira/browse/FLINK-11205
