Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

sohimankotia
I am running a Flink streaming job with parallelism 1.

After about 4 hours the job suddenly failed with:

Container container_e39_1492083788459_0676_01_000002 is completed with diagnostics: Container [pid=79546,containerID=container_e39_1492083788459_0676_01_000002] is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

       
I monitored the task manager with jmap and found nothing that could cause an out-of-memory condition. There is no out-of-memory error in the logs either.
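
For anyone comparing several of these failures, here is a minimal sketch that pulls the numbers out of the diagnostic line quoted above. The function name `parse_usage` and the returned keys are made up for illustration; the regular expression assumes the message has exactly the wording shown in this post.

```python
import re

# Conversion factors to GB for the units that may appear in the message.
_TO_GB = {"KB": 1 / (1024 * 1024), "MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}

# Assumes the exact wording of the diagnostic quoted in this thread.
_USAGE_RE = re.compile(
    r"Current usage: ([\d.]+) ([KMGT]B) of ([\d.]+) ([KMGT]B) physical memory used; "
    r"([\d.]+) ([KMGT]B) of ([\d.]+) ([KMGT]B) virtual memory used"
)


def parse_usage(message):
    """Return used/limit values in GB, or None if the message does not match."""
    match = _USAGE_RE.search(message)
    if match is None:
        return None
    g = match.groups()
    # Groups come in (number, unit) pairs: pmem used, pmem limit, vmem used, vmem limit.
    values = [float(g[i]) * _TO_GB[g[i + 1]] for i in range(0, 8, 2)]
    keys = ["pmem_used_gb", "pmem_limit_gb", "vmem_used_gb", "vmem_limit_gb"]
    result = dict(zip(keys, values))
    # Flag which limit was reached; here only the physical one is.
    result["pmem_exceeded"] = result["pmem_used_gb"] >= result["pmem_limit_gb"]
    result["vmem_exceeded"] = result["vmem_used_gb"] >= result["vmem_limit_gb"]
    return result


if __name__ == "__main__":
    line = (
        "Container [pid=79546,containerID=container_e39_1492083788459_0676_01_000002] "
        "is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB "
        "physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container."
    )
    print(parse_usage(line))
```

In this case only the physical limit is reached. That limit applies to the memory of the container process as a whole, which is not the same thing as the Java heap that jmap inspects.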

Re: Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

Shannon Carey
I've had similar problems running Flink on YARN. The Flink task manager fails, Flink can't restart the jobs because there aren't enough slots, and eventually YARN decides to terminate Flink. You then lose all your jobs and state, because Flink regards it as a graceful shutdown. My latest attempt to solve the issue was to disable the pmem and vmem checks in YARN with the "yarn.nodemanager.pmem-check-enabled" and "yarn.nodemanager.vmem-check-enabled" settings. It's been OK so far, but I'm not sure whether it was a good idea.
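
For reference, a small sketch of what those two settings might look like as yarn-site.xml properties, generated with Python's standard library so the fragment can be checked. The helper name `yarn_properties_xml` is hypothetical, and the value "false" is an assumption about how the checks were turned off. Disabling the checks means YARN no longer kills containers that exceed their memory limits, so treat this as a workaround sketch rather than a recommendation.

```python
import xml.etree.ElementTree as ET


def yarn_properties_xml(settings):
    """Render a dict of property names and values as a <configuration> fragment."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent is only available on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed values: "false" turns each check off.
    print(yarn_properties_xml({
        "yarn.nodemanager.pmem-check-enabled": "false",
        "yarn.nodemanager.vmem-check-enabled": "false",
    }))
```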

Of course, I'm not sure that's exactly the problem you're having, because I don't know whether you're running Flink on YARN.

-Shannon

Re: Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

sohimankotia
Hi Shannon,

Thanks for your response.

First, yes, I am running Flink on YARN, and my job runs with parallelism 1.

Here are a few points that may help narrow down a solution:

1. Other jobs run on the same cluster with parallelism greater than 1, and they run fine.
2. The application shows no sign of running out of memory.
3. With 2 GB of memory the job fails after 4-6 hours; with 4 GB it fails after 21-24 hours.
4. I took a jmap heap dump of the task manager process every 15 minutes, and everything looks fine.
5. Checkpointing runs every 30 seconds; the state size is 1.17 KB, and there is no backpressure.
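
Since the heap dumps in point 4 look fine but the container is still killed for physical memory, one thing that could be tried is tracking the resident memory of the task manager process itself. Below is a rough sketch, assuming a Linux host where /proc/<pid>/status is readable; the function names and sampling defaults are made up for illustration. It samples a single process only, not the whole process tree of the container.

```python
import re
import time
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, interval_s=60, samples=5):
    """Collect (timestamp, rss_kb) readings for one process (Linux only)."""
    readings = []
    for i in range(samples):
        text = Path(f"/proc/{pid}/status").read_text()
        readings.append((time.time(), parse_vmrss_kb(text)))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings


if __name__ == "__main__":
    import os

    # Demonstration against the current process; pass the task manager pid instead.
    for ts, rss_kb in sample_rss(os.getpid(), interval_s=1, samples=3):
        print(f"{ts:.0f} rss_kb={rss_kb}")
```

If the resident size keeps growing while the heap stays flat, the growth is likely outside the Java heap, which a heap dump would not show.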

A couple of naive questions:

1. Can running with parallelism 1 cause any problem?
2. I assume setting those two properties is not recommended?