Re: Task Manager was lost/killed due to full GC

Posted by Stephan Ewen on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Task-Manager-was-lost-killed-due-to-full-GC-tp15386p15761.html

Hi!

The garbage collection stats actually look okay, not terribly bad. I am almost surprised that this seems to cause failures.

Can you check whether you find messages in the TM / JM log about heartbeat timeouts, actor systems being "gated" or "quarantined"?

It would also be interesting to know how the program is actually set up: where does the data from the files you read go?
Do you just keep the records as objects in lists, or do you emit them from the operators?
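
To make the question concrete, here is a minimal sketch of the two patterns in plain Java. It does not use the Flink API: the Consumer<String> only stands in for Flink's Collector, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BufferVsEmit {

    // Pattern 1: keep every parsed record in a list until the input is exhausted.
    // All records stay reachable at once, so heap use grows with the input size.
    static List<String> readAllThenReturn(Iterable<String> lines) {
        List<String> buffered = new ArrayList<>();
        for (String line : lines) {
            buffered.add(line.trim());
        }
        return buffered;
    }

    // Pattern 2: hand each record downstream as soon as it is parsed.
    // Nothing is retained here, so a record can be collected once the consumer is done with it.
    // (The Consumer is a hypothetical stand-in for an operator's output collector.)
    static void emitAsYouGo(Iterable<String> lines, Consumer<String> out) {
        for (String line : lines) {
            out.accept(line.trim());
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(" a ", " b ", " c ");

        List<String> buffered = readAllThenReturn(lines);
        System.out.println("buffered records: " + buffered.size());

        // Count records as they pass through instead of storing them.
        int[] emitted = {0};
        emitAsYouGo(lines, record -> emitted[0]++);
        System.out.println("emitted records: " + emitted[0]);
    }
}
```

If the job looks more like the first pattern, the lists themselves may be what fills the heap, independent of how the memory settings are tuned.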

Best,
Stephan


On Wed, Sep 20, 2017 at 1:58 AM, ShB <[hidden email]> wrote:
Thanks for your response!

Recommendation to decrease allotted memory? Which allotted memory should be
decreased?

I tried decreasing taskmanager.memory.fraction to give more memory to
user-managed operations, but that doesn't work beyond a point. I also tried
increasing containerized.heap-cutoff-ratio, but that didn't work either.
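
For reference, both keys are set in flink-conf.yaml. The fragment below is only a sketch: the values are illustrative assumptions, not the ones used in this job.

```yaml
# Share of TaskManager memory reserved for Flink's managed memory.
# A lower value leaves more heap for objects created by user code.
taskmanager.memory.fraction: 0.5

# Share of the container's memory kept outside the JVM heap.
# A higher value shrinks the heap and leaves more room for off-heap use.
containerized.heap-cutoff-ratio: 0.3
```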

What eventually solved the problem was increasing parallelism - throwing in
many more task managers.