Re: heap dump shows StoppableSourceStreamTask retained by java.lang.finalizer

Posted by Steven Wu on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/heap-dump-shows-StoppableSourceStreamTask-retained-by-java-lang-finalizer-tp15535p15731.html

How did you kill the TaskManagers? I assume you didn't kill the JVM process because otherwise you wouldn't see the finalizer objects piling up.

Till, I configure Chao Monkey to always kill the newest/same TaskManager. So other N-1 TaskManagers stayed up during the whole process. Each of them experience a job restart for each kill. Then I saw the deferred memory cleanup by finalizer. 

On Tue, Sep 19, 2017 at 9:58 AM, Steven Wu <[hidden email]> wrote:
Stephan, agree that it is not a real memory leak. I haven't found it affecting the system. so it is sth odd for now.

but if it is not really necessary, why do we want to defer memory release with unpredictable behavior? can StreamTask stop() method take care of the cleanup work and don't need to rely on finalizer() or PhantomReference?

On Tue, Sep 19, 2017 at 2:56 AM, Stephan Ewen <[hidden email]> wrote:
Hi!

From my understanding, overriding finalize() still has some use cases and is valid if done correctly, (although PhantomReference has more control over the cleanup process). finalize() is still used in JDK classes as well.

Whenever one overrides finalize(), the object cannot be immediately garbage collected because the finalize() method may make it reachable again. It results in the following life cycle:

  1) object becomes unreachable, is detected eligible for GC
  2) In the GC cycle, object is NOT collected, but finalize() is called
  3) If object is still not reachable, it will be collected in the subsequent GC cycle

In essence, objects that override finalize() stay for one more GC cycle. That may be what you are seeing. It should not be a real memory leak, but deferred memory release.

Is this a problem that is affecting the system, or only something that seems odd for now?

If you are very concerned about this, would you be up to contribute a change that uses a PhantomReference and Reference Queue for cleanup instead?

Stephan


On Tue, Sep 19, 2017 at 12:56 AM, Till Rohrmann <[hidden email]> wrote:
Hi Steven,

the finalize method in StreamTask acts as a safety net in case the services of the StreamTask haven't been properly shut down. In the code, however, it looks as if the TimerService, for example, is always being stopped in the finally block of the invoke method. Thus, it might not be necessary to have the finalize method as a safety net.

How did you kill the TaskManagers? I assume you didn't kill the JVM process because otherwise you wouldn't see the finalizer objects piling up.

I think that you can create a JIRA issue for removing the finalizer method.

Cheers,
Till



On Thu, Sep 14, 2017 at 12:26 PM, Fabian Hueske <[hidden email]> wrote:
Hi Steven,

thanks for reporting this issue.
Looping in Till who's more familiar with the task lifecycles.

Thanks, Fabian

2017-09-12 7:08 GMT+02:00 Steven Wu <[hidden email]>:
Hi ,

I was using Chaos Monkey to test Flink's behavior against frequent killing of task manager nodes. I found that stopped/disposed StreamTask got retained by java finalizer. It is kind like a memory leak. Since each StreamTask retains 2.6 MB memory. With 20 kills (and job restarts) for 8-CPU container, there are 2.6 * 20 * 8 MB retained in heap.

Inline image 1

finalize() is generally not recommended for cleanup, because "Finalizers are unpredictable, often dangerous, and generally unnecessary", quoted from Joshua Bloch's book.

This code from StreamTask.java seems to be the cause. Is it necessary? can it be removed? We are using flink-1.2 release branch. But I see the same code in flink-1.3 and master branch

/**
* The finalize method shuts down the timer. This is a fail-safe shutdown, in case the original
* shutdown method was never called.
*
* <p>
* This should not be relied upon! It will cause shutdown to happen much later than if manual
* shutdown is attempted, and cause threads to linger for longer than needed.
*/
@Override
protected void finalize() throws Throwable {
super.finalize();
if (timerService != null) {
if (!timerService.isTerminated()) {
LOG.info("Timer service is shutting down.");
timerService.shutdownService();
}
}

cancelables.close();
}

Thanks,
Steven