Hello,
Every time I deploy a Flink job the code cache increases, which is expected. However, when I stop and start the job, or when it restarts, the code cache continues to increase.

Screenshot_2018-12-11_at_11.png <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t612/Screenshot_2018-12-11_at_11.png>

I've added the flags "-XX:+PrintCompilation -XX:ReservedCodeCacheSize=350m -XX:-UseCodeCacheFlushing" to the Flink taskmanagers and jobmanagers, but the cache doesn't decrease very much, as depicted in the screenshot above. Even if I stop all the jobs, the cache doesn't decrease. This gets to a point where I get the error "CodeCache is full. Compiler has been disabled".

I've attached the taskmanager's output with the "-XX:+PrintCompilation" flag activated.

flink-flink-taskexecutor.out <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t612/flink-flink-taskexecutor.out>

Flink: 1.6.2
Java: openjdk version "1.8.0_191"

Best Regards,
Pedro Chaves

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
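To track the growth described above without relying on screenshots, the code cache usage can be read from inside the JVM through the standard `java.lang.management` API. The following is a minimal sketch, not part of the original report: the class name `CodeCacheUsage` is hypothetical, and the pool-name matching is an assumption (HotSpot on Java 8 reports a single "Code Cache" pool, newer JVMs may report several "CodeHeap" pools).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.Locale;

public class CodeCacheUsage {

    // Sums the used bytes of all non-heap pools that look like JIT code cache pools.
    // The name matching is an assumption; pool names are JVM-specific.
    static long codeCacheUsedBytes() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue;
            }
            String name = pool.getName().replace(" ", "").toLowerCase(Locale.ROOT);
            if (name.contains("codecache") || name.contains("codeheap")) {
                MemoryUsage usage = pool.getUsage(); // may be null if the pool is not valid
                if (usage != null) {
                    used += usage.getUsed();
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println("Code cache used (bytes): " + codeCacheUsedBytes());
    }
}
```

Logging this value after each job start and stop would show whether the cache shrinks at all. Note that in HotSpot's flag syntax a minus sign after `-XX:` switches an option off, so `-XX:-UseCodeCacheFlushing` as quoted above disables code cache flushing rather than enabling it.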
Hi,
In general, Flink uses a user-code class loader for job-specific code, and the lifecycle of that class loader should end with the job. This usually means that job-related code can be removed after the job has finished. However, objects of a class that was loaded by the user-code class loader must no longer be referenced from anywhere after the job has finished, or else the user-code class loader cannot be freed. Whether that is the case depends on the user code and the dependencies it uses; e.g. the user code might register some objects somewhere and not remove them by the end of the job. This would prevent the user code from being freed and result in a leak.

To figure out the root cause, you can take and analyse a heap dump for leaking class loaders; e.g. [1] and other sources on the web go deeper into this topic.

Best,
Stefan
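The leak pattern described in the reply above (an object registered somewhere long-lived and never removed) can be illustrated with a small plain-Java sketch. It does not use any Flink API; `LeakSketch`, `REGISTRY` and `JobResource` are hypothetical names, and the static set merely stands in for any JVM-wide structure that outlives the job.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LeakSketch {

    // Hypothetical JVM-wide registry, standing in for anything that outlives a job
    // (a static cache, a listener list, a thread pool owned by a shared library).
    static final Set<Object> REGISTRY = ConcurrentHashMap.newKeySet();

    // Hypothetical job-scoped resource. While it is in REGISTRY, its class, and
    // therefore the class loader that loaded it, stays strongly reachable.
    static class JobResource implements AutoCloseable {
        JobResource() {
            REGISTRY.add(this);
        }

        @Override
        public void close() {
            // Deregistering on shutdown is what allows the class loader to be freed.
            REGISTRY.remove(this);
        }
    }

    public static void main(String[] args) {
        try (JobResource resource = new JobResource()) {
            System.out.println("Registered while running: " + REGISTRY.size());
        }
        System.out.println("Registered after close: " + REGISTRY.size());
    }
}
```

If `close()` were never called, the registry entry would keep the resource alive after the job ends, which is the kind of reference a heap dump analysis for leaking class loaders is meant to reveal.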
Hello Stefan,
Thank you for the reply.

I've taken a heap dump from a development cluster using jmap and analysed it. For the test, we restarted the cluster and then left a job running for a few minutes. After that, we restarted the job a couple of times and stopped it. After leaving the cluster with no running jobs for 20 min, we took a heap dump.

We found that a thread which consumes data from Kafka was still running, with a lot of finalizer calls, as depicted below.

<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t612/Screenshot_2018-12-11_at_17.png>

I will deploy a job without a Kafka consumer to see if the code cache still increases (all of our clusters have problems with the code cache; coincidentally, all of the deployed jobs read from Kafka).

Best Regards,
Pedro Chaves
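A lingering thread like the one found in the heap dump can also be spotted without a dump by listing the live threads of the JVM. The sketch below is only an illustration under stated assumptions: the class `LingeringThreads` is hypothetical, the name fragment "kafka" is a guess at how the consumer thread is named, and the code only sees threads of the JVM it runs in, so it would have to be executed inside the taskmanager process. From outside the process, a thread dump taken with the JDK's jstack tool gives similar information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LingeringThreads {

    // Returns a description of every live thread whose name contains the fragment
    // (case-insensitive), including the thread's context class loader.
    static List<String> findThreads(String nameFragment) {
        String needle = nameFragment.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.getName().toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(thread.getName()
                        + " [" + thread.getState() + "]"
                        + " contextClassLoader=" + thread.getContextClassLoader());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // "kafka" is an assumed name fragment; pass another one as the first argument.
        String fragment = args.length > 0 ? args[0] : "kafka";
        findThreads(fragment).forEach(System.out::println);
    }
}
```

A thread that is still listed after all jobs have been stopped, and whose context class loader is a job's user-code class loader, would be one candidate for what keeps that class loader from being freed.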
Hi,
Thanks for analyzing the problem. If it turns out that there is a problem with the termination of the Kafka sources, could you please open an issue for that with your results?

Best,
Stefan

> On 11. Dec 2018, at 19:04, PedroMrChaves <[hidden email]> wrote:
>
> Hello Stefan,
>
> Thank you for the reply.
>
> I've taken a heap dump from a development cluster using jmap and analysed
> it. For the test, we restarted the cluster and then left a job running for
> a few minutes. After that, we restarted the job a couple of times and
> stopped it. After leaving the cluster with no running jobs for 20 min, we
> took a heap dump.
>
> We found that a thread which consumes data from Kafka was still
> running, with a lot of finalizer calls, as depicted below.
>
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t612/Screenshot_2018-12-11_at_17.png>
>
> I will deploy a job without a Kafka consumer to see if the code cache still
> increases (all of our clusters have problems with the code cache;
> coincidentally, all of the deployed jobs read from Kafka).
>
> Best Regards,
> Pedro Chaves