Taskmanager process memory increasing always


YennieChen88
Hello,
        My use case is counting the number of successful and failed logins within
the last 1 hour, 10 min, 5 min, 3 min, 1 min, 10 seconds and 1 second, keyed by
login IP or device ID. Based on the counts for these different time ranges, we
predict the compliance of the next login.
        After various attempts, I chose sliding windows for the counting, e.g. a
1 hour window with a 1 min slide, a 10 min window with a 10 second slide, a
5 min window with a 5 second slide, and so on. In addition, I use RocksDB as
the state backend and have checkpointing enabled.
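
A rough way to see how much window state this setup can create: when the window size is a multiple of the slide, a sliding window assigner places each element into size / slide overlapping windows, and with keyed windows every (key, window) pair holds its own state until the window fires and is cleaned up. The sketch below only does that arithmetic for the three configurations named above. The class and method names are made up for illustration, and it does not model RocksDB's actual memory use.

```java
import java.time.Duration;

public class SlidingWindowFanout {

    /**
     * Number of overlapping sliding windows that contain a single element.
     * Assumes the window size is an exact multiple of the slide.
     */
    static long windowsPerElement(Duration size, Duration slide) {
        return size.toMillis() / slide.toMillis();
    }

    public static void main(String[] args) {
        // {window size, slide} pairs taken from the post above.
        Duration[][] configs = {
            {Duration.ofHours(1), Duration.ofMinutes(1)},
            {Duration.ofMinutes(10), Duration.ofSeconds(10)},
            {Duration.ofMinutes(5), Duration.ofSeconds(5)},
        };

        long total = 0;
        for (Duration[] c : configs) {
            long n = windowsPerElement(c[0], c[1]);
            total += n;
            System.out.println(c[0] + " window, " + c[1] + " slide: "
                    + n + " windows per element");
        }
        // Every login event for a new key touches this many window states.
        System.out.println("Window states touched per event: " + total);
    }
}
```

Each of the three configurations assigns one event to 60 windows, so a key that has not been seen before creates state in 180 windows across these three operators alone. That is consistent with memory growing faster when keys are random than when they stay in a fixed range.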
        However, I have now run into some problems.
        1. The RES memory of every TaskManager process keeps increasing and never
stabilizes, until the process is killed without any OOM exception in the
log.
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1520/memory_usage.png>
          After several tests, I found that the memory growth is related to the
key (IP or device ID). If the key values stay within a fixed range, process
memory stays stable. But if the key values keep changing randomly, memory keeps
increasing. In production, the login IP and device ID keys are effectively
random. We also found that logins drop after midnight, and memory is briefly
stable then, but it increases again during the day. A job I started 15 days ago
is still growing in memory. Is it normal for memory to keep increasing when the
keys change randomly?

        2. RocksDB seems to take up a lot of memory.
           If I switch from RocksDB to the file system state backend, memory drops
to around 30%. If no limit is configured, will the memory used by RocksDB
keep increasing indefinitely?

        3. Some TaskManagers in the Flink cluster do not run any tasks (no slots
are in use), but their memory also increases linearly after the job has been
running for several days. What are they using the memory for? I have no idea.
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1520/memory_usage2.png>

        Looking forward to your reply. Thank you.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Taskmanager process memory increasing always

Yan Zhou [FDS Science]

I have run into a similar issue. YARN kills the TaskManagers once their memory usage grows to the limit. I think RocksDB might be causing the problem. Is there any way to debug the memory usage of the RocksDB backend?


Best

Yan 


From: YennieChen88 <[hidden email]>
Sent: Wednesday, August 29, 2018 6:14:11 AM
To: [hidden email]
Subject: Taskmanager process memory increasing always
 

Re: Taskmanager process memory increasing always

YennieChen88
As far as I know, RocksDB mainly uses off-heap memory, which is hard for the
JVM to control. Maybe you can monitor the off-heap memory of the TaskManager
process with profiling tools such as gperftools.
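
Before reaching for a native profiler, a quick first check is to compare the resident set size of the process with the JVM heap; a large and growing gap points at off-heap (native) allocations. The sketch below is only an illustration under some assumptions: it runs on Linux, it reads `/proc/self/status` (so it measures the JVM it runs in, not a TaskManager), and the class and method names are invented for this example. For a running TaskManager you would read `/proc/<pid>/status` for its pid and take the heap figure from that process instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class RssVsHeap {

    /** Extracts the VmRSS value in kB from /proc/<pid>/status lines, or -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Line looks like "VmRSS:\t  123456 kB"; the first token is the number.
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        if (!Files.exists(status)) {
            System.out.println("No /proc filesystem here; this sketch assumes Linux.");
            return;
        }
        long rssKb = parseVmRssKb(Files.readAllLines(status));
        // Heap currently reserved by the JVM; everything beyond it is non-heap.
        long heapKb = Runtime.getRuntime().totalMemory() / 1024;

        System.out.println("RSS (kB):            " + rssKb);
        System.out.println("JVM heap (kB):       " + heapKb);
        System.out.println("RSS minus heap (kB): " + (rssKb - heapKb));
    }
}
```

The difference also includes metaspace, thread stacks and other JVM overhead, so it is an upper bound on what RocksDB uses, not a measurement of it.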


