Hello,
My use case is counting successful and failed logins within 1 hour, 10 min, 5 min, 3 min, 1 min, 10 seconds and 1 second, keyed by login IP or device ID. Based on the counts across these time dimensions, we predict the compliance of the next login. After various attempts, I chose sliding windows for the counting, e.g. a 1 hour window with a 1 min slide, a 10 min window with a 10 second slide, a 5 min window with a 5 second slide, and so on. In addition, I use RocksDB as the state backend and have checkpointing enabled. I have now run into some problems.

1. The RES memory of every TaskManager process keeps increasing and never stabilizes, until the process is killed without any OOM exception in the log.
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1520/memory_usage.png>
After several tests, I found that the memory growth is related to the key (IP or device ID). If the key values stay within a fixed range, process memory is stable. If the key values change randomly, memory keeps increasing. In our case the login IP and device ID keys are effectively random. We also found that logins drop off after midnight, and memory is briefly stable then, but it increases again during the day. A job I started 15 days ago is still growing in memory. Is it normal for memory to increase when the keys change randomly?

2. RocksDB seems to take up a lot of memory. If I switch from RocksDB to the file system state backend, memory usage drops to around 30%. If no limit is configured, will the memory used by RocksDB keep increasing indefinitely?

3. Some TaskManagers in the Flink cluster run no tasks (no slots in use), yet their memory also increases linearly after the job has run for several days. What are they using memory for? I have no idea.
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1520/memory_usage2.png>

I hope for your reply. Thank you.

-- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
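To make the window arithmetic in this setup concrete, here is a minimal sketch in plain Java (no Flink dependency). It assumes the usual sliding-window behaviour where each event is assigned to size / slide overlapping windows, so every active key holds state for that many windows at once. The class and method names are hypothetical and only illustrate the counting; this is not the job's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class SlidingWindowSketch {

    // Number of overlapping sliding windows that contain any single timestamp,
    // assuming the window size is a multiple of the slide.
    static long windowsPerElement(long sizeMs, long slideMs) {
        return sizeMs / slideMs;
    }

    // Start timestamps of every window [start, start + size) containing timestampMs,
    // assuming windows are aligned to multiples of the slide (offset 0)
    // and timestampMs is non-negative.
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestampMs - (timestampMs % slideMs);
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // The three size/slide pairs spelled out in the post, in milliseconds.
        long[][] configs = {
            {3_600_000L, 60_000L}, // 1 hour window, 1 min slide
            {600_000L, 10_000L},   // 10 min window, 10 second slide
            {300_000L, 5_000L}     // 5 min window, 5 second slide
        };
        for (long[] c : configs) {
            System.out.println("size=" + c[0] + " ms, slide=" + c[1]
                + " ms, windows per event=" + windowsPerElement(c[0], c[1]));
        }
    }
}
```

Under that assumption, each of the three listed configurations puts one event into 60 windows, so a key seen only once still leaves per-window state behind until those windows fire and are cleaned up. That is at least consistent with memory following the number of distinct keys rather than the event rate alone.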
I have run into a similar issue. YARN kills the TaskManagers as their memory usage grows to the limit. I think RocksDB might be causing the problem. Is there any way to debug the memory usage of the RocksDB backend?
Best,
Yan

From: YennieChen88 <[hidden email]>
Sent: Wednesday, August 29, 2018 6:14:11 AM
To: [hidden email]
Subject: Taskmanager process memory increasing always
As far as I know, RocksDB mainly uses off-heap memory, which is hard for the JVM to control. Maybe you can monitor the off-heap memory of the TaskManager process with a profiling tool such as gperftools.
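As a rough first check before reaching for a native profiler, one can compare the resident set size of a process with the heap the JVM has committed. The sketch below is a minimal, Linux-only illustration that reads `VmRSS` from `/proc/self/status` for its own process; the class and method names are hypothetical. For a TaskManager you would read `/proc/<pid>/status` for its pid instead. The difference is only an approximation of non-heap usage, since it also includes metaspace, thread stacks and direct buffers, not just native allocations such as RocksDB's.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class ProcessMemorySketch {

    // Extracts the numeric kB value from a line such as "VmRSS:    123456 kB".
    // Returns -1 if the line contains no digits.
    static long parseKb(String line) {
        String digits = line.replaceAll("[^0-9]", "");
        return digits.isEmpty() ? -1L : Long.parseLong(digits);
    }

    // Resident set size of this process in kB, or -1 if /proc is unavailable
    // (for example on a non-Linux system).
    static long residentKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1L;
        }
        try {
            for (String line : Files.readAllLines(status)) {
                if (line.startsWith("VmRSS:")) {
                    return parseKb(line);
                }
            }
        } catch (IOException e) {
            return -1L;
        }
        return -1L;
    }

    // Heap currently committed by the JVM, in kB.
    static long heapCommittedKb() {
        return Runtime.getRuntime().totalMemory() / 1024;
    }

    public static void main(String[] args) {
        long rss = residentKb();
        long heap = heapCommittedKb();
        System.out.println("RSS (kB): " + rss);
        System.out.println("Committed heap (kB): " + heap);
        if (rss >= 0) {
            // Approximate: everything resident that is not committed Java heap.
            System.out.println("Approx. non-heap (kB): " + (rss - heap));
        }
    }
}
```

If the non-heap figure keeps growing while the heap stays flat, that points toward native memory, which is where a tool like gperftools becomes useful.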