Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?


Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

SmileSmile

Hi,

My job runs on Flink 1.10.1 with event time. Container memory usage rises by about 2 GB after each restart, and after several restarts the pod is killed by the OS.

I can see that historical data is cleared when new data arrives: onEventTime() is called, which clears all state. But my job does not need checkpointing. When the job restarts, will the historical data be left in off-heap memory and never cleared?

This only happens when I use RocksDB; the heap backend is fine.

Can anyone help me with how to deal with this?
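The cleanup mechanism the question refers to can be sketched without Flink. Below is a self-contained simulation in plain Java (not the Flink API; the class and method names are illustrative) of the pattern: each key's state is registered together with an event-time cleanup timer, and a timer only fires when the watermark advances, i.e. when new data arrives.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Simulation (not Flink API) of event-time cleanup: state is cleared only
// when a registered timer fires, and timers fire only when the watermark
// advances past their timestamp.
public class TimerCleanupSim {
    private final Map<String, String> state = new HashMap<>();    // keyed state
    private final TreeMap<Long, String> timers = new TreeMap<>(); // cleanup time -> key

    public void onElement(String key, String value, long cleanupAt) {
        state.put(key, value);
        timers.put(cleanupAt, key);  // register an event-time cleanup timer
    }

    // Called when the watermark advances; fires all due timers.
    public void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            Map.Entry<Long, String> timer = timers.pollFirstEntry();
            state.remove(timer.getValue());  // the "onEventTime -> clear state" step
        }
    }

    public int stateSize() { return state.size(); }

    public static void main(String[] args) {
        TimerCleanupSim op = new TimerCleanupSim();
        op.onElement("a", "v1", 100L);
        op.onElement("b", "v2", 200L);
        op.advanceWatermark(150L);          // only key "a" is due, so it is cleaned
        System.out.println(op.stateSize()); // prints 1
    }
}
```

The sketch also illustrates the worry in the question: if state somehow survived a restart but the timers were not restored along with it, nothing would re-register them, so advancing the watermark would never clear that state.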



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Yun Tang
Hi

If your job does not need checkpoints, why would you still restore it from a checkpoint?

Actually, I did not fully understand what you want. Are you afraid that the state restored from the last checkpoint would not be cleared? Since the event timers are also stored in the checkpoint, after you restore from it the event-time windows will still be triggered to clean up the historical state.

In the end, I think you just want to know why the pod is killed after some time. Please consider increasing the process memory, and the JVM overhead in particular, to provide more buffer space for native memory usage [1]. Since Flink 1.10, RocksDB stably uses 100% of the managed memory, and once anything needs some extra memory the pod might be treated as out of memory and be killed.
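For reference, the options mentioned here map to flink-conf.yaml entries like the following (a sketch for Flink 1.10; the concrete sizes are placeholders, not recommendations):

```yaml
# Total size of the TaskManager process; the container limit should match this.
taskmanager.memory.process.size: 4096m

# Extra headroom for native allocations outside Flink's own accounting
# (thread stacks, JIT code cache, RocksDB allocations beyond its budget, etc.).
taskmanager.memory.jvm-overhead.min: 512m
taskmanager.memory.jvm-overhead.max: 1024m

# Fraction of Flink memory handed to managed memory; with the RocksDB
# backend, RocksDB is budgeted to use all of it.
taskmanager.memory.managed.fraction: 0.4
```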


Best
Yun Tang

From: SmileSmile <[hidden email]>
Sent: Friday, July 3, 2020 14:01
To: '[hidden email]' <[hidden email]>
Subject: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?
 




Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

SmileSmile
Hi Yun Tang,

I do not enable checkpointing, so when my job restarts, how does Flink clean up the historical state?

My pod only gets killed after the job has restarted again and again; when that happens, I have to rebuild the Flink cluster.





On 07/03/2020 14:22, [hidden email] wrote:



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Yun Tang
Hi

If you do not enable checkpointing, have you ever restored the new job from a checkpoint at all? As I said, the timers are restored together with the state, and the event-time timers would still fire, so onEventTime() would still be called to clean up the historical data.

As for the second question: why does your job restart again and again? I think that problem should be looked into first.

Best
Yun Tang

From: SmileSmile <[hidden email]>
Sent: Friday, July 3, 2020 14:30
To: Yun Tang <[hidden email]>
Cc: '[hidden email]' <[hidden email]>
Subject: Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?
 



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Congxian Qiu
Hi SmileSmile

As for the OOM problem, maybe you can capture a memory dump before the OOM happens; with the dump, you can see which component is consuming more memory than expected.

Best,
Congxian
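One way to capture such a dump, assuming a HotSpot JVM: run `jmap -dump:live,format=b,file=heap.hprof <pid>` against the TaskManager process, or trigger it programmatically via the HotSpotDiagnosticMXBean. A minimal sketch (the class name HeapDumper is illustrative):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Write an hprof heap dump of the current JVM to `path`.
    // live = true dumps only reachable objects (forces a GC first).
    // Note: dumpHeap() fails if a file already exists at `path`.
    public static File dump(String path, boolean live) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, live);
        return new File(path);
    }

    public static void main(String[] args) throws Exception {
        File dump = dump("taskmanager-heap.hprof", true);
        System.out.println("wrote " + dump.length() + " bytes");
    }
}
```

Keep in mind that an hprof dump only covers the Java heap; RocksDB's memory is native and will not appear in it. For the off-heap side, `jcmd <pid> VM.native_memory summary` (with `-XX:NativeMemoryTracking` enabled on the JVM) or container-level RSS metrics are more informative.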


Yun Tang <[hidden email]> wrote on Friday, July 3, 2020 at 15:04: