Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?


Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

SmileSmile

Hi,

My job runs on Flink 1.10.1 with event time. Container memory usage rises by about 2 GB after each restart, and after several restarts the pod is killed by the OS.

I can see that historical data is cleared when new data arrives: onEventTime() is called, which clears all state. But my job does not need checkpointing. When the job restarts, will the historical data be left in off-heap memory and never cleared?

This only happens when I use RocksDB; the heap backend is fine.

Can anyone help me with how to deal with this?
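The cleanup mechanism the question refers to can be sketched without Flink. Below is a self-contained simulation in plain Java (not the Flink API; the class and method names are illustrative) of the pattern: each key's state is registered together with an event-time cleanup timer, and a timer only fires when the watermark advances, i.e. when new data arrives.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Simulation (not Flink API) of event-time cleanup: state is cleared only
// when a registered timer fires, and timers fire only when the watermark
// advances past their timestamp.
public class TimerCleanupSim {
    private final Map<String, String> state = new HashMap<>();    // keyed state
    private final TreeMap<Long, String> timers = new TreeMap<>(); // cleanup time -> key

    public void onElement(String key, String value, long cleanupAt) {
        state.put(key, value);
        timers.put(cleanupAt, key);  // register an event-time cleanup timer
    }

    // Called when the watermark advances; fires all due timers.
    public void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            Map.Entry<Long, String> timer = timers.pollFirstEntry();
            state.remove(timer.getValue());  // the "onEventTime -> clear state" step
        }
    }

    public int stateSize() { return state.size(); }

    public static void main(String[] args) {
        TimerCleanupSim op = new TimerCleanupSim();
        op.onElement("a", "v1", 100L);
        op.onElement("b", "v2", 200L);
        op.advanceWatermark(150L);          // only key "a" is due, so it is cleaned
        System.out.println(op.stateSize()); // prints 1
    }
}
```

The sketch also illustrates the worry in the question: if state somehow survived a restart but the timers were not restored along with it, nothing would re-register them, so advancing the watermark would never clear that state.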



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Yun Tang
Hi

If your job does not need checkpoints, why would you still restore it from a checkpoint?

Actually, I did not fully understand what you want. Are you afraid that the state restored from the last checkpoint would not be cleared? Since the event timers are also stored in the checkpoint, after you restore from it the event-time windows will still be triggered to clean up the historical state.

In the end, I think you just want to know why the pod is killed after some time. Please consider increasing the process memory, and the JVM overhead in particular, to provide more buffer space for native memory usage [1]. Since Flink 1.10, RocksDB stably uses 100% of the managed memory, and once anything needs some extra memory the pod might be treated as out of memory and be killed.
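For reference, the options mentioned here map to flink-conf.yaml entries like the following (a sketch for Flink 1.10; the concrete sizes are placeholders, not recommendations):

```yaml
# Total size of the TaskManager process; the container limit should match this.
taskmanager.memory.process.size: 4096m

# Extra headroom for native allocations outside Flink's own accounting
# (thread stacks, JIT code cache, RocksDB allocations beyond its budget, etc.).
taskmanager.memory.jvm-overhead.min: 512m
taskmanager.memory.jvm-overhead.max: 1024m

# Fraction of Flink memory handed to managed memory; with the RocksDB
# backend, RocksDB is budgeted to use all of it.
taskmanager.memory.managed.fraction: 0.4
```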


Best
Yun Tang

From: SmileSmile <[hidden email]>
Sent: Friday, July 3, 2020 14:01
To: '[hidden email]' <[hidden email]>
Subject: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?
 




Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

SmileSmile
Hi Yun Tang,

I do not enable checkpointing, so when my job restarts, how does Flink clean up the historical state?

My pod only gets killed after the job has restarted again and again; when that happens, I have to rebuild the Flink cluster.





On 07/03/2020 14:22, [hidden email] wrote:



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Yun Tang
Hi

If you do not enable checkpointing, have you ever restored the new job from a checkpoint at all? As I said, the timers are restored together with the state, and the event-time timers would still fire, so onEventTime() would still be called to clean up the historical data.

As for the second question: why does your job restart again and again? I think that problem should be looked into first.

Best
Yun Tang

From: SmileSmile <[hidden email]>
Sent: Friday, July 3, 2020 14:30
To: Yun Tang <[hidden email]>
Cc: '[hidden email]' <[hidden email]>
Subject: Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?
 



Re: Checkpointing is disabled: will historical data in RocksDB leak when the job restarts?

Congxian Qiu
Hi SmileSmile

As for the OOM problem, maybe you can capture a memory dump before the OOM happens; with the dump, you can see which component is consuming more memory than expected.

Best,
Congxian
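One way to capture such a dump, assuming a HotSpot JVM: run `jmap -dump:live,format=b,file=heap.hprof <pid>` against the TaskManager process, or trigger it programmatically via the HotSpotDiagnosticMXBean. A minimal sketch (the class name HeapDumper is illustrative):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Write an hprof heap dump of the current JVM to `path`.
    // live = true dumps only reachable objects (forces a GC first).
    // Note: dumpHeap() fails if a file already exists at `path`.
    public static File dump(String path, boolean live) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, live);
        return new File(path);
    }

    public static void main(String[] args) throws Exception {
        File dump = dump("taskmanager-heap.hprof", true);
        System.out.println("wrote " + dump.length() + " bytes");
    }
}
```

Keep in mind that an hprof dump only covers the Java heap; RocksDB's memory is native and will not appear in it. For the off-heap side, `jcmd <pid> VM.native_memory summary` (with `-XX:NativeMemoryTracking` enabled on the JVM) or container-level RSS metrics are more informative.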


Yun Tang <[hidden email]> wrote on Friday, July 3, 2020 at 15:04: