Rocksdb - Incremental vs full checkpoints


sudranga
Hi,
I have an event-window pipeline that handles a fixed number of messages per
second for a fixed number of keys. When I use RocksDB as the state backend
with incremental checkpoints, I see the delta checkpoint size constantly
increase. Please see
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t2790/Screen_Shot_2020-10-13_at_6.png>

I turned off incremental checkpoints and all the checkpoints are 64 KB (there
appears to be no state leak in user code or otherwise). It is not clear why
the incremental checkpoints keep increasing in size. Perhaps the
incremental checkpoints are not incremental (for this small state size) and
are simply full state appended to full state, and so on...

From some posts on this forum, I understand that incremental checkpoints are
designed for cases where the state size is fairly large (GBs to TBs) and
the changes in state are minimal across checkpoints. However, does
this mean that we should not enable incremental checkpointing for use cases
where the state size is much smaller? Would the constantly increasing
snapshot delta size reduce at some point? I don't see any compaction runs
happening
(taskmanager_job_task_operator_column_family_rocksdb.num-running-compactions).
Not sure if that is what I am missing...

Thanks
Sudharsan



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Rocksdb - Incremental vs full checkpoints

Yun Tang
Hi

The difference in data size between incremental and full checkpoints is due to their different implementations.
The incremental checkpoint strategy uploads binary SST files, while the full checkpoint strategy scans the DB and writes all key-value entries to the external DFS.

As your state size is really small (only 200 KB), I think your RocksDB has never triggered a compaction to reduce the SST files, which is why the size constantly increases.
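
To make the contrast above concrete, here is a minimal toy sketch in Python. It is not Flink or RocksDB code: the function names (`flush`, `compact`, `simulate`, and so on) are hypothetical, sizes are counted in entries rather than bytes, and it assumes one flush per checkpoint with every key overwritten once per interval. Under those assumptions, a full snapshot stays at the number of live keys, while the SST files keep holding overwritten versions until a compaction merges them.

```python
# Toy model only: sizes are counted in entries, not bytes, and real RocksDB
# flush and compaction behaviour is far more involved.

def flush(memtable):
    """Freeze the in-memory writes into an immutable 'SST file' (a dict copy)."""
    return dict(memtable)

def compact(sst_files):
    """Merge all files into one, keeping only the newest value per key."""
    merged = {}
    for sst in sst_files:  # oldest to newest, so later writes win
        merged.update(sst)
    return [merged]

def full_snapshot_size(sst_files):
    """A full checkpoint scans the DB and writes each live key once."""
    live = {}
    for sst in sst_files:
        live.update(sst)
    return len(live)

def sst_footprint(sst_files):
    """Entries held across all SST files, including overwritten versions."""
    return sum(len(sst) for sst in sst_files)

def simulate(num_keys, num_checkpoints):
    """Overwrite every key once per interval and flush at each checkpoint."""
    sst_files = []
    history = []
    for checkpoint in range(1, num_checkpoints + 1):
        memtable = {f"key-{k}": checkpoint for k in range(num_keys)}
        sst_files.append(flush(memtable))
        history.append((full_snapshot_size(sst_files), sst_footprint(sst_files)))
    return sst_files, history

sst_files, history = simulate(num_keys=4, num_checkpoints=5)
for n, (full, footprint) in enumerate(history, start=1):
    print(f"checkpoint {n}: full snapshot = {full} entries, SST files hold {footprint}")

compacted = compact(sst_files)
print(f"after compaction: SST files hold {sst_footprint(compacted)} entries")
```

In this model the full snapshot stays at 4 entries for every checkpoint, while the SST files grow from 4 to 20 entries and drop back to 4 only once `compact` runs.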

Best
Yun Tang

From: sudranga <[hidden email]>
Sent: Wednesday, October 14, 2020 10:40
To: [hidden email] <[hidden email]>
Subject: Rocksdb - Incremental vs full checkpoints
 
Re: Rocksdb - Incremental vs full checkpoints

sudranga
Hi Yun,
Sorry for the late reply - I was doing some reading. As far as I understand, when incremental checkpointing is enabled, the reported checkpoint size (metrics/UI) is only the size of the deltas and not the full state size. I understand that compaction may not get triggered. But if we are creating a fixed amount of state every checkpoint interval, shouldn't the reported checkpoint size remain the same (as it is a delta)?


Thanks

Sudharsan


In reply to Yun Tang's message of Tue, Oct 13, 2020 at 11:34 PM.
Re: Rocksdb - Incremental vs full checkpoints

Yun Tang
Hi Sudharsan

Once incremental checkpointing is enabled, the delta size equals the size of the newly uploaded SST files, which might not always be the same given RocksDB's compression ratio, compaction timing, and flush timing. If you really want to check the details, you could log in to the machine and locate the state directory to see how the SST files are stored for each checkpoint when local recovery is enabled [1].
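
As a rough illustration of the kind of inspection described above, the Python sketch below lists the `.sst` files in per-checkpoint directories and sums the sizes of the files that were not present in the previous checkpoint. Everything here is an assumption made for illustration: the directory names (`chk-1`, `chk-2`, `chk-3`), the file names and sizes, and the helpers `sst_sizes`, `delta_bytes`, and `write_sst` are hypothetical, and the actual on-disk layout of a Flink state directory may differ.

```python
import tempfile
from pathlib import Path

def sst_sizes(directory):
    """Map each .sst file name under `directory` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(directory).rglob("*.sst")}

def delta_bytes(previous, current):
    """Sum the sizes of files in `current` whose names are not in `previous`."""
    return sum(size for name, size in current.items() if name not in previous)

def write_sst(directory, name, num_bytes):
    """Create a placeholder file; a stand-in for a real SST file."""
    (Path(directory) / name).write_bytes(b"\0" * num_bytes)

# Made-up layout: one directory per checkpoint, with invented file sizes.
layout = {
    "chk-1": {"000001.sst": 100},
    "chk-2": {"000001.sst": 100, "000002.sst": 120},
    # Assumed compaction: files 1 and 2 were merged into file 3 before chk-3.
    "chk-3": {"000003.sst": 150, "000004.sst": 90},
}

deltas = []
with tempfile.TemporaryDirectory() as root:
    previous = {}
    for checkpoint, files in layout.items():
        checkpoint_dir = Path(root) / checkpoint
        checkpoint_dir.mkdir()
        for name, num_bytes in files.items():
            write_sst(checkpoint_dir, name, num_bytes)
        current = sst_sizes(checkpoint_dir)
        deltas.append(delta_bytes(previous, current))
        previous = current

for checkpoint, delta in zip(layout, deltas):
    print(f"{checkpoint}: {delta} bytes in new SST files")
```

In this made-up layout the third delta is larger than the first two even though the amount of newly written state is similar, because the file produced by the assumed compaction is new and counts toward the delta as well.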


Best
Yun Tang

From: Sudharsan R <[hidden email]>
Sent: Monday, October 26, 2020 10:38
To: Yun Tang <[hidden email]>
Cc: [hidden email] <[hidden email]>
Subject: Re: Rocksdb - Incremental vs full checkpoints
 