Measuring the Size of State, Savepoint Size vs. Restore time

Kevin Lam
Hi all,

We're interested in doing some analysis on how the size of our savepoints and state affects the time it takes to restore from a savepoint. We're running Flink 1.12 and using RocksDB as a state backend, on Kubernetes.

What is the best way to measure the size of a Flink Application's state? Is state.backend.rocksdb.metrics.total-sst-files-size the right thing to look at?

We looked at state.backend.rocksdb.metrics.total-sst-files-size for all our operators after restoring from a savepoint, and noticed that the sum of all the SST file sizes is far smaller than the total size of our savepoint (7GB vs 10TB). Where does that discrepancy come from?

Do you have any general advice on correlating savepoint size with restore times? 

Thanks in advance!

Re: Measuring the Size of State, Savepoint Size vs. Restore time

Guowei Ma
Hi, Kevin

If you use RocksDB and want to know how much data is on disk, I think that is the right metric. Note that the SST files might include some expired data, and some data still in memory is not yet included in the SST files. In general, though, I think it reflects the state size of your application.

I don't think there is a metric for the time spent restoring from a savepoint.

As for why there is such a huge difference between the SST size and the savepoint size, I think @Yun can give some detailed insights.

Best,
Guowei


On Thu, Apr 1, 2021 at 1:38 AM Kevin Lam <[hidden email]> wrote:

Re: Measuring the Size of State, Savepoint Size vs. Restore time

Yun Tang
Hi Kevin,

Currently, you can check the logs for when the restore starts and finishes [1] to see how much time is spent on the task side. Flink 1.13 also tries to expose the stages of task initialization [2], which may help you.
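
As a rough illustration of the log-based approach, here is a minimal Python sketch that computes the elapsed time between two log lines. It assumes each line starts with a timestamp of the form "yyyy-MM-dd HH:mm:ss,SSS" (adjust TS_FORMAT if your logging configuration differs). The message text in the sample lines is a placeholder, not actual Flink output, and the function names are made up for this example.

```python
from datetime import datetime

# Assumption: every log line begins with "yyyy-MM-dd HH:mm:ss,SSS".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # length of a timestamp such as "2021-04-01 01:40:00,000"


def line_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of a single log line."""
    return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)


def restore_seconds(start_line: str, end_line: str) -> float:
    """Seconds elapsed between the restore-start and restore-finish lines."""
    delta = line_timestamp(end_line) - line_timestamp(start_line)
    return delta.total_seconds()


# Hypothetical log lines; the message parts are placeholders.
start = "2021-04-01 01:40:00,000 INFO  <restore started message>"
end = "2021-04-01 01:41:35,500 INFO  <restore finished message>"

if __name__ == "__main__":
    print(restore_seconds(start, end))
```

Collecting this value per task across several restores, together with the savepoint size at the time, would give data points for correlating size with restore time.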


state.backend.rocksdb.metrics.total-sst-files-size should correctly describe the SST file size. There are several reasons why the savepoint can be larger than the SST files:
  1. SST files are compressed with Snappy by default, while the savepoint is not.
  2. SST files can save space when keys share the same prefix bytes.
  3. Some contents are still in the in-memory write buffer and have not yet been flushed.
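
To put the reported numbers in perspective, here is a small Python sketch of the arithmetic. The sizes come from the original question. ASSUMED_MAX_SHRINK_FACTOR is purely an illustrative assumption about how much compression and prefix sharing might shrink the data, not a measured value.

```python
GB = 10**9
TB = 10**12

savepoint_bytes = 10 * TB  # savepoint size reported in the thread
sst_bytes = 7 * GB         # summed total-sst-files-size reported in the thread

# How many times larger the savepoint is than the SST files.
ratio = savepoint_bytes / sst_bytes

# Illustrative assumption only: a generous combined factor for
# compression plus shared key prefixes.
ASSUMED_MAX_SHRINK_FACTOR = 10


def unexplained_factor(size_ratio: float, assumed_shrink: float) -> float:
    """Part of the size ratio left over after the assumed shrink factor."""
    return size_ratio / assumed_shrink


if __name__ == "__main__":
    print(f"savepoint / SST ratio: {ratio:.0f}x")
    print(f"left unexplained: {unexplained_factor(ratio, ASSUMED_MAX_SHRINK_FACTOR):.0f}x")
```

Even under that generous assumption, a gap of more than two orders of magnitude remains.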

However, the difference is really huge. Have you logged in to the machines holding the keyed state to see how much disk space is occupied? Also, what is the incremental checkpoint size of your job, and have you enabled TTL for state?
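
For checking disk usage on a machine, a minimal Python sketch like the following sums file sizes under a directory, similar in spirit to `du`. The function name and the optional suffix filter are made up for this example, and the directory you pass in (for instance, wherever your RocksDB local directory lives on the TaskManager) is something you need to look up for your own deployment.

```python
import os


def directory_size_bytes(root: str, suffix: str = "") -> int:
    """Sum the sizes of regular files under root, optionally filtered by suffix."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if suffix and not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted via the link.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    import sys

    # Usage: python dir_size.py /path/to/rocksdb/local/dir
    root_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    print("all files:", directory_size_bytes(root_dir))
    print(".sst only:", directory_size_bytes(root_dir, suffix=".sst"))
```

Comparing the ".sst only" figure with the metric, and the "all files" figure with the savepoint size, can show whether the metric or the savepoint is the outlier.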



Best
Yun Tang



From: Guowei Ma <[hidden email]>
Sent: Thursday, April 1, 2021 11:57
To: Kevin Lam <[hidden email]>
Cc: user <[hidden email]>; Yun Tang <[hidden email]>
Subject: Re: Measuring the Size of State, Savepoint Size vs. Restore time
 