Recommendation about RocksDB Metrics ?

Kien Truong
Hi all,

We are thinking about enabling RocksDB metrics to better monitor our pipeline. However, since they will have a performance impact, we will have to be selective about which metrics we enable.

Does anyone have experience with which metrics are more important than others?

And which metrics have the largest performance impact?

Thanks,
Kien

Re: Recommendation about RocksDB Metrics ?

r_khachatryan
Hi Kien,

I am pulling in Yun who might know better.

Regards,
Roman


(In reply to the message from Truong Duc Kien <[hidden email]> of Sun, Dec 6, 2020 at 3:52 AM.)

Re: Recommendation about RocksDB Metrics ?

Steven Wu
Just a data point: we actually enabled all RocksDB metrics by default (including for very large jobs in terms of parallelism and state size). We didn't see any significant performance impact. There is probably a small impact, but at least it didn't jump out for our workload.

(In reply to the message from Khachatryan Roman <[hidden email]> of Tue, Dec 8, 2020 at 9:00 AM.)

Re: Recommendation about RocksDB Metrics ?

Yun Tang
Hi Kien,

From my point of view, RocksDB native metrics can be classified into the 5 groups below, and you can enable whichever ones you're interested in. Enabling those metrics could cause about a 10% performance regression; how much this affects overall performance depends on the job, as not all jobs are bottlenecked on state access.

Performance related:
state.backend.rocksdb.metrics.actual-delayed-write-rate
state.backend.rocksdb.metrics.is-write-stopped

Compaction and flush related, which affect memory usage and write stalls:
state.backend.rocksdb.metrics.mem-table-flush-pending
state.backend.rocksdb.metrics.num-running-flushes
state.backend.rocksdb.metrics.compaction-pending
state.backend.rocksdb.metrics.num-running-compactions
 
Memory usage status:
state.backend.rocksdb.metrics.block-cache-usage  (if Flink's managed memory for RocksDB is enabled, this value is the same for all column families in the same slot)
state.backend.rocksdb.metrics.cur-size-all-mem-tables

DB static properties:
state.backend.rocksdb.metrics.block-cache-capacity

DB number of keys and data usage:
state.backend.rocksdb.metrics.estimate-live-data-size
state.backend.rocksdb.metrics.total-sst-files-size

BTW, state.backend.rocksdb.metrics.column-family-as-variable is not a RocksDB internal metric; it is an option that exposes the column family as a variable, so that we can tell the status of different states apart.
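
As a rough sketch (not part of the original reply), the grouping above could be turned into a flink-conf.yaml fragment with a small Python script. It assumes each of these options is a boolean that is enabled by setting it to `true`; the group labels and the helper name `render_flink_conf` are made up for illustration and are not Flink identifiers.

```python
# Option keys are copied from the list above. The group labels are
# illustrative only; Flink itself does not know about them.
PREFIX = "state.backend.rocksdb.metrics."

METRIC_GROUPS = {
    "performance": ["actual-delayed-write-rate", "is-write-stopped"],
    "compaction_flush": [
        "mem-table-flush-pending",
        "num-running-flushes",
        "compaction-pending",
        "num-running-compactions",
    ],
    "memory": ["block-cache-usage", "cur-size-all-mem-tables"],
    "static": ["block-cache-capacity"],
    "data_size": ["estimate-live-data-size", "total-sst-files-size"],
}


def render_flink_conf(groups, column_family_as_variable=False):
    """Return flink-conf.yaml lines enabling the metrics in the given groups.

    Assumption: each option is a boolean switched on with `true`.
    """
    lines = []
    for group in groups:
        for name in METRIC_GROUPS[group]:  # raises KeyError for an unknown group
            lines.append(f"{PREFIX}{name}: true")
    if column_family_as_variable:
        # Not a metric itself; exposes the column family as a variable.
        lines.append(f"{PREFIX}column-family-as-variable: true")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: only watch write stalls and memory usage.
    print(render_flink_conf(["performance", "memory"],
                            column_family_as_variable=True))
```

Starting with one or two groups and adding more only if the overhead stays acceptable keeps the selection easy to review.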

Best
Yun Tang

(In reply to the message from Steven Wu <[hidden email]> of Wednesday, December 9, 2020 12:11.)