Hi all,

We are thinking about enabling RocksDB metrics to better monitor our pipeline. However, since they will have a performance impact, we will have to be selective about which metrics we use.

Does anyone have experience with which metrics are more important than the others? And which metrics have the largest performance impact?

Thanks,
Kien
Hi Kien,

I am pulling in Yun, who might know better.

Regards,
Roman

On Sun, Dec 6, 2020 at 3:52 AM Truong Duc Kien <[hidden email]> wrote:
Just a data point: we actually enabled all RocksDB metrics by default (including for very large jobs in terms of parallelism and state size). We didn't see any significant performance impact. There is probably a small impact, but at least it didn't jump out for our workload.

On Tue, Dec 8, 2020 at 9:00 AM Khachatryan Roman <[hidden email]> wrote:
Hi Kien,

From my point of view, the RocksDB native metrics can be classified into the 5 groups below, and you can enable whichever ones you're interested in. Enabling these metrics could cause about a 10% performance regression, though how much this affects overall performance depends on the job, since not all jobs are bottlenecked on state access.

1. Performance related:
   state.backend.rocksdb.metrics.actual-delayed-write-rate
   state.backend.rocksdb.metrics.is-write-stopped

2. Compaction & flush related, which will impact the memory usage and write stall:
   state.backend.rocksdb.metrics.mem-table-flush-pending
   state.backend.rocksdb.metrics.num-running-flushes
   state.backend.rocksdb.metrics.compaction-pending
   state.backend.rocksdb.metrics.num-running-compactions

3. Memory usage status:
   state.backend.rocksdb.metrics.block-cache-usage (if Flink's managed memory over RocksDB is enabled, this value would be the same for all column families in the same slot)
   state.backend.rocksdb.metrics.cur-size-all-mem-tables

4. DB static properties:
   state.backend.rocksdb.metrics.block-cache-capacity

5. DB number of keys and data usage:
   state.backend.rocksdb.metrics.estimate-live-data-size
   state.backend.rocksdb.metrics.total-sst-files-size

BTW, state.backend.rocksdb.metrics.column-family-as-variable is not a RocksDB internal metric; it exposes the column family as a variable so that we can distinguish the status of different states.
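
As a rough illustration of the selection above, a flink-conf.yaml fragment that turns on a few of these metrics might look like the following. This is only a sketch: it assumes each option is a boolean that is off by default and is switched on with `true`, and the particular subset chosen here (write stall, compaction/flush, memory usage) is just an example, not a recommendation for every job.

```yaml
# Sketch only: enable a subset of the RocksDB native metrics listed above.
# The subset is an illustrative choice; pick what matters for your job.

# Write stall indicators
state.backend.rocksdb.metrics.actual-delayed-write-rate: true
state.backend.rocksdb.metrics.is-write-stopped: true

# Compaction and flush backlog
state.backend.rocksdb.metrics.mem-table-flush-pending: true
state.backend.rocksdb.metrics.compaction-pending: true

# Memory usage
state.backend.rocksdb.metrics.block-cache-usage: true
state.backend.rocksdb.metrics.cur-size-all-mem-tables: true

# Not a RocksDB metric: expose the column family as a variable
state.backend.rocksdb.metrics.column-family-as-variable: true
```

Starting with a small set like this and adding more only when needed keeps any overhead limited to the metrics you actually watch.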
Best
Yun Tang
From: Steven Wu <[hidden email]>
Sent: Wednesday, December 9, 2020 12:11
To: Khachatryan Roman <[hidden email]>
Cc: Truong Duc Kien <[hidden email]>; Yun Tang <[hidden email]>; user <[hidden email]>
Subject: Re: Recommendation about RocksDB Metrics ?