(DEPRECATED) Apache Flink User Mailing List archive.

Question about RocksDB performance tuning

Peter Huang

Question about RocksDB performance tuning

Hi,

I have a stateful Flink job that handles 500k QPS. The job counts messages per composite key in 10-minute tumbling windows. With the memory state backend, the job runs without lag but periodically fails with OOM. When I switch to the RocksDB state backend, it shows high Kafka lag even after memory tuning. The QPS is also growing very fast. Is there good guidance on performance tuning of the RocksDB state backend for this kind of high-QPS job?

Best Regards

Peter Huang

Yun Tang

Re: Question about RocksDB performance tuning

Hi Peter

This is a general problem. You could refer to RocksDB's tuning guides [1][2], and also to Flink's built-in PredefinedOptions.java [3].

Generally speaking, increase the write buffer size to reduce write amplification, and if you find an IO bottleneck, increase the parallelism of the keyed operator to spread the pressure across more disks. Adding a Bloom filter is a good way to reduce the cost of read amplification. Using high-performance disks would also help a lot.
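As a rough illustration of the suggestions above, a flink-conf.yaml fragment along the following lines could express them. This is only a sketch: the sizes, the buffer count, and the local directory path are assumptions chosen for illustration, not values recommended in this thread, and the option keys should be checked against the documentation of the Flink version in use.

```yaml
# Illustrative flink-conf.yaml fragment; all values are assumptions, not tuned recommendations.
state.backend: rocksdb

# One of the profiles defined in PredefinedOptions.java [3].
# To my knowledge the HIGH_MEM profile also enables a Bloom filter.
state.backend.rocksdb.predefined-options: SPINNING_DISK_OPTIMIZED_HIGH_MEM

# Larger and more numerous write buffers (memtables), aiming at lower write amplification.
state.backend.rocksdb.writebuffer.size: 128mb
state.backend.rocksdb.writebuffer.count: 4

# Keep RocksDB's working directory on a fast local disk (hypothetical path).
state.backend.rocksdb.localdir: /mnt/fast-disk/flink/rocksdb
```

The parallelism of the keyed operator is set in the job itself rather than here. Depending on the Flink version, RocksDB memory may also be bounded by Flink's managed memory, so the write buffer settings may not take full effect on their own.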

[1] https://github.com/facebook/rocksdb/wiki/Setup-Options-and-Basic-Tuning

[2] https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide

[3] https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java

Best

Yun Tang

From: Peter Huang <[hidden email]>
Sent: Friday, July 3, 2020 13:31
To: user <[hidden email]>
Subject: Question about RocksDB performance tuning

Re: Question about RocksDB performance tuning

Hi Yun,

Thanks for the info. These materials help a lot.

Best Regards

Peter Huang

On Thu, Jul 2, 2020 at 11:36 PM Yun Tang <[hidden email]> wrote:
