Re: Large state RocksDb backend increases app start time

Posted by Yun Tang on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Large-state-RocksDb-backend-increases-app-start-time-tp38720p38735.html

Hi Arpith,

If you restore RocksDB state from a savepoint, the restore phase actually inserts the original binary key-value pairs into an empty RocksDB instance, which can be slow when the state is large. There have been several discussions about optimizing this phase [1] [2].

If you want to work around this issue quickly, you could use incremental checkpoints to restore RocksDB state, since a restore then simply opens the DB with the existing SST files instead of re-loading all the data. Moreover, RocksDB incremental checkpoints currently also support changing the job's parallelism.
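As a minimal sketch of the work-around above (the checkpoint path and interval here are placeholders, not values from the thread): enabling incremental checkpoints is just the second constructor argument of RocksDBStateBackend, the same constructor already used in the original message below.

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalCheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Second argument = true enables incremental checkpoints, so a
        // restore can reopen the existing SST files instead of re-inserting
        // every key-value pair into an empty RocksDB instance.
        // "hdfs:///flink/checkpoints" is an example path, not from the thread.
        RocksDBStateBackend backend =
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true);
        env.setStateBackend(backend);

        // Checkpointing must be enabled for incremental checkpoints to be
        // taken at all; 60 s is an arbitrary example interval.
        env.enableCheckpointing(60_000);
    }
}
```

Note that this only helps when the job is resumed from a retained (incremental) checkpoint rather than from a savepoint.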

[1] https://issues.apache.org/jira/browse/FLINK-17971
[2] https://issues.apache.org/jira/browse/FLINK-17288

Best
Yun Tang

From: Arpith P <[hidden email]>
Sent: Thursday, October 15, 2020 0:50
To: user <[hidden email]>
Subject: Large state RocksDb backend increases app start time
 
Hi,

I'm currently storing around 70GB of data in map state backed by the RocksDB backend. After restoring the application from a savepoint, it takes more than 4 minutes to start processing events. How can I speed this up, or is there another recommended approach?

I'm using the following predefined options with RocksDB.
RocksDBStateBackend backend = new RocksDBStateBackend(checkpointDir, incrementalCheckpoints);
backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);

Thanks,
Arpith