Hello,

I'm reading the docs/blog on incremental checkpoints and it says:

> You can also no longer delete old checkpoints as newer checkpoints need
> them, and the history of differences between checkpoints can grow
> indefinitely over time. You need to plan for larger distributed storage
> to maintain the checkpoints and the network overhead to read from it.

I'm wondering why this would be true, though. It says earlier that incremental checkpoints compact, so why would the history grow indefinitely?

Thanks!

--
Rex Fenley | Software Engineer - Mobile and Backend
Hi Rex,

As per my understanding, RocksDB compacts data in multiple levels, and files that have not been compacted recently remain in older checkpoint directories, while the current checkpoint holds references to them. There is no straightforward way to identify these references and safely clear the older checkpoint directories.

What we do instead, to avoid an ever-increasing checkpoint directory size, is to periodically stop the job with a savepoint, clear the checkpoints directory, and restart the job from that savepoint.

Thanks,
Akshay Aggarwal

On Tue, Nov 10, 2020 at 10:55 PM Rex Fenley <[hidden email]> wrote:
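The file-reference behaviour described above can be sketched with a toy Python model. This is illustrative only, not Flink's or RocksDB's actual code: each incremental checkpoint uploads only SST files it has not seen before and re-references the rest, so the newest checkpoint can still depend on files physically stored under much older checkpoint directories.

```python
# Toy model of incremental checkpoints (illustrative, not Flink's real code).
# Each checkpoint records, per live file, which checkpoint's directory
# actually stores that file ("owner").

checkpoints = []  # list of {"id": cp_id, "files": {filename: owner_cp_id}}

def take_incremental_checkpoint(cp_id, live_files):
    """Re-reference files already uploaded by earlier checkpoints;
    upload (own) only files never seen before."""
    previously_uploaded = {}
    for cp in checkpoints:
        previously_uploaded.update(cp["files"])
    files = {}
    for f in live_files:
        if f in previously_uploaded:
            files[f] = previously_uploaded[f]  # stored in an older directory
        else:
            files[f] = cp_id                   # uploaded into this checkpoint's dir
    checkpoints.append({"id": cp_id, "files": files})

# A large SST file at a deep compaction level ("level6.sst") survives
# compaction for a long time, while smaller files churn with every checkpoint.
take_incremental_checkpoint(1, ["level6.sst", "a.sst"])
take_incremental_checkpoint(2, ["level6.sst", "b.sst"])
take_incremental_checkpoint(3, ["level6.sst", "c.sst"])

current = checkpoints[-1]
dirs_needed = set(current["files"].values())
print(dirs_needed)  # → {1, 3}: checkpoint 3 still needs checkpoint 1's directory
```

Deleting checkpoint 1's directory here would corrupt checkpoint 3, even though checkpoint 1 itself is long obsolete. That is exactly why the older directories cannot be blindly cleaned up.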
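The periodic savepoint workaround can be scripted along these lines with the Flink CLI. The job id, bucket paths, savepoint name, and jar name below are placeholders, and the cleanup command assumes S3-backed storage; adapt all of them to your setup:

```shell
# Stop the job, taking a savepoint first (a full, self-contained snapshot).
flink stop --savepointPath s3://bucket/savepoints <jobId>

# With the job stopped and a savepoint on hand, the accumulated
# incremental checkpoint directory can be cleared.
aws s3 rm --recursive s3://bucket/checkpoints/<jobId>/

# Restart from the savepoint; checkpoint history starts fresh.
flink run -s s3://bucket/savepoints/savepoint-xxxx ./my-job.jar
```

Because a savepoint is self-contained, the restarted job has no references into the old checkpoint directories, so deleting them is safe.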