I am trying to understand these parameters:

state.backend
state.backend.fs.checkpointdir
state.backend.rocksdb.checkpointdir
state.checkpoints.dir

As I understand it, the stream of data and the state of operators are two different concepts, and both need to be checkpointed. I am a bit confused about the purpose of these parameters and their applicability.
Hi,
the purpose of the configuration parameters is described in the documentation under https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/config.html. In a nutshell, state.checkpoints.dir contains the (small) metadata files for checkpoints, which typically contain pointers to the files that hold the actual state snapshot data. state.backend.fs.checkpointdir is the directory into which the actual state from the backends is written. Finally, state.backend.rocksdb.checkpointdir is a poorly named key for the directory of the RocksDB instance data and in fact has nothing to do with checkpoints.
Best, Stefan
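To make the distinction above concrete, here is a minimal flink-conf.yaml sketch that sets all four keys from the question. The key names come from this thread; every path is a made-up placeholder (an assumption, not a recommended layout), and the `rocksdb` value for state.backend assumes the RocksDB backend is available on the classpath.

```yaml
# Which state backend to use (assumed here: RocksDB).
state.backend: rocksdb

# Where the actual state snapshot data is written on checkpoint.
# Hypothetical path; should be a file system reachable from all nodes.
state.backend.fs.checkpointdir: hdfs:///flink/checkpoint-data

# Local working directory of the RocksDB instances.
# Despite the name, this is not a checkpoint location. Hypothetical path.
state.backend.rocksdb.checkpointdir: /tmp/flink/rocksdb

# Where the (small) checkpoint metadata files are written.
# Hypothetical path.
state.checkpoints.dir: hdfs:///flink/checkpoint-metadata
```

Read this way, two of the directories hold checkpoint output (data and metadata), while the RocksDB one only holds the live working data of each instance.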
I thought RocksDB is used as a storage backend. If that is the case, why are there two configuration parameters? In other words, what is the behavior if both state.backend.fs.checkpointdir and state.backend.rocksdb are set?
On Fri, Feb 3, 2017 at 1:47 AM, Stefan Richter <[hidden email]> wrote:
If you have configured RocksDB as the backend, Flink typically has multiple RocksDB instances per job: one for each parallel operator instance with keyed state. Those RocksDB instances live local to their corresponding operator instances. Parameter state.backend.rocksdb.
Thanks for the clarification! On Sat, Feb 4, 2017 at 3:34 AM, Stefan Richter <[hidden email]> wrote:
Hi guys,
This is a great clarification! A follow-up question from me: what is the difference between `state.checkpoints.dir` and the parameter you pass to the RocksDBStateBackend constructor in `public RocksDBStateBackend(URI checkpointDataUri) throws IOException`? The two are really confusing. I specified checkpointDataUri but got the error `CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured. You can configure configure one via key 'state.checkpoints.dir'.`
Thanks, Bowen
FYI, here is the thread that discussed the differences among state.backend.fs.checkpointdir, state.backend.rocksdb.checkpointdir, and state.checkpoints.dir: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Clarification-on-state-backend-parameters-td11419.html
On Wed, Jun 14, 2017 at 12:20 PM, bowen.li <[hidden email]> wrote: