(DEPRECATED) Apache Flink User Mailing List archive.

Using RocksDB as State Backend over a Distributed File System

Classic

List

Threaded

7 messages Options

chiggi_dev

Using RocksDB as State Backend over a Distributed File System

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

Stefan Richter

Re: Using RocksDB as State Backend over a Distributed File System

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,

Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

Marvin777

Re: Using RocksDB as State Backend over a Distributed File System

Hi,

I'm agree with Stefan. I think you can look at this document, given here: Apache Flink 1.4 Documentation：Checkpointing

Best,

Qingxiang Ma.

2018-04-26 20:00 GMT+08:00 Stefan Richter <[hidden email]>:

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,
Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

chiggi_dev

Re: Using RocksDB as State Backend over a Distributed File System

In reply to this post by Stefan Richter

Wow never considered it that way.

Thanks a lot for clarifying Stefan.

This gives rise to another question. Whats the format of this data? Is it the same format which is used to store checkpoints when FS state backend is used?

Regards,

Chirag

Sent from Yahoo Mail on Android

On Thu, 26 Apr 2018 at 5:30 PM, Stefan Richter
<[hidden email]> wrote:

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,
Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

Stefan Richter

Re: Using RocksDB as State Backend over a Distributed File System

On the local disk you have the normal RocksDB working directory consisting mainly of the SSTable files. In the checkpoint directory on distributed storage it depends on whether or not you are using incremental checkpoints. For incremental checkpoints, the files are essentially the SSTables uploaded to distributed storage (their names are changed to avoid collisions) and some files are logically shared across multiple checkpoints (not physically, i.e. you cannot simply see this sharing from the files or directories and this is done to avoid redundant duplication because SSTable files are immutable). For non-incremental checkpoints you will find files in Flink’s own checkpoint format, where each file represent a dumps of all the key/value pairs from one RocksDB instance plus some meta data.

Am 26.04.2018 um 15:22 schrieb Chirag Dewan <[hidden email]>:

Wow never considered it that way.

Thanks a lot for clarifying Stefan.

This gives rise to another question. Whats the format of this data? Is it the same format which is used to store checkpoints when FS state backend is used?

Regards,

Chirag

Sent from Yahoo Mail on Android

On Thu, 26 Apr 2018 at 5:30 PM, Stefan Richter
<[hidden email]> wrote:

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,
Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

Stefan Richter

Re: Using RocksDB as State Backend over a Distributed File System

In reply to this post by chiggi_dev

Adding one thing, the format of the non-incremental is similar but unfortunately not (yet) identical with the FS backend. This is because of some internal implementation details that allow the FS checkpoints to be slightly more consise in the file format but we might „de-optimize“ this minor difference for the sake of compatibility in the near future.

Am 26.04.2018 um 15:22 schrieb Chirag Dewan <[hidden email]>:

Wow never considered it that way.

Thanks a lot for clarifying Stefan.

This gives rise to another question. Whats the format of this data? Is it the same format which is used to store checkpoints when FS state backend is used?

Regards,

Chirag

Sent from Yahoo Mail on Android

On Thu, 26 Apr 2018 at 5:30 PM, Stefan Richter
<[hidden email]> wrote:

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,
Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag

chiggi_dev

Re: Using RocksDB as State Backend over a Distributed File System

Thanks a lot Stefan. This clarifies everything.

Regards,

Chirag

On Thursday, 26 April, 2018, 7:16:52 PM IST, Stefan Richter <[hidden email]> wrote:

Am 26.04.2018 um 15:22 schrieb Chirag Dewan <[hidden email]>:

Wow never considered it that way.

Thanks a lot for clarifying Stefan.

This gives rise to another question. Whats the format of this data? Is it the same format which is used to store checkpoints when FS state backend is used?

Regards,

Chirag

Sent from Yahoo Mail on Android

On Thu, 26 Apr 2018 at 5:30 PM, Stefan Richter
<[hidden email]> wrote:

Hi,

I think there is a misunderstanding. RocksDB state backend always operates on local disk of the node that runs your task to give you optimal performance. You can think of this as a transient working area that does not require any durability. Durability always happens through checkpoints (or savepoints) which, in turn, go to a distributed storage. Checkpoints and checkpoints are like a consistent moment-in-time image of the backends content and can be used to recover under failure (checkpoints) or manually resume your job (savepoints).

Best,
Stefan

Am 26.04.2018 um 13:16 schrieb Chirag Dewan <[hidden email]>:

Hi,

I am working on a use case where I need to store a large amount of data in state. I am using RocksDB as my state backend. Now to ensure data replication, I want to store the RocksDB files in some distributed file system.

From the documentation I can see that Flink recommends a list of FileSystem to be used for state backend. Given here :

Apache Flink 1.4 Documentation: File Systems

Apache Flink 1.4 Documentation: File Systems

But I cannot figure out the file system for RocksDB. What are the recommendations for File Systems to be used with RocksDB?

Thanks in advance.

Regards,

Chirag