Flink with Ceph as the persistent storage

Jayant Ameta
Hi,
The Flink documentation suggests that Ceph can be used as persistent storage for state: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/checkpointing.html

Considering that Ceph is a transactional database, wouldn't it have an adverse effect on Flink's performance?


Re: Flink with Ceph as the persistent storage

Gyula Fóra
Hi,

To my understanding, Ceph (as in http://ceph.com/ceph-storage/) is a block-based object storage system rather than a database. You can mount it on your servers, and it will behave like a local file system for the most part, but it will be shared across the cluster.

In our experience, the performance might not be as good as with HDFS.

Gyula



Re: Flink with Ceph as the persistent storage

Jayant Ameta
If the checkpointing to Ceph happens asynchronously, does it still have any impact on the stream processing?

Jayant Ameta


Re: Flink with Ceph as the persistent storage

Gyula Fóra
It would be the same as with any other form of asynchronous checkpointing: processing is not blocked directly, but the network traffic generated by the checkpoint might indirectly affect it to some extent :)
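
To make the point about asynchronous checkpointing concrete, here is a toy sketch in plain Java. It is not Flink's implementation and uses no Flink or Ceph APIs; the class `AsyncCheckpointSketch`, its methods, and the simulated write delay are all hypothetical names chosen for illustration. It only models the idea that the state is copied on the processing thread while the slow write to shared storage runs on a separate thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model only: none of these names are Flink or Ceph APIs.
class AsyncCheckpointSketch {
    // Stand-in for an operator's keyed state.
    private final Map<String, Long> counts = new HashMap<>();
    // Background thread that performs the slow write to shared storage.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Called on the processing thread for every record.
    void process(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    // Synchronous part: copy the state on the calling thread.
    // Asynchronous part: the (simulated) write runs on the writer thread.
    CompletableFuture<Map<String, Long>> checkpoint(long simulatedWriteMillis) {
        Map<String, Long> snapshot = new HashMap<>(counts);
        return CompletableFuture.supplyAsync(() -> {
            try {
                // Stands in for the network transfer to the storage system.
                Thread.sleep(simulatedWriteMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return snapshot;
        }, writer);
    }

    long count(String key) {
        return counts.getOrDefault(key, 0L);
    }

    void shutdown() {
        writer.shutdown();
    }

    public static void main(String[] args) {
        AsyncCheckpointSketch op = new AsyncCheckpointSketch();
        op.process("a");
        op.process("a");
        CompletableFuture<Map<String, Long>> pending = op.checkpoint(200);
        op.process("a"); // processing continues while the write is in flight
        Map<String, Long> persisted = pending.join();
        System.out.println("persisted=" + persisted.get("a") + " live=" + op.count("a"));
        op.shutdown();
    }
}
```

In this sketch the third record is processed before the write finishes, so the persisted snapshot holds a count of 2 while the live state holds 3. What the toy model does not capture is the indirect cost mentioned above: in a real cluster the background write shares network bandwidth with the data stream.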
