Stateful function and large state applications

Stateful function and large state applications

Lian Jiang
Hi,

I am learning about Stateful Functions and came across the following:

"In addition to the Apache Flink processes, a full deployment requires ZooKeeper (for master failover) and bulk storage (S3, HDFS, NAS, GCS, Azure Blob Store, etc.) to store Flink’s checkpoints. In turn, the deployment requires no database, and Flink processes do not require persistent volumes."

Does this mean Stateful Functions does not support RocksDB (along with incremental checkpoints and local task recovery)? Would that be an issue for applications with large state (e.g. 200 GB)? Thanks for clarifying.


Thanks
Lian
Re: Stateful function and large state applications

Tzu-Li (Gordon) Tai
Hi,

The StateFun runtime is built directly on top of Apache Flink, so RocksDB is supported as the state backend, as are all of Flink's features for large state, such as checkpointing and local task recovery.

Cheers,
Gordon
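
For readers who want to try this, here is a minimal flink-conf.yaml sketch of the settings discussed above. It is only an illustration, not part of the reply: the option names are the ones used by Flink releases from around the time of this thread (newer releases may rename some of them), and the bucket path is a hypothetical placeholder. With this setup RocksDB keeps its working state on each TaskManager's local disk, while checkpoints are written to the bulk storage mentioned in the quoted documentation.

```yaml
# Assumed Flink configuration for a StateFun deployment with large state.
# Use RocksDB as the state backend (working state lives on local disk).
state.backend: rocksdb
# Upload only the changes since the previous checkpoint.
state.backend.incremental: true
# Keep a local copy of state so tasks can recover without a full download.
state.backend.local-recovery: true
# Hypothetical checkpoint location; replace with your own bulk storage path.
state.checkpoints.dir: s3://my-bucket/statefun/checkpoints
```

Whether these settings are enough for a state size such as 200 GB depends on the cluster's disk and network capacity, so treat the values as a starting point.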


On Wed, Oct 14, 2020 at 11:49 AM Lian Jiang wrote:
Hi,

I am learning about Stateful Functions and came across the following:

"In addition to the Apache Flink processes, a full deployment requires ZooKeeper (for master failover) and bulk storage (S3, HDFS, NAS, GCS, Azure Blob Store, etc.) to store Flink’s checkpoints. In turn, the deployment requires no database, and Flink processes do not require persistent volumes."

Does this mean Stateful Functions does not support RocksDB (along with incremental checkpoints and local task recovery)? Would that be an issue for applications with large state (e.g. 200 GB)? Thanks for clarifying.


Thanks
Lian