Hi Sameer,
In case of a failure, the job restarts the operators and resets them to the latest successful checkpoint. So if you turn off checkpoints, there is nothing to restore from and all state will be lost.
Generally speaking, snapshots are very lightweight and can be taken frequently without much impact on performance. If checkpointing does affect the performance of your job and you don't want to lose all of your state, you can try increasing the checkpoint interval.
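To make the recovery behaviour concrete, here is a toy model in plain Java. It is not Flink code: the class and method names are made up for illustration, and it ignores that a real job also rewinds its sources and replays the records received after the checkpoint. It only shows that state comes back from the last snapshot, or empty if no snapshot was ever taken.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical names, not the Flink API).
public class CheckpointRestoreSketch {
    // Live per-key state, standing in for a ValueState holding a running sum.
    private Map<String, Long> state = new HashMap<>();
    // Last completed snapshot; stays null while checkpointing is turned off.
    private Map<String, Long> lastCheckpoint = null;

    public void process(String key, long value) {
        state.merge(key, value, Long::sum);
    }

    public void checkpoint() {
        lastCheckpoint = new HashMap<>(state);
    }

    // On failure the state is reset to the last snapshot, or to empty if there is none.
    public void failAndRestore() {
        state = (lastCheckpoint == null) ? new HashMap<>() : new HashMap<>(lastCheckpoint);
    }

    public long get(String key) {
        return state.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        CheckpointRestoreSketch withCheckpoint = new CheckpointRestoreSketch();
        withCheckpoint.process("k", 5);
        withCheckpoint.checkpoint();
        withCheckpoint.process("k", 2); // arrives after the snapshot
        withCheckpoint.failAndRestore();
        System.out.println(withCheckpoint.get("k")); // prints 5

        CheckpointRestoreSketch noCheckpoint = new CheckpointRestoreSketch();
        noCheckpoint.process("k", 5);
        noCheckpoint.failAndRestore();
        System.out.println(noCheckpoint.get("k")); // prints 0
    }
}
```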
// start a checkpoint every 600000 ms (10 min)
env.enableCheckpointing(600000);
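A slightly fuller sketch of how this could look in the job setup, in case it helps. It assumes the older RocksDBStateBackend class from the flink-statebackend-rocksdb dependency (newer Flink releases deprecate it in favour of other classes). The bucket path, the class name, and the pause and timeout values are placeholders, not recommendations. Passing true as the second constructor argument turns on incremental checkpoints, which may shorten each checkpoint when the state is large, since only changes since the previous checkpoint are uploaded.

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetupSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // start a checkpoint every 600000 ms (10 min)
        env.enableCheckpointing(600_000L);

        // leave at least 5 min between the end of one checkpoint and the start of the next
        // (illustrative value)
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(300_000L);

        // abort a checkpoint that takes longer than 30 min (illustrative value)
        env.getCheckpointConfig().setCheckpointTimeout(1_800_000L);

        // RocksDB backend with incremental checkpoints enabled (second argument);
        // the S3 path is a placeholder
        env.setStateBackend(new RocksDBStateBackend("s3://my-bucket/flink/checkpoints", true));

        // ... define sources, the keyed ValueState logic, and sinks here, then call
        // env.execute("my-job");
    }
}
```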
Best,
Hequn

Hi,
We have a job which is using ValueState. We have turned off checkpoints. The state is kept in RocksDB, which is backed by S3.
If the job fails with an exception (e.g. partitions not available or an occasional S3 404 error) and auto-recovers, is the entire state lost, or does it continue from the last saved state? We see that the job has the same identifier. We don’t mind losing data during the small interval when the job is recovering. But because we are using ValueState as a custom global window to accumulate state for a key over a 3-hour window, we don’t want to lose all of it.
Checkpointing is not an option because it takes longer per checkpoint and the state is huge.
Thanks,
Sameer