Incremental checkpointing documentation

Elias Levy
There doesn't appear to be much in the way of documentation for incremental checkpointing other than how to turn it on.  That leaves a lot of questions unanswered.

What is the interaction of incremental checkpointing and external checkpoints?

Any interaction with the state.checkpoints.num-retained config?

Does incremental checkpointing require any maintenance?

Any interaction with savepoints?

Does it perform better against certain "file systems"?  E.g. is S3 not recommended for it?  How about EFS?

Re: Incremental checkpointing documentation

Nico Kruber
Hi Elias,
let me answer the questions to the best of my knowledge, but in general I
think this is as expected.
(Let me give a link to the docs explaining the activation [1] for other
readers first.)

On Friday, 3 November 2017 01:11:52 CET Elias Levy wrote:
> What is the interaction of incremental checkpointing and external
> checkpoints?

Externalized checkpoints may be incremental [2]. (I'll fix the formatting error
that keeps the arguments from rendering as a list, which makes them less visible.)

> Any interaction with the state.checkpoints.num-retained config?

Yes, this remains the number of available checkpoints. There may, however, be
more folders containing RocksDB state that was originally put into checkpoint
X but is also still required in checkpoint X+10 or so. These files will be
cleaned up once they are not needed anymore.
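
To make the retention behaviour concrete, here is a toy model of the idea. It
is only a sketch under stated assumptions, not Flink's actual code: the class
ToyCheckpointStore, its methods, and the .sst file names are all hypothetical.
It assumes each checkpoint simply lists the files it references, and that a
file is deleted once no retained checkpoint references it any more.

```python
from collections import OrderedDict


class ToyCheckpointStore:
    """Toy model of retained incremental checkpoints that share files.

    All names are hypothetical; this is not Flink's implementation.
    """

    def __init__(self, num_retained):
        self.num_retained = num_retained
        self.checkpoints = OrderedDict()  # checkpoint id -> set of referenced files
        self.ref_counts = {}              # file name -> number of retained checkpoints using it
        self.deleted = []                 # files removed from storage, in deletion order

    def add_checkpoint(self, checkpoint_id, files):
        files = set(files)
        self.checkpoints[checkpoint_id] = files
        for name in files:
            self.ref_counts[name] = self.ref_counts.get(name, 0) + 1
        # Drop the oldest checkpoints beyond the retention limit.
        while len(self.checkpoints) > self.num_retained:
            _, old_files = self.checkpoints.popitem(last=False)
            for name in sorted(old_files):
                self.ref_counts[name] -= 1
                # Delete a file only when no retained checkpoint needs it.
                if self.ref_counts[name] == 0:
                    del self.ref_counts[name]
                    self.deleted.append(name)

    def files_on_storage(self):
        return sorted(self.ref_counts)


store = ToyCheckpointStore(num_retained=1)
store.add_checkpoint(1, ["a.sst", "b.sst"])
store.add_checkpoint(2, ["a.sst", "c.sst"])  # a.sst is reused from checkpoint 1
print(store.files_on_storage())  # a.sst survives although checkpoint 1 is gone
print(store.deleted)             # only b.sst has been cleaned up so far
```

In this model only one checkpoint is retained, yet a file first written for
checkpoint 1 stays on storage until a later checkpoint stops referencing it.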

> Does incremental checkpointing require any maintenance?

No, state is cleaned up once it is not used/referenced anymore.

> Any interaction with savepoints?

No, a savepoint uses Flink's own data format and is not incremental [3].

> Does it perform better against certain "file systems"?  E.g. is S3 not
> recommended for it?  How about EFS?

I can't think of a reason this should be any different to non-incremental
checkpoints. Maybe Stefan (cc'd) has some more info on this.

For more details on the whole topic, I can recommend Stefan's talk at the last
Flink Forward [4] though.


Nico


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/large_state_tuning.html#tuning-rocksdb
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/savepoints.html
[4] https://www.youtube.com/watch?v=dWQ24wERItM&index=36&list=PLDX4T_cnKjD0JeULl1X6iTn7VIkDeYX_X
