In a checkpointed Flink job, will a graceful restart make it resume from the last known internal checkpoint? Or are all checkpoints discarded when the job is stopped?

If discarded, what will be the resume point?

Moiz
Hi Moiz,
Yes, the job will be restarted in case of failure using the last successful checkpoint. If you cancel the job, the checkpoints will be discarded. That's why Flink has savepoints [1]: they store checkpoints permanently (with additional meta-information). If there is no checkpoint or savepoint, the job starts with empty state in all operators, which also means that Kafka offsets are reset.

If you are interested in checkpointing internals, you can find more information here [2].

Regards,
Timo

[1] https://data-artisans.com/blog/turning-back-time-savepoints
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/stream_checkpointing.html
Additionally, externalized checkpoints [3] may be retained after cancelling a job. However, externalized checkpoints do not support rescaling (some documentation improvements on this part are already present in a PR [4]).

Nico

[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
[4] https://github.com/apache/flink/pull/4033

On Wednesday, 31 May 2017 16:17:36 CEST Timo Walther wrote:
> If you cancel the job, the checkpoints will be discarded.
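
For illustration, the rules from the two replies above can be written down as a small toy model in plain Java. This is only a sketch and not Flink code: the class name, the method names, and the `retainOnCancel` flag are hypothetical, and the assumption that only the latest checkpoint is retained on cancel is made here for simplicity. An empty result stands for "the job starts with empty state in all operators".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model of the resume rules discussed in this thread. This is NOT Flink code. */
final class CheckpointRetentionSketch {

    // Hypothetical flag: checkpoints are externalized and kept when the job is cancelled.
    private final boolean retainOnCancel;
    private final List<String> checkpoints = new ArrayList<>();
    private final List<String> savepoints = new ArrayList<>();

    CheckpointRetentionSketch(boolean retainOnCancel) {
        this.retainOnCancel = retainOnCancel;
    }

    /** Records a successfully completed periodic checkpoint. */
    void completeCheckpoint(String id) {
        checkpoints.add(id);
    }

    /** Records a user-triggered savepoint. */
    void triggerSavepoint(String id) {
        savepoints.add(id);
    }

    /** Failure: the job restarts from the last successful checkpoint, if one exists. */
    Optional<String> restoreAfterFailure() {
        return checkpoints.isEmpty()
                ? Optional.empty()
                : Optional.of(checkpoints.get(checkpoints.size() - 1));
    }

    /** Cancel: returns the snapshots a later resubmission could still start from. */
    List<String> cancel() {
        // Savepoints are stored permanently, so they survive a cancel.
        List<String> remaining = new ArrayList<>(savepoints);
        if (retainOnCancel && !checkpoints.isEmpty()) {
            // Assumption of this sketch: only the latest checkpoint is retained.
            remaining.add(checkpoints.get(checkpoints.size() - 1));
        }
        // Internal checkpoints are discarded when the job is cancelled.
        checkpoints.clear();
        return remaining;
    }

    public static void main(String[] args) {
        CheckpointRetentionSketch plain = new CheckpointRetentionSketch(false);
        plain.completeCheckpoint("chk-1");
        plain.completeCheckpoint("chk-2");
        System.out.println(plain.restoreAfterFailure()); // prints Optional[chk-2]
        System.out.println(plain.cancel());              // prints []

        CheckpointRetentionSketch retained = new CheckpointRetentionSketch(true);
        retained.completeCheckpoint("chk-7");
        retained.triggerSavepoint("sp-1");
        System.out.println(retained.cancel());           // prints [sp-1, chk-7]
    }
}
```

The model deliberately leaves out rescaling: as noted above, a retained externalized checkpoint and a savepoint are not interchangeable when the parallelism changes.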
Bump.. On Tue, May 30, 2017 at 10:17 PM, Moiz S Jinia <[hidden email]> wrote:
Hi Moiz,
Didn't Timo's answer cover your questions? See here in case you didn't receive it:
https://lists.apache.org/thread.html/a1a0d04e7707f4b0ac8b8b2f368110b898b2ba11463d32f9bba73968@%3Cuser.flink.apache.org%3E

Nico

On Thursday, 1 June 2017 20:30:59 CEST Moiz S Jinia wrote:
> Bump..
Thanks for that! Yes, I indeed did not receive those emails. And my question is answered.

Moiz

On Fri, Jun 2, 2017 at 12:46 PM, Nico Kruber <[hidden email]> wrote:
> Hi Moiz,