(DEPRECATED) Apache Flink User Mailing List archive.

Does job restart resume from last known internal checkpoint?


Moiz Jinia

Does job restart resume from last known internal checkpoint?

In a checkpointed Flink job, will doing a graceful restart make it resume from the last known internal checkpoint? Or are all checkpoints discarded when the job is stopped?

If discarded, what will be the resume point?

Moiz

Timo Walther

Re: Does job restart resume from last known internal checkpoint?

Hi Moiz,

Yes, the job will be restarted in case of failure using the last
successful checkpoint. If you cancel the job, the checkpoints will be
discarded. That's why Flink has savepoints [1] in order to store
checkpoints permanently (with additional meta-information). If there is
no checkpoint/savepoint, the job would start with empty state in all
operators, which also means that Kafka offsets are reset.

If you are interested in checkpointing internals, you can find more
information here [2].

Regards,
Timo

[1] https://data-artisans.com/blog/turning-back-time-savepoints
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/stream_checkpointing.html
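
The cases described in this message can be summarised in a small toy model. This is a sketch in plain Java, not Flink code: the class `ResumePointSketch`, the enum `StopCause`, and the snapshot names such as "chk-42" are made up for illustration, and the model deliberately ignores details such as restart strategies.

```java
import java.util.Optional;

class ResumePointSketch {

    /** Why the job stopped (toy type, not part of Flink). */
    enum StopCause { FAILURE, CANCELLED }

    /**
     * Returns the snapshot a restart would resume from, or empty if the job
     * would start with empty operator state (and reset Kafka offsets).
     */
    static Optional<String> resumePoint(StopCause cause,
                                        Optional<String> lastCheckpoint,
                                        Optional<String> savepoint) {
        if (cause == StopCause.FAILURE) {
            // Automatic recovery uses the last successful checkpoint.
            return lastCheckpoint;
        }
        // Cancelling discards the internal checkpoints, so only a savepoint
        // taken by the user is left to resume from.
        return savepoint;
    }

    public static void main(String[] args) {
        Optional<String> chk = Optional.of("chk-42");
        Optional<String> none = Optional.empty();

        // Failure: resume from the last checkpoint.
        System.out.println(resumePoint(StopCause.FAILURE, chk, none));   // Optional[chk-42]
        // Cancel without a savepoint: nothing to resume from.
        System.out.println(resumePoint(StopCause.CANCELLED, chk, none)); // Optional.empty
        // Cancel with a savepoint: resume from the savepoint.
        System.out.println(resumePoint(StopCause.CANCELLED, chk,
                Optional.of("savepoint-1")));                            // Optional[savepoint-1]
    }
}
```

The point of the sketch is only that a checkpoint is a recovery mechanism owned by Flink, while a savepoint is something the user triggers and keeps.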

On 30.05.17 at 18:47, Moiz S Jinia wrote:
> In a checkpointed Flink job will doing a graceful restart make it
> resume from last known internal checkpoint? Or are all checkpoints
> discarded when the job is stopped?
>
> If discarded, what will be the resume point?
>
> Moiz

Nico Kruber

Re: Does job restart resume from last known internal checkpoint?

Additionally, externalized checkpoints [3] may be retained after cancelling a
job. However, externalized checkpoints do not support rescaling (some
documentation improvements on this part are already present in a PR[4]).

Nico

[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
[4] https://github.com/apache/flink/pull/4033
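
As a rough illustration of the two statements in this message, here is a toy model in plain Java. It is not Flink code: the class and method names are hypothetical, and only the two enum constants mirror the names of Flink's `ExternalizedCheckpointCleanup` options. The parallelism check encodes the rescaling limitation as stated above for the 1.3 release and is an assumption beyond that.

```java
class ExternalizedCheckpointSketch {

    /** Toy enum; the constant names mirror Flink's ExternalizedCheckpointCleanup. */
    enum Cleanup { DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION }

    /** True if the externalized checkpoint is still available after a cancel. */
    static boolean survivesCancel(Cleanup cleanup) {
        return cleanup == Cleanup.RETAIN_ON_CANCELLATION;
    }

    /**
     * True if a cancelled job could be resumed from its externalized
     * checkpoint: the checkpoint must have been retained, and the new
     * parallelism must equal the old one (no rescaling).
     */
    static boolean canResume(Cleanup cleanup, int checkpointParallelism, int newParallelism) {
        return survivesCancel(cleanup) && checkpointParallelism == newParallelism;
    }

    public static void main(String[] args) {
        System.out.println(canResume(Cleanup.RETAIN_ON_CANCELLATION, 4, 4)); // true
        System.out.println(canResume(Cleanup.RETAIN_ON_CANCELLATION, 4, 8)); // false
        System.out.println(canResume(Cleanup.DELETE_ON_CANCELLATION, 4, 4)); // false
    }
}
```

If the parallelism has to change between runs, a savepoint is the mechanism to reach for instead.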

On Wednesday, 31 May 2017 16:17:36 CEST Timo Walther wrote:

> Hi Moiz,
>
> Yes, the job will be restarted in case of failure using the last
> successful checkpoint. If you cancel the job, the checkpoints will be
> discarded. That's why Flink has savepoints [1] in order to store
> checkpoints permanently (with additional meta-information). If there is
> no checkpoint/savepoint, the job would start with empty state in all
> operators, which also means that Kafka offsets are reset.
>
> If you are interested in checkpointing internals, you can find more
> information here [2].
>
> Regards,
> Timo
>
> [1] https://data-artisans.com/blog/turning-back-time-savepoints
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/stream_checkpointing.html
>
> On 30.05.17 at 18:47, Moiz S Jinia wrote:
> > In a checkpointed Flink job will doing a graceful restart make it
> > resume from last known internal checkpoint? Or are all checkpoints
> > discarded when the job is stopped?
> >
> > If discarded, what will be the resume point?
> >
> > Moiz


Moiz Jinia

Re: Does job restart resume from last known internal checkpoint?


Bump..

On Tue, May 30, 2017 at 10:17 PM, Moiz S Jinia <[hidden email]> wrote:

> In a checkpointed Flink job will doing a graceful restart make it resume
> from last known internal checkpoint? Or are all checkpoints discarded when
> the job is stopped?
>
> If discarded, what will be the resume point?
>
> Moiz

Nico Kruber

Re: Does job restart resume from last known internal checkpoint?

Hi Moiz,
didn't Timo's answer cover your questions?

see here in case you didn't receive it:
https://lists.apache.org/thread.html/a1a0d04e7707f4b0ac8b8b2f368110b898b2ba11463d32f9bba73968@%3Cuser.flink.apache.org%3E

Nico

On Thursday, 1 June 2017 20:30:59 CEST Moiz S Jinia wrote:

> Bump..
>
> On Tue, May 30, 2017 at 10:17 PM, Moiz S Jinia <[hidden email]> wrote:
> > In a checkpointed Flink job will doing a graceful restart make it resume
> > from last known internal checkpoint? Or are all checkpoints discarded when
> > the job is stopped?
> >
> > If discarded, what will be the resume point?
> >
> > Moiz


Moiz Jinia

Re: Does job restart resume from last known internal checkpoint?

Thanks for that! Yes, I indeed did not receive those emails. And my question is answered.

Moiz

On Fri, Jun 2, 2017 at 12:46 PM, Nico Kruber <[hidden email]> wrote:

> Hi Moiz,
> didn't Timo's answer cover your questions?
>
> see here in case you didn't receive it:
> https://lists.apache.org/thread.html/a1a0d04e7707f4b0ac8b8b2f368110b898b2ba11463d32f9bba73968@%3Cuser.flink.apache.org%3E
>
> Nico
>
> On Thursday, 1 June 2017 20:30:59 CEST Moiz S Jinia wrote:
> > Bump..
> >
> > On Tue, May 30, 2017 at 10:17 PM, Moiz S Jinia <[hidden email]> wrote:
> > > In a checkpointed Flink job will doing a graceful restart make it resume
> > > from last known internal checkpoint? Or are all checkpoints discarded when
> > > the job is stopped?
> > >
> > > If discarded, what will be the resume point?
> > >
> > > Moiz