savepoint - checkpoint - directory

Fanbin Bu
Hi,

For a savepoint, the directory looks like
s3://bucket/savepoint-jobid/*

To resume, I run:
flink run -s s3://bucket/savepoint-jobid/
perfect!


For a checkpoint, the directory looks like
s3://bucket/jobid/chk-100
s3://bucket/jobid/shared   <-- what is this for?

To resume, which one should I run:
flink run -s s3://bucket/jobid
or 
flink run -s s3://bucket/jobid/chk-100


Another question: I saw that `flink cancel` is deprecated and that `flink stop` is recommended instead. But doesn't this cause production downtime? To avoid downtime, is it recommended to just run `flink savepoint`?

Thanks,
Fanbin
Re: savepoint - checkpoint - directory

Yun Tang
Hi Fanbin

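To resume from a checkpoint, you need to provide the specific checkpoint directory, /path/chk-x, or its metadata file, /path/chk-x/_metadata; the parent job directory alone is not enough. The sub-directory named “shared” stores incremental checkpoint content, that is, state files that can be part of more than one checkpoint. You could refer to [1] for more information.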
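
As a rough illustration of picking the path to pass to `-s`, here is a minimal Python sketch. It assumes you already have a listing of the object keys under the job's checkpoint directory; the function name `latest_restorable_checkpoint`, the sample keys, and the rule "highest chk-N that has a `_metadata` file" are my own assumptions for the example, not something Flink provides.

```python
import re
from typing import Iterable, Optional

# Matches keys such as "chk-100/_metadata" and captures the checkpoint number.
_CHK_METADATA = re.compile(r"(?:^|/)chk-(\d+)/_metadata$")


def latest_restorable_checkpoint(base: str, keys: Iterable[str]) -> Optional[str]:
    """Return base + '/chk-N' for the highest N whose _metadata file is listed.

    `base` is the job's checkpoint directory and `keys` are the object keys
    listed below it (both hypothetical inputs for this sketch).
    """
    best = None
    for key in keys:
        match = _CHK_METADATA.search(key)
        if match:
            number = int(match.group(1))
            if best is None or number > best:
                best = number
    if best is None:
        return None
    return f"{base.rstrip('/')}/chk-{best}"


# Hypothetical listing of s3://bucket/jobid/
keys = [
    "chk-99/_metadata",
    "chk-100/_metadata",
    "chk-101/some-state-file",  # no _metadata listed, so it is skipped
    "shared/some-shared-file",  # shared incremental state, never a restore target
]
path = latest_restorable_checkpoint("s3://bucket/jobid/", keys)
print(path)  # value to pass to `flink run -s`
```

With the sample listing, the sketch selects the chk-100 directory and ignores both the "shared" directory and the checkpoint without a `_metadata` file.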

BTW, stopping with a savepoint can help reduce the source rewind time.
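
To make the difference between the two commands concrete, here is a small Python sketch that only builds the argument lists; it does not call Flink. The job ID and target directory are placeholders, and the `-p` (savepoint path) option of `flink stop` is written as I understand the CLI, so please check `flink stop --help` for your version.

```python
from typing import List


def savepoint_command(job_id: str, target_dir: str) -> List[str]:
    # `flink savepoint` triggers a savepoint while the job keeps running.
    return ["flink", "savepoint", job_id, target_dir]


def stop_with_savepoint_command(job_id: str, target_dir: str) -> List[str]:
    # `flink stop -p` takes a savepoint and then stops the job, so a later
    # restore starts exactly where the job stopped.
    return ["flink", "stop", "-p", target_dir, job_id]


# Placeholder values, not taken from a real cluster.
job_id = "jobid"
target_dir = "s3://bucket/savepoints"

print(" ".join(savepoint_command(job_id, target_dir)))
print(" ".join(stop_with_savepoint_command(job_id, target_dir)))
```

The first command leaves the job running; the second one ends it after the savepoint completes, so the job is down until it is resubmitted with `flink run -s`.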


From: Fanbin Bu <[hidden email]>
Sent: Thursday, March 26, 2020 2:53:29 AM
To: user <[hidden email]>
Subject: savepoint - checkpoint - directory
 