Re: Externalized checkpoints and metadata

Posted by Hao Gao
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Externalized-checkpoints-and-metadata-tp19751p19764.html

Hi Juan,

We modified the Flink code a little bit to change the checkpoint directory structure, so we can easily identify which checkpoint belongs to which job.
You can read my note or the PR:
https://medium.com/hadoop-noob/flink-externalized-checkpoint-eb86e693cfed
https://github.com/BranchMetrics/flink/pull/6/files
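If you'd rather not patch Flink, a rough sketch of the same idea (not what our PR does, just a workaround with placeholder names and paths) is to give every job its own checkpoint directory through the RocksDB state backend, so the metadata files are separated by path instead of by file name:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PerJobCheckpoints {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder job name; the point is that each job writes its
            // externalized checkpoints under its own HDFS directory.
            String jobName = "my-job-a";
            env.setStateBackend(new RocksDBStateBackend("hdfs:///checkpoints/" + jobName, true));

            // Retain the checkpoint on cancellation so it can be resumed later.
            env.enableCheckpointing(60_000L);
            env.getCheckpointConfig().enableExternalizedCheckpoints(
                    ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

            // ... build the pipeline, then:
            // env.execute(jobName);
        }
    }

Then resuming is just a matter of pointing -s at the right job's directory.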
Hope it helps

Thanks
Hao

2018-04-25 6:07 GMT-07:00 Juan Gentile <[hidden email]>:

Hello,

 

We are trying to use externalized checkpoints with RocksDB on Hadoop HDFS.

We would like to know the proper way to resume from a saved checkpoint, as we are currently running many jobs in the same Flink cluster.

The problem is that when we want to restart the jobs and pass the metadata file (or directory), there is one metadata file per job, but the files are not easily identifiable by name. For example:

/checkpoints/checkpoint_metadata-69053704a5ca

/checkpoints/checkpoint_metadata-c7c016909607

 

We are not using savepoints. Reading the documentation, I see there are two ways to resume: passing the metadata file (not possible for us, as we have many jobs and cannot tell the files apart) or passing the directory, but by default it looks for a _metadata file, which doesn't exist.
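For reference, resuming a single job from an externalized checkpoint uses the same -s flag as a savepoint, e.g. (the jar name here is just a placeholder):

    bin/flink run -s hdfs:///checkpoints/checkpoint_metadata-69053704a5ca my-job.jar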

 

Thank you,

Juan G.



