Hi,
My job on Flink 1.10 uses RocksDB with incremental checkpointing enabled. The checkpoints are retained on cancellation. How do I resume from the retained checkpoint after cancellation (e.g., when upgrading the job binary)? Docs say to use the checkpoint or savepoint metadata file, but AFAICT there's no metadata file in HDFS in the various directories under "$checkpointsDir/snapshots/$jobID".
Thanks,
Jeff Martin
|
Hi Jeff,
You can restore from a retained checkpoint with a command such as `bin/flink run -s :checkpointMetaDataPath [:runArgs]` [1]. You should find the metadata file in the `chk-xxx` directory [2].
On Tue, Sep 15, 2020 at 1:30 PM, Jeffrey Martin <[hidden email]> wrote:
|
Thanks for the quick reply, Congxian. The non-empty chk-N directories I looked at contained only files whose names are UUIDs. Nothing named _metadata (unless HDFS hides files that start with an underscore?). Just to be clear though -- I should expect a metadata file when using incremental checkpoints?
On Mon, Sep 14, 2020 at 10:46 PM Congxian Qiu <[hidden email]> wrote:
|
Hi Jeff,
Sorry for the late reply. You can only restore from a checkpoint whose chk-xxx directory contains a _metadata file. If there is no _metadata file in the chk-xxx directory, that checkpoint is not complete and you can't restore from it.
Best,
Congxian
On Tue, Sep 15, 2020 at 2:18 PM, Jeffrey Martin <[hidden email]> wrote:
|
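The advice in this thread (restore only from a chk-N directory that contains _metadata, then pass that path to `bin/flink run -s`) could be scripted roughly as follows. This is only a sketch: the helper name `latest_complete_checkpoint` and the variables `CHECKPOINT_ROOT` and `JOB_JAR` are made up for illustration, and it assumes the listing comes from `hdfs dfs -ls -R` with the path as the last whitespace-separated field (so paths containing spaces are not handled).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a recursive directory listing on stdin and print
# the _metadata path of the highest-numbered chk-N directory that has one.
# Prints nothing if no chk-N directory contains _metadata.
latest_complete_checkpoint() {
  awk '{ print $NF }' \
    | { grep -E '/chk-[0-9]+/_metadata$' || true; } \
    | awk -F'/' '{ n = $(NF-1); sub(/^chk-/, "", n); print n "\t" $0 }' \
    | sort -n -k1,1 \
    | tail -n 1 \
    | cut -f2-
}

# Possible usage (CHECKPOINT_ROOT and JOB_JAR are placeholders, not real paths):
#   meta="$(hdfs dfs -ls -R "$CHECKPOINT_ROOT" | latest_complete_checkpoint)"
#   if [ -n "$meta" ]; then
#     bin/flink run -s "$meta" "$JOB_JAR"
#   else
#     echo "no completed checkpoint found under $CHECKPOINT_ROOT" >&2
#   fi
```

Sorting on the numeric checkpoint ID rather than on the raw path keeps chk-10 ahead of chk-9, which a plain lexical sort would get wrong.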