Hi,
It seems that the way checkpoint metadata files are stored for externalized checkpoints changed from 1.4 to 1.5, and the docs seem to describe this incorrectly. The external metadata files now simply end up in the checkpoint dir configured in the state backend, so basically the "state.checkpoints.dir" param is overwritten by the job.
This breaks a bunch of tooling around checkpoints that relies on a common place for metadata files instead of having them scattered around in subdirectories. Is there any way to get the previous behaviour back?
Thanks,
Gyula
|
Hi,
Does that mean https://issues.apache.org/jira/browse/FLINK-5627 is no longer relevant for you, since it seems to request the behaviour that we have now?
But yes, I think it's currently not possible (with out-of-the-box functionality) to write externalized-checkpoint metadata to a central location. There is a Jira issue that aims at implementing a solution: https://issues.apache.org/jira/browse/FLINK-9114.
I quickly talked to Stephan, and it seems that the meta info about externalized checkpoints is also written to the HA storage directory. Maybe that's helpful for you.
Best,
Aljoscha
|
Hi!
Well, it depends on how we look at it: FLINK-5627 does not necessarily describe the current behaviour. You still can't really specify the exact location from within the job, as the metadata now goes to a checkpoint-specific place determined by the checkpoint dir.
I will close the Jira issue, as FLINK-9114 covers it in a more generic way. I can probably work around the current behaviour in the meantime :)
Cheers,
Gyula
On Thu, Jul 12, 2018 at 17:35, Aljoscha Krettek <[hidden email]> wrote:
|
Out of curiosity, how will you work around it? And how is it easier for your tooling if checkpoints are in a central location?
Best, Aljoscha
|
Hi,
To be fair, my "workaround" is to restore the previous behaviour in our Flink build, as it only takes a few lines of code :)
To explain why this is convenient, I can describe how we use the checkpoints/savepoints. For every Streaming Application (not Flink job) we have a unique name from which the savepoint directory is derived. This means that all savepoints + external metadata pointers should end up in this directory, and we can easily pick out the latest for the current application when we want to restore.
This is extremely convenient compared to crawling through a bunch of nested directories to find the latest, and it also makes migrating checkpoint directories and similar tasks more straightforward.
Cheers,
Gyula
On Fri, Jul 13, 2018 at 8:29, Aljoscha Krettek <[hidden email]> wrote:
|
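For readers who cannot patch their Flink build, the "crawling through nested directories" that Gyula mentions could look roughly like the sketch below. It is not from the thread and it rests on several assumptions: that the checkpoint directory is reachable through the local file system (HDFS or S3 would need a different file system API), that metadata files are named `_metadata` and sit somewhere below the root (for example under `<checkpoint-dir>/<job-id>/chk-<n>/`), and that "latest" can be taken to mean "most recently modified". The class name `LatestCheckpointFinder` is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink.
final class LatestCheckpointFinder {

    // Assumption: externalized checkpoint metadata files carry this name.
    private static final String METADATA_FILE_NAME = "_metadata";

    private LatestCheckpointFinder() {}

    /**
     * Walks every directory below root and returns the metadata file with the
     * newest modification time, or an empty Optional if none is found.
     */
    static Optional<Path> findLatestMetadata(Path root) throws IOException {
        // The stream returned by Files.walk holds open directory handles,
        // so it is closed via try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(METADATA_FILE_NAME))
                .max(Comparator.comparing(LatestCheckpointFinder::lastModified));
        }
    }

    // Wraps the checked exception so the method can be used in a Comparator.
    private static FileTime lastModified(Path p) {
        try {
            return Files.getLastModifiedTime(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `LatestCheckpointFinder.findLatestMetadata(Paths.get("/path/to/checkpoints"))` would return the path to pass to a restore, if one exists. Modification time is used here because checkpoint numbers restart for every job ID, so comparing `chk-<n>` across jobs would not be meaningful. With a single central metadata directory, as in 1.4, the same lookup needs only a flat directory listing.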