Hello I was reading this: https://stackoverflow.com/questions/61010970/flink-resume-from-externalised-checkpoint-question I am trying to run a standalone job on my local with a single job manager and task manager. I have enabled checkpointing as below: env.setStateBackend(new RocksDBStateBackend(“file:///Users/test/checkpoint", true)); After I stop my job (I also tried to cancel the job using bin/flink cancel -s /Users/test/savepoint <jobId>), I tried to start the same job using… ./standalone-job.sh start-foreground test.jar --job-id <UUID> --job-classname com.test.MyClass --fromSavepoint /Users/test/savepoint But it never restores the state, and always starts afresh. In Flink, I see this: StandaloneCompletedCheckpointStore * {@link CompletedCheckpointStore} for JobManagers running in {@link HighAvailabilityMode#NONE}. public void recover() throws Exception { // Nothing to do Does this have something to do with not being able to restore state? Does this need Zookeeper or K8S HA for functioning? Thanks, Sandeep |
Hi Sandeep, I think it should work fine with `StandaloneCompletedCheckpointStore`. Have you checked if your directory /Users/test/savepoint is being populated in the first place? And if so, if the restarted job is not throwing some exceptions like it can not access those files? Also note, that cancel with savepoint is deprecated and you should be using stop with savepoint [1] Piotrek pt., 26 mar 2021 o 18:55 Sandeep khanzode <[hidden email]> napisał(a):
|
Free forum by Nabble | Edit this page |