Hi,
I am trying to test a scenario that triggers a savepoint on a Flink 1.7.1 job deployed in jobmanager HA mode. The purpose is to check whether the savepoint process recovers if the leader jobmanager fails during the savepoint.
During my testing, I found that the new leader jobmanager returns the below error when I query the status of the savepoint trigger request – {"errors":["Operation not found under key: org.apache.flink.runtime.rest.handler.job.AsynchronousJobOperationKey@e287af3"]}
Does Flink support savepoint process recovery in a jobmanager HA setup? If yes, can you please suggest how to find the savepoint request?
Appreciate your time and help.
~ Abhinav Bajaj
Hi Abhinav
If the leader jobmanager fails during a savepoint, that savepoint fails, and the new jobmanager then restores the job from the previous jobgraph with the latest completed checkpoint in the high-availability storage. That is why the new jobmanager does not know anything about the previous savepoint request.
Best
Yun Tang
From: Bajaj, Abhinav <[hidden email]>
Sent: Saturday, July 27, 2019 7:25
To: [hidden email] <[hidden email]>
Subject: Savepoint process recovery in Jobmanager HA setup
Thanks much for your response. I was also suspecting the same and just wanted to confirm.
I guess the best way forward for now is to request savepoint again.
~ Abhi
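For anyone who wants to automate the "request the savepoint again" approach, here is a rough sketch of a client-side retry loop. It uses Flink's REST endpoints `POST /jobs/:jobid/savepoints` and `GET /jobs/:jobid/savepoints/:triggerid`. The function names (`http_json`, `savepoint_with_retrigger`), the retry limits, and the assumption that `base_url` reaches the current leader's REST endpoint are mine for illustration, not anything Flink provides, so treat it as a starting point rather than a tested recipe.

```python
import json
import time
import urllib.error
import urllib.request


def http_json(method, url, body=None):
    """Send a JSON request and return (status_code, parsed_body)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read() or b"{}")
    except urllib.error.HTTPError as err:
        # Flink reports errors as {"errors": [...]} with a non-2xx status.
        raw = err.read()
        try:
            return err.code, json.loads(raw or b"{}")
        except ValueError:
            return err.code, {"errors": [raw.decode("utf-8", "replace")]}


def savepoint_with_retrigger(base_url, job_id, target_dir,
                             request=http_json, max_triggers=3,
                             poll_interval=2.0, sleep=time.sleep):
    """Trigger a savepoint and poll it; trigger again if the request is lost."""
    trigger_url = f"{base_url}/jobs/{job_id}/savepoints"
    for _ in range(max_triggers):
        status, body = request("POST", trigger_url,
                               {"target-directory": target_dir,
                                "cancel-job": False})
        if status >= 300 or "request-id" not in body:
            sleep(poll_interval)  # e.g. the new leader is not ready yet
            continue
        poll_url = f"{trigger_url}/{body['request-id']}"
        while True:
            status, body = request("GET", poll_url)
            if status >= 300:
                # Covers "Operation not found under key": the new leader
                # has no record of this request, so trigger a fresh one.
                break
            if body.get("status", {}).get("id") == "COMPLETED":
                operation = body.get("operation", {})
                if "location" in operation:
                    return operation["location"]
                break  # completed with a failure cause; trigger again
            sleep(poll_interval)
    raise RuntimeError(
        f"savepoint did not complete after {max_triggers} triggers")
```

Usage would look something like `savepoint_with_retrigger("http://<jobmanager-host>:8081", "<job-id>", "<target-directory>")`, with the placeholders filled in for your cluster. The `request` and `sleep` parameters are there only so the loop can be exercised without a running cluster.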