Hi,
When testing 1.10.0, I found that I must set a savepoint path, otherwise I can't stop the job. I'm confused by this because, as far as I know, a savepoint is often larger than a checkpoint, so I usually resume jobs from a checkpoint. Another problem is that sometimes the job throws an exception and I can't trigger a savepoint, so I cancel the job, change the logic, and resume it from the last checkpoint. Although this sometimes fails, I think it is the better approach, because I can choose whether to cancel with a savepoint and so decide how to resume. But in 1.10.0 I must set the path, and it seems the system will trigger a savepoint. I think this carries more risk, and it deletes the checkpoint even though I set retain on cancellation, so I have no checkpoint left. If I use cancel <jobID>, it fails with an exception. So how should I work with 1.10.0? Any advice would be helpful.
Thanks.
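For readers following along, the two workflows contrasted in the question above can be sketched roughly as the commands below. This is only a dry-run illustration: the `show` helper prints each command instead of running it, and the savepoint directory, checkpoint path, and jar name are hypothetical placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Dry-run sketch: each command is printed, not executed.
# All IDs and paths below are hypothetical placeholders.
FLINK=./bin/flink
JOB_ID="<jobID>"
SAVEPOINT_DIR="hdfs:///flink/savepoints"                    # hypothetical
CHECKPOINT_PATH="hdfs:///flink/checkpoints/<jobID>/chk-42"  # hypothetical
JOB_JAR="my-job.jar"                                        # hypothetical

# Print the arguments as a single space-separated command line.
show() { printf '%s\n' "$*"; }

# 1) Stop with a savepoint. -p names the target directory explicitly;
#    without it, state.savepoints.dir from flink-conf.yaml is used.
show "$FLINK" stop -p "$SAVEPOINT_DIR" "$JOB_ID"

# 2) Cancel without taking a savepoint. A retained (externalized)
#    checkpoint is expected to survive only if retain-on-cancellation
#    is configured for the job.
show "$FLINK" cancel "$JOB_ID"

# 3) Resume from a retained checkpoint (or a savepoint) with -s.
show "$FLINK" run -s "$CHECKPOINT_PATH" "$JOB_JAR"
```

Removing the `show` prefix and filling in real values would turn each line into the actual CLI call; whether step 2 returns cleanly on 1.10.0 is exactly what the replies below discuss.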
|
Hi
I think you could still use ./bin/flink cancel <jobID> to cancel the job. What is the exception thrown?
Best
Yun Tang
From: seeksst <[hidden email]>
Sent: Wednesday, April 22, 2020 18:17
To: user <[hidden email]>
Subject: Flink 1.10.0 stop command
|
Yun Tang <[hidden email]> wrote on Thursday, April 23, 2020 at 1:18 AM:
|
To be precise, the cancel command would succeed on the cluster side, but the response *might* be lost, so the client throws a TimeoutException. If that is the case, this is the root cause, which will be fixed in 1.10.1.
Best,
tison.
tison <[hidden email]> wrote on Thursday, April 23, 2020 at 1:20 AM:
|
Thanks a lot. I'm glad to hear that and am looking forward to 1.10.1. Are there further plans for the stop command? Will it replace cancel in the future? Will state.savepoints.dir be required in the end?
Original message
From: tison <[hidden email]>
To: Yun Tang <[hidden email]>
Cc: seeksst <[hidden email]>; user <[hidden email]>
Sent: Thursday, April 23, 2020, 01:21
Subject: Re: Flink 1.10.0 stop command