Re: Checkpoint fail due to timeout

Posted by Alexey Trenikhun on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Checkpoint-fail-due-to-timeout-tp42125p42338.html

According to [1], checkpoints do not support Flink-specific features like rescaling, but I can try it. Thank you for the suggestions.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#difference-to-savepoints

From: ChangZhuo Chen (陳昌倬)
Sent: Wednesday, March 17, 2021 12:29 AM
To: Alexey Trenikhun
Cc: [hidden email]; Flink User Mail List
Subject: Re: Checkpoint fail due to timeout

On Wed, Mar 17, 2021 at 05:45:38AM +0000, Alexey Trenikhun wrote:
> In my opinion it looks similar. Were you able to tune Flink to make it work? I'm stuck with it: I wanted to scale up, hoping to reduce backpressure, but to rescale I need to take a savepoint, which never completes (or at least takes longer than 3 hours).

You can use an aligned checkpoint to scale your job. Just restarting from
the checkpoint with the same jar file and the new parallelism should do the
trick.
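
As a rough illustration of the restart described above, here is a minimal sketch that only assembles the `flink run` command line; it does not run anything. It assumes the checkpoint was retained (externalized) when the job stopped and that the `flink` client is on the PATH. The helper name `build_rescale_command`, the checkpoint path, and the jar name are hypothetical placeholders, not values from this thread.

```python
from typing import List


def build_rescale_command(checkpoint_path: str, parallelism: int, jar_path: str) -> List[str]:
    """Assemble a `flink run` invocation that resumes from a retained checkpoint.

    `-s` points the client at the state to restore (a savepoint or a retained
    checkpoint directory) and `-p` sets the new default parallelism.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be a positive integer")
    return ["flink", "run", "-s", checkpoint_path, "-p", str(parallelism), jar_path]


if __name__ == "__main__":
    # Hypothetical values: substitute your own checkpoint directory and jar.
    cmd = build_rescale_command(
        "hdfs:///flink/checkpoints/<job-id>/chk-42",  # placeholder path
        8,
        "my-job.jar",
    )
    print(" ".join(cmd))
```

The command is returned as an argument list rather than a single string so it can be handed to a process launcher without shell quoting problems, if you choose to execute it.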


--
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B