Re: Making job fail on Checkpoint Expired?
Posted by Timo Walther
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Making-job-fail-on-Checkpoint-Expired-tp34051p34053.html
Hi Robin,
This is a very good observation, and the behavior may even be unintended.
Arvid (in CC) is probably more familiar with the checkpointing code.
Regards,
Timo
On 02.04.20 15:37, Robin Cassan wrote:
> Hi all,
>
> I am wondering if there is a way to make a Flink job fail (not cancel
> it) when one or several checkpoints have failed due to being expired
> (taking longer than the timeout)?
> I am using Flink 1.9.2 and have set
> `setTolerableCheckpointFailureNumber(1)` which doesn't do the trick.
> Looking into the CheckpointFailureManager.java class, it looks like this
> only works when the checkpoint failure reason is
> `CHECKPOINT_DECLINED`, but the number of failures isn't incremented on
> `CHECKPOINT_EXPIRED`.
> Am I missing something?
>
> Thanks!
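
To make the reported behavior concrete, here is a minimal, self-contained Java sketch of the counting logic as Robin describes it. It is not Flink's actual CheckpointFailureManager code: the class `CheckpointFailureModel`, the method `jobWouldFail`, and the `countExpired` flag are hypothetical names introduced for illustration only. The sketch also assumes that the job fails once the number of counted failures exceeds the tolerable number, and it ignores successful checkpoints in between.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of the behavior described in this thread.
// It does NOT reproduce Flink's real CheckpointFailureManager.
public class CheckpointFailureModel {

    // Only the two failure reasons mentioned in the thread are modeled.
    public enum FailureReason { CHECKPOINT_DECLINED, CHECKPOINT_EXPIRED }

    // Returns true if the job would be failed, assuming it fails as soon as
    // the counted failures exceed `tolerable`. When `countExpired` is false,
    // expired checkpoints are ignored, which matches the reported behavior.
    public static boolean jobWouldFail(List<FailureReason> failures,
                                       int tolerable,
                                       boolean countExpired) {
        int counted = 0;
        for (FailureReason reason : failures) {
            boolean counts = reason == FailureReason.CHECKPOINT_DECLINED
                    || (countExpired && reason == FailureReason.CHECKPOINT_EXPIRED);
            if (counts) {
                counted++;
            }
            if (counted > tolerable) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<FailureReason> twoExpired = Arrays.asList(
                FailureReason.CHECKPOINT_EXPIRED,
                FailureReason.CHECKPOINT_EXPIRED);

        // Expired checkpoints not counted: the job keeps running.
        System.out.println(jobWouldFail(twoExpired, 1, false)); // prints "false"

        // Expired checkpoints counted: two failures exceed the tolerance of 1.
        System.out.println(jobWouldFail(twoExpired, 1, true));  // prints "true"
    }
}
```

With `countExpired` set to false, no number of expired checkpoints ever trips the threshold, which is the situation the question is about.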