check-pointing does not follow interval setting on some clusters

Yu Yang
Hi all, 

We use Flink 1.9.1 for development and have observed irregular checkpointing intervals in one of our clusters. This is unexpected, since our code calls "env.enableCheckpointing(Time.minutes(5).toMilliseconds)" to set the checkpointing interval to 5 minutes. Intriguingly, the issue appears on only one cluster and not on the other cluster that we have. Any insights on this?

The screenshot below shows the irregular checkpointing intervals, with "env.enableCheckpointing(Time.minutes(5).toMilliseconds)" in our code. We can see that the intervals between checkpoints are longer than 5 minutes.
Screen Shot 2020-02-03 at 12.31.21 AM.png

The screenshot below shows checkpoints being triggered at a 5-minute cadence in the other cluster. The job's checkpointing interval there is also set to 5 minutes.

Screen Shot 2020-02-03 at 12.31.01 AM.png

Thanks!

Regards, 
-Yu

Re: check-pointing does not follow interval setting on some clusters

Fabian Hueske-2
Hi Yu,

This does indeed look strange.
There is a configuration option that limits the number of concurrent checkpoints, but given the end-to-end durations, this cannot be the reason.

Could the JobManager in the first setup be overloaded?
Can you post your complete checkpointing configuration?

Best,
Fabian
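
The checkpointing settings requested above can be read back from the job itself. The following is only a minimal sketch, assuming the Scala DataStream API of Flink 1.9 and the same enableCheckpointing call quoted in the original post; the object name PrintCheckpointConfig is hypothetical, and the real job may set further options that are not shown in this thread.

```scala
import org.apache.flink.api.common.time.Time
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Hypothetical helper object, not part of the original job.
object PrintCheckpointConfig {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Same call as in the original post: 5-minute checkpoint interval.
    env.enableCheckpointing(Time.minutes(5).toMilliseconds)

    // Read back the effective checkpoint settings of this environment.
    val cfg = env.getCheckpointConfig
    println(s"checkpointing mode:          ${cfg.getCheckpointingMode}")
    println(s"interval (ms):               ${cfg.getCheckpointInterval}")
    println(s"min pause between (ms):      ${cfg.getMinPauseBetweenCheckpoints}")
    println(s"max concurrent checkpoints:  ${cfg.getMaxConcurrentCheckpoints}")
    println(s"checkpoint timeout (ms):     ${cfg.getCheckpointTimeout}")
  }
}
```

Running this against the configuration used on each cluster and comparing the printed values is one way to rule out a configuration difference. In particular, a non-zero minimum pause between checkpoints can push the next trigger later than the nominal interval, although nothing in this thread confirms that this is what happens here.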

In reply to Yu Yang's message of Mon, 3 Feb 2020, 09:37.

Re: check-pointing does not follow interval setting on some clusters

Congxian Qiu
Hi,

This is indeed strange. 
From the screenshot, the checkpoints complete very quickly. Could you please share the checkpoint configuration and the JobManager log?

Best,
Congxian

In reply to Fabian Hueske's message of Tue, 4 Feb 2020, 6:48 PM.