Re: Dynamic configuration of Flink checkpoint interval

Posted by Kai Fu on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Dynamic-configuration-of-Flink-checkpoint-interval-tp44059p44066.html

Hi JING,

Here is the issue link: https://issues.apache.org/jira/browse/FLINK-22805

On Mon, May 31, 2021 at 10:21 AM JING ZHANG <[hidden email]> wrote:
Hi Kai,

Happy to hear that. 
Would you please paste the JIRA link in this thread after you create it? It may help other users who run into the same problem. Thanks very much.

Best regards,
JING ZHANG

On Sun, May 30, 2021 at 11:19 PM Kai Fu <[hidden email]> wrote:
Hi Jing,

Yup, what you're describing is what I want. I also tried the approach you suggested and it works. I'm going to take that approach for the moment and create a Jira issue for this feature.

On Sun, May 30, 2021 at 8:57 PM JING ZHANG <[hidden email]> wrote:
Hi Kai,

Are you trying to find a way to hot-update the checkpoint interval, or to disable/enable checkpointing, without stopping and restarting the job?
Unfortunately, that is not supported yet, AFAIK.
You're very welcome to create an issue and describe your needs here (Flink’s Jira).
For now, you may want to use the following temporary workaround:
  1. Set a larger checkpoint interval and start your job.
  2. Take a savepoint once the cold start has completed.
  3. Set the checkpoint interval back to its normal value and restart the job from that savepoint.
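
One way to make the steps above less painful is to read the checkpoint interval from the job's command-line arguments, so the same jar can be started with a large interval for the cold start and restarted from the savepoint with the normal one. The following is only a minimal sketch under that assumption: the class name CheckpointIntervalArgs, the flag --checkpoint-interval-ms, and the 60-second default are all hypothetical and not part of Flink. In a real job the resolved value would be handed to Flink, for example via env.enableCheckpointing(intervalMs) on the StreamExecutionEnvironment.

```java
// Hypothetical helper, not part of Flink: resolves the checkpoint interval
// from command-line arguments so one job jar can be launched with different
// intervals for the cold start phase and for normal operation.
final class CheckpointIntervalArgs {
    // Assumed flag name; pick whatever fits your job.
    static final String FLAG = "--checkpoint-interval-ms";

    private CheckpointIntervalArgs() {}

    // Returns the value following FLAG, or defaultMs if the flag is absent
    // or has no value after it.
    static long resolveIntervalMs(String[] args, long defaultMs) {
        for (int i = 0; i < args.length - 1; i++) {
            if (FLAG.equals(args[i])) {
                long value = Long.parseLong(args[i + 1]);
                if (value <= 0) {
                    throw new IllegalArgumentException(
                            "checkpoint interval must be positive: " + value);
                }
                return value;
            }
        }
        return defaultMs;
    }

    public static void main(String[] args) {
        // Assumed default of 60 seconds for normal operation.
        long intervalMs = resolveIntervalMs(args, 60_000L);
        // In a real Flink job, this is where the value would be applied, e.g.
        //   env.enableCheckpointing(intervalMs);
        System.out.println("checkpoint interval (ms): " + intervalMs);
    }
}
```

With something like this in place, the cold start run could pass a large value such as --checkpoint-interval-ms 3600000, and the restart from the savepoint (for example with flink run -s <savepointPath>) could omit the flag to fall back to the normal interval.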

Best regards,
JING ZHANG

On Sun, May 30, 2021 at 7:13 PM Kai Fu <[hidden email]> wrote:
Hi team,

We want to know whether Flink supports dynamic configuration of the checkpoint interval. Our use case has a cold start phase in which the entire dataset is replayed from the beginning up to the most recent records.

In the cold start phase, resources are fully utilized and backpressure is high on all upstream operators, so checkpoints constantly time out. Real production traffic is far lower than that, and the currently provisioned resources can handle it.

We're wondering whether Flink could support dynamic checkpoint configuration, so that checkpointing is bypassed or made less frequent during the cold start phase to speed it up, and returns to normal once the cold start is complete.

--
Best wishes,
- Kai