Cron style for checkpoint


Cron style for checkpoint

shuwen zhou
Hi Community,
I would like to know whether there is an existing function that supports cron-style checkpoints.
Our case: data traffic is huge at HH:30 of every hour, and we don't want a checkpoint to fall in that time range. A cron-like setting such as 15,45 * * * * for checkpoints would be nice. If a checkpoint is already in progress when the minute reaches 15 or 45, a config value could decide whether to trigger a new checkpoint or skip it.
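
To make the request concrete, here is a minimal sketch of the schedule logic, assuming only the minute field of the cron expression matters. `CronMinuteSchedule` and `OverlapPolicy` are hypothetical names for illustration; neither is part of Flink.

```java
import java.time.LocalDateTime;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not a Flink class. */
final class CronMinuteSchedule {
    private final TreeSet<Integer> minutes = new TreeSet<>();

    CronMinuteSchedule(int... minutesOfHour) {
        for (int m : minutesOfHour) {
            if (m < 0 || m > 59) {
                throw new IllegalArgumentException("minute out of range: " + m);
            }
            minutes.add(m);
        }
        if (minutes.isEmpty()) {
            throw new IllegalArgumentException("at least one minute is required");
        }
    }

    /** Returns the first fire time strictly after {@code now}. */
    LocalDateTime nextFireTime(LocalDateTime now) {
        LocalDateTime hourStart = now.withMinute(0).withSecond(0).withNano(0);
        Integer next = minutes.higher(now.getMinute());
        return next != null
                ? hourStart.withMinute(next)
                : hourStart.plusHours(1).withMinute(minutes.first());
    }

    /** What to do when a fire time arrives while a checkpoint is still running. */
    enum OverlapPolicy { SKIP, TRIGGER_NEW }

    static boolean shouldTrigger(boolean checkpointInProgress, OverlapPolicy policy) {
        return !checkpointInProgress || policy == OverlapPolicy.TRIGGER_NEW;
    }

    public static void main(String[] args) {
        CronMinuteSchedule schedule = new CronMinuteSchedule(15, 45);
        System.out.println(schedule.nextFireTime(LocalDateTime.of(2019, 11, 21, 10, 20)));
        System.out.println(shouldTrigger(true, OverlapPolicy.SKIP));
    }
}
```

The `OverlapPolicy` value stands in for the config value mentioned above: `SKIP` passes over a fire time while a checkpoint is running, `TRIGGER_NEW` starts another one anyway.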

--
Best Wishes,


Re: Cron style for checkpoint

bupt_ljy

Hi Shuwen,


As far as I know, Flink only supports checkpoints with a fixed interval.

However, I think a flexible mechanism for triggering checkpoints is worth working on, at least from my perspective, and it need not be limited to cron style. In our business scenario, data traffic usually reaches its daily peak after 20:00, and we want to increase the checkpoint interval during that period; otherwise checkpointing introduces more disk and network IO.
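
A minimal sketch of such a time-dependent interval, assuming the peak lasts from 20:00 until midnight. `PeakAwareInterval` is a hypothetical name, and the 10 and 30 minute values are made-up examples; nothing here is an existing Flink API.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Hypothetical policy object for illustration; not a Flink class. */
final class PeakAwareInterval {
    private final LocalTime peakStart;
    private final Duration normalInterval;
    private final Duration peakInterval;

    PeakAwareInterval(LocalTime peakStart, Duration normalInterval, Duration peakInterval) {
        this.peakStart = peakStart;
        this.normalInterval = normalInterval;
        this.peakInterval = peakInterval;
    }

    /** Interval to wait before the next checkpoint; the peak is assumed to end at midnight. */
    Duration intervalAt(LocalTime now) {
        return now.isBefore(peakStart) ? normalInterval : peakInterval;
    }

    public static void main(String[] args) {
        PeakAwareInterval policy = new PeakAwareInterval(
                LocalTime.of(20, 0), Duration.ofMinutes(10), Duration.ofMinutes(30));
        System.out.println(policy.intervalAt(LocalTime.of(9, 0)));
        System.out.println(policy.intervalAt(LocalTime.of(21, 0)));
    }
}
```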


Just want to share something about this :)



Best,

Jiayi Liao





 


Re: Cron style for checkpoint

shuwen zhou
Hi Jiayi,
It would be great if Flink offered a user-defined interface that users could implement to control checkpoint behavior, at least time-related behavior.
I filed a wish on JIRA [1]; hopefully it describes the idea clearly enough.



--
Best Wishes,


Re: Cron style for checkpoint

Congxian Qiu
Hi

Currently, Flink does not support such a feature. From what you describe, would setting an appropriate checkpoint timeout solve your problem?

Best,
Congxian




Re: Cron style for checkpoint

Yun Tang

Hi Shuwen

 

Conceptually, checkpoints in Flink behave more like a system mechanism for achieving fault tolerance, and they are transparent to users. Savepoints, on the other hand, behave more like a user-controlled operation. Couldn't savepoints satisfy your crontab-style requirement?

 

Best

Yun Tang

 



Re: Cron style for checkpoint

shuwen zhou
Hi Yun and Congxian,
What I actually want is for checkpoints not to be triggered during a certain time range. Checkpointing would remain a system mechanism; it would just avoid being triggered in that range.
Waiting for the checkpoint to time out still wastes CPU and disk IO, because the checkpoint has already been triggered. I would like to prevent it from being triggered in the first place.
I don't think a cron style would break checkpointing as a system mechanism.
A savepoint, on the other hand, is not an incremental update. Triggering a savepoint every 10 minutes would waste a lot of disk space, and another script would be required to remove outdated savepoints. I think savepoints are meant for upgrade/restart scenarios.
A cron-style checkpoint time config would provide a lot of flexibility. Thanks.
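
To illustrate the extra housekeeping that periodic savepoints would require, here is a rough sketch of the retention step such a cleanup script would need. `SavepointRetention`, the savepoint names, and the "keep the N newest" rule are all assumptions for illustration; the actual deletion of directories is left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical retention helper for illustration; not a Flink class. */
final class SavepointRetention {
    /** Returns the names of all savepoints except the {@code keep} most recent ones, newest first. */
    static List<String> outdated(Map<String, Instant> createdAt, int keep) {
        if (keep < 0) {
            throw new IllegalArgumentException("keep must be >= 0");
        }
        List<Map.Entry<String, Instant>> entries = new ArrayList<>(createdAt.entrySet());
        // Sort by creation time, newest first.
        entries.sort(Map.Entry.<String, Instant>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = keep; i < entries.size(); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2019-11-21T10:00:00Z");
        Map<String, Instant> savepoints = Map.of(
                "sp-1", start,
                "sp-2", start.plusSeconds(600),
                "sp-3", start.plusSeconds(1200));
        System.out.println(outdated(savepoints, 1));
    }
}
```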



--
Best Wishes,


Re: Cron style for checkpoint

Congxian Qiu
Hi

Thanks for your explanation. What you want is to disable periodic checkpoints during certain time ranges while letting them run as normal at other times. Currently, Flink does not support this. Since you've created an issue for it, we can track it there. For now, if you really need this, you can change the logic in `CheckpointCoordinator#triggerCheckpoint`.
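
A rough sketch of the kind of guard such a change could consult before triggering. This is not Flink code and does not touch `CheckpointCoordinator` internals, which differ between versions. `BlackoutGuard` is a hypothetical name, and the 25 to 35 minute window around HH:30 is an assumed example.

```java
import java.time.LocalTime;
import java.util.function.Supplier;

/** Hypothetical guard for illustration; Flink itself offers no such hook. */
final class BlackoutGuard {
    private final int startMinute; // inclusive, 0-59
    private final int endMinute;   // exclusive, 0-59

    BlackoutGuard(int startMinute, int endMinute) {
        if (startMinute < 0 || startMinute > 59 || endMinute < 0 || endMinute > 59) {
            throw new IllegalArgumentException("minutes must be in 0..59");
        }
        this.startMinute = startMinute;
        this.endMinute = endMinute;
    }

    /** True if the minute of {@code now} lies inside the blackout window. */
    boolean isBlocked(LocalTime now) {
        int m = now.getMinute();
        return startMinute <= endMinute
                ? m >= startMinute && m < endMinute
                : m >= startMinute || m < endMinute; // window wraps past HH:59
    }

    /** Wraps a periodic trigger so that it does nothing inside the window. */
    Runnable guard(Runnable trigger, Supplier<LocalTime> clock) {
        return () -> {
            if (!isBlocked(clock.get())) {
                trigger.run();
            }
        };
    }

    public static void main(String[] args) {
        BlackoutGuard guard = new BlackoutGuard(25, 35); // assumed window around HH:30
        Runnable trigger = guard.guard(
                () -> System.out.println("checkpoint triggered"), LocalTime::now);
        trigger.run(); // prints only when the current minute is outside 25..34
    }
}
```

Passing the clock as a `Supplier` keeps the guard testable with fixed times; the periodic schedule itself stays unchanged and simply becomes a no-op inside the window.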

Best,
Congxian

