A question regarding to the checkpoint mechanism

A question regarding to the checkpoint mechanism

Li Wang-2
Hi all,

As far as I know, a stateful operator checkpoints its current state to persistent storage once it has received the barriers from all of its upstream operators. My question is: does the operator doing the checkpoint need to pause processing the input tuples of the next batch until the checkpoint is done? If yes, will that introduce significant processing latency when the state is large? If no, does the operator state need to be immutable?

Thanks,
Li

Re: A question regarding to the checkpoint mechanism

Li Wang-2
Hi All,

Any feedback is highly appreciated.

Thanks.
Li



Re: A question regarding to the checkpoint mechanism

Tzu-Li (Gordon) Tai
Hi!

No, the operator does not need to pause processing input records while the checkpointing of its state is in progress.
The checkpointing of operator state is asynchronous. The operator state does not need to be immutable either, since what gets checkpointed is a copy (a snapshot) of the state, not the live state itself.
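
To illustrate the idea, here is a minimal sketch in plain Java. It is a toy model and not Flink code: the class `AsyncSnapshotSketch`, its methods, and the in-memory `storage` map standing in for persistent storage are all hypothetical names made up for this example. It only shows the pattern described above: take a copy of the state synchronously, persist the copy in the background, and keep mutating the live state in the meantime.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of asynchronous checkpointing (hypothetical, not a Flink class). */
class AsyncSnapshotSketch {
    // Live, mutable operator state: word -> count.
    private final Map<String, Long> counts = new HashMap<>();
    // Assumed stand-in for persistent storage, keyed by checkpoint id.
    private final Map<Long, Map<String, Long>> storage = new ConcurrentHashMap<>();

    /** Normal record processing; mutates the live state. */
    public void process(String word) {
        counts.merge(word, 1L, Long::sum);
    }

    /** Copies the state on the calling thread, then persists the copy in the background. */
    public CompletableFuture<Void> triggerCheckpoint(long checkpointId) {
        Map<String, Long> snapshot = new HashMap<>(counts); // the only synchronous step
        return CompletableFuture.runAsync(() -> storage.put(checkpointId, snapshot));
    }

    public Map<String, Long> liveState() {
        return counts;
    }

    public Map<String, Long> persisted(long checkpointId) {
        return storage.get(checkpointId);
    }

    public static void main(String[] args) {
        AsyncSnapshotSketch op = new AsyncSnapshotSketch();
        op.process("a");
        op.process("b");
        CompletableFuture<Void> write = op.triggerCheckpoint(1L);
        op.process("a"); // processing continues while the write may still be in flight
        write.join();    // wait here only so the demo can print the result
        System.out.println("checkpoint 1: " + op.persisted(1L));
        System.out.println("live state:   " + op.liveState());
    }
}
```

In this sketch the checkpoint holds a count of 1 for "a" while the live state has already moved on to 2, which is why the live state can stay mutable. How cheaply the copy can be taken depends on the state backend, so this should not be read as a description of Flink's internals.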

Regards,
Gordon




Re: A question regarding to the checkpoint mechanism

Li Wang-2
Hi Gordon,

Thanks for your prompt reply.
So do you mean that when we are about to checkpoint the state of an operator, we first copy its state and then checkpoint the copy while the operator continues processing?

Thanks,
Li




Re: A question regarding to the checkpoint mechanism

Tzu-Li (Gordon) Tai
Users don’t need to explicitly make a copy of the state. Take checkpointing instance fields as operator state, for example [1].
You simply return your current state in `snapshotState()`, and Flink takes care of snapshotting it and persisting it to the state backend.
The persisting step does not block the processing of input records if you implement the `CheckpointedAsynchronously` interface (which is usually the more desirable case).
The same goes for key-partitioned state.
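
As a rough sketch of the "return your current state in `snapshotState()`" pattern, the example below is self-contained plain Java. The interface `SnapshotableSketch` is a hypothetical stand-in written for this example and is not Flink's actual type; the real interfaces and their exact signatures are the ones in the linked documentation [1]. The names `CounterSum` and `SnapshotStateDemo` are likewise only illustrative.

```java
import java.io.Serializable;

/** Hypothetical stand-in for a checkpointing interface; NOT Flink's actual type. */
interface SnapshotableSketch<T extends Serializable> {
    T snapshotState(long checkpointId, long checkpointTimestamp);

    void restoreState(T state);
}

/** Sums two values and counts how often it was called; the counter is the operator state. */
class CounterSum implements SnapshotableSketch<Long> {
    private long counter = 0; // instance field used as operator state

    public long reduce(long a, long b) {
        counter++;
        return a + b;
    }

    @Override
    public Long snapshotState(long checkpointId, long checkpointTimestamp) {
        return counter; // boxed to an immutable Long, so later updates cannot change it
    }

    @Override
    public void restoreState(Long state) {
        counter = state;
    }

    public long counter() {
        return counter;
    }
}

class SnapshotStateDemo {
    public static void main(String[] args) {
        CounterSum op = new CounterSum();
        op.reduce(1, 2);
        op.reduce(3, 4);
        Long checkpoint = op.snapshotState(1L, System.currentTimeMillis());
        op.reduce(5, 6); // the live state moves on after the snapshot

        // Simulated recovery: a fresh instance is restored from the snapshot.
        CounterSum recovered = new CounterSum();
        recovered.restoreState(checkpoint);
        System.out.println("restored counter = " + recovered.counter()); // prints 2
    }
}
```

Here the returned value is an immutable `Long`, so no explicit copy is needed. If the state were a mutable object and persistence ran asynchronously, the returned object would presumably have to stay unchanged after `snapshotState()` returns.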

Best Regards,
Gordon

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/state.html#checkpointing-instance-fields
