Flink Checkpointing in production


Flink Checkpointing in production

hassahma
Hi All, 

We need two clarifications for using Flink 1.6.0. We have Flink jobs handling hundreds of tenants with a sliding window of 24 hours and a slide of 5 minutes.
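For context, a minimal sketch of the topology in question (the event type, timestamps, and the sum aggregation are placeholder assumptions, not our real job):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TenantWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Toy stand-in for the real per-tenant event stream: (tenantId, count).
        DataStream<Tuple2<String, Long>> events = env
                .fromElements(Tuple2.of("tenantA", 1L), Tuple2.of("tenantA", 2L), Tuple2.of("tenantB", 1L))
                .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple2<String, Long>>() {
                    @Override
                    public long extractAscendingTimestamp(Tuple2<String, Long> element) {
                        return System.currentTimeMillis(); // placeholder event timestamps
                    }
                });

        events.keyBy(0) // one window instance per tenant
              .window(SlidingEventTimeWindows.of(Time.hours(24), Time.minutes(5)))
              .sum(1)    // placeholder aggregation
              .print();

        env.execute("Per-tenant 24h/5min sliding window sketch");
    }
}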

1) If checkpointing is enabled and the Flink job crashes in the middle of emitting results to the Kafka producer, what happens to the partial results the producer has already sent out once the job resumes from the last known checkpoint? Do we end up sending duplicates for the same window after recovery, or does the window never get triggered again because we have moved forward in time?

2) If one input event in Flink produces 100 output messages to the Kafka producer after a window expires, how will setFlushOnCheckpoint work in the Kafka producer? I am confused about how it can ensure that all records before the checkpoint have been sent out, given that we create 100 output events from a single input event.

Thanks for the help.

Best Regards,

Re: Flink Checkpointing in production

vino yang
Hi Ahmad,

Answers to your questions are inline:

Ahmad Hassan <[hidden email]> wrote on Wed, Sep 12, 2018 at 4:39 PM:
Hi All, 

We need two clarifications for using Flink 1.6.0. We have Flink jobs handling hundreds of tenants with a sliding window of 24 hours and a slide of 5 minutes.

1) If checkpointing is enabled and the Flink job crashes in the middle of emitting results to the Kafka producer, what happens to the partial results the producer has already sent out once the job resumes from the last known checkpoint? Do we end up sending duplicates for the same window after recovery, or does the window never get triggered again because we have moved forward in time?

Which version of Kafka do you use? Kafka has supported producer transactions since version 0.11. If you use the 0.11 version of the connector and set the checkpointing mode to exactly-once, the entire pipeline will guarantee end-to-end exactly-once consistency.
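For reference, a minimal sketch of wiring this up with the Flink 1.6 Kafka 0.11 connector (broker address, topic name, checkpoint interval, and the stream contents are placeholders):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class ExactlyOnceKafkaSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Exactly-once checkpoints; the interval is an arbitrary example.
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);

        // Stand-in for the real windowed results stream.
        DataStream<String> results = env.fromElements("tenantA:42", "tenantB:7");

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        // Must cover the checkpoint interval plus recovery time, and stay within
        // the broker's transaction.max.timeout.ms, or Kafka aborts the open transaction.
        props.setProperty("transaction.timeout.ms", "900000");

        // EXACTLY_ONCE writes each checkpoint's records inside a Kafka transaction
        // that commits only after the checkpoint completes, so a replayed window
        // cannot produce visible duplicates.
        results.addSink(new FlinkKafkaProducer011<>(
                "results-topic", // placeholder topic
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer011.Semantic.EXACTLY_ONCE));

        env.execute("Exactly-once Kafka sink sketch");
    }
}

Note that downstream consumers must also set isolation.level=read_committed, otherwise they will see records from uncommitted or aborted transactions.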

2) If one input event in Flink produces 100 output messages to the Kafka producer after a window expires, how will setFlushOnCheckpoint work in the Kafka producer? I am confused about how it can ensure that all records before the checkpoint have been sent out, given that we create 100 output events from a single input event.

Following up on the first answer: if your source connector can also guarantee exactly-once semantics (the Kafka consumer can, because it rewinds to the offsets stored in the checkpoint), then the whole process is fine end to end. Flink's checkpoint mechanism takes a globally consistent snapshot of the entire job; it is not just an independent snapshot of a particular operator.
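As a supplement on setFlushOnCheckpoint: it operates on whatever records have reached the sink, so a 1-to-100 fan-out upstream simply means 100 pending records to flush before the checkpoint barrier is acknowledged. A minimal sketch with a pre-0.11 connector (the 0.9 connector here; broker, topic, and stream contents are placeholders):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer09;

public class FlushOnCheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000L); // example interval

        // Stand-in for the windowed results; one input event fanning out to 100
        // records here changes nothing below, only the number of pending records.
        DataStream<String> results = env.fromElements("r1", "r2", "r3");

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker

        FlinkKafkaProducer09<String> producer =
                new FlinkKafkaProducer09<>("results-topic", new SimpleStringSchema(), props);

        // On each checkpoint the sink blocks in its snapshot until every record
        // it has handed to the Kafka client has been acknowledged.
        producer.setFlushOnCheckpoint(true);
        // Fail the job (and restart from the checkpoint) on send errors instead
        // of merely logging them; required for at-least-once.
        producer.setLogFailuresOnly(false);

        results.addSink(producer);
        env.execute("Flush-on-checkpoint sketch");
    }
}

This setup gives at-least-once semantics: after recovery the window's output may be re-emitted, which ties back to the duplicates question above.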

Thanks for the help.

Best Regards,