Introduce Barriers in stream source

Darshan Singh
Hi,

I am implementing a source and I want to use checkpointing and would like to restore the job from these external checkpoints. I used Kafka for my tests and it worked fine.

However, I would like to know what I need to do if I have my own source. I am sure that I will need to implement CheckpointedFunction (initializeState and snapshotState). Based on what I read and on the implementation of the Kafka source, I do not need to do anything else apart from this.
Maybe notifyCheckpointComplete as well.

But I would like to confirm whether I need to implement something to create barriers in my source.

Thanks

Re: Introduce Barriers in stream source

Fabian Hueske-2
Hi,

It is sufficient to implement the CheckpointedFunction interface.
Since SourceFunctions emit records in a separate thread, you need to ensure that no record is emitted while the snapshotState method is called.
Flink provides a lock to synchronize data emission and state snapshotting. See the JavaDocs of SourceFunction for details.
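
As a rough illustration of that locking pattern, here is a minimal sketch in plain Java. It assumes a simple counting source. The SourceContext interface below is only a local stand-in that mimics the two methods of Flink's SourceFunction.SourceContext used here (collect and getCheckpointLock), so the sketch compiles without Flink on the classpath. CounterSource, restoreState and CheckpointLockDemo are hypothetical names, and the demo thread that takes the lock only imitates what the runtime does around snapshotState.

```java
// Stand-in for the two methods of Flink's SourceFunction.SourceContext<T>
// used here, so the sketch compiles without Flink on the classpath.
interface SourceContext<T> {
    void collect(T element);
    Object getCheckpointLock();
}

// Hypothetical counting source. In a real job it would implement
// SourceFunction<Long> and CheckpointedFunction instead.
class CounterSource {
    private volatile boolean running = true;
    private long nextValue = 0L; // source state, only touched under the checkpoint lock

    // Mirrors SourceFunction#run: emit a record and advance the state atomically.
    void run(SourceContext<Long> ctx) {
        while (running) {
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(nextValue);
                nextValue++;
            }
            Thread.yield(); // stands in for waiting on input, outside the lock
        }
    }

    void cancel() {
        running = false;
    }

    // Mirrors snapshotState: the caller is assumed to hold the checkpoint lock.
    long snapshotState() {
        return nextValue;
    }

    // Mirrors the restore branch of initializeState.
    void restoreState(long restored) {
        nextValue = restored;
    }
}

class CheckpointLockDemo {
    // Returns true if every snapshot equals the number of records emitted so far.
    static boolean runDemo() {
        final Object lock = new Object();
        final long[] emitted = {0L}; // only read and written under the lock
        SourceContext<Long> ctx = new SourceContext<Long>() {
            @Override public void collect(Long element) { emitted[0]++; }
            @Override public Object getCheckpointLock() { return lock; }
        };
        CounterSource source = new CounterSource();
        Thread emitter = new Thread(() -> source.run(ctx));
        emitter.start();
        boolean consistent = true;
        try {
            for (int i = 0; i < 20; i++) {
                // Imitates the runtime: take the lock, then snapshot the state.
                synchronized (lock) {
                    consistent &= (source.snapshotState() == emitted[0]);
                }
                Thread.sleep(1);
            }
            source.cancel();
            emitter.join();
        } catch (InterruptedException e) {
            source.cancel();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consistent;
    }

    public static void main(String[] args) {
        System.out.println("snapshots consistent: " + runDemo());
    }
}
```

The point of the sketch is that collect and the state update sit in the same synchronized block, so a snapshot taken under the same lock can never observe a record that was emitted without its offset being advanced, or the other way round. The source itself never creates barriers.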

Best, Fabian

In reply to Darshan Singh's message of 2018-08-13 16:42 GMT+02:00.