Re: Flink - Once and once processing
Posted by
M Singh on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-Once-and-once-processing-tp8206p8253.html
Thanks Till. I will take a look at your pointers. Mans
On Monday, August 1, 2016 6:27 AM, Till Rohrmann <[hidden email]> wrote:
Hi Mans,
Milind is right that, in general, external systems have to play along if you want to achieve exactly-once processing guarantees while writing to them, either by supporting idempotent operations or by allowing their state to be rolled back.
In the batch world, this usually means either completely overwriting the data from a previously failed execution run, or having a unique key which does not change across runs.
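To make the unique-key idea concrete, here is a minimal sketch in plain Java. It does not use any Flink API; KeyedStore and runJob are hypothetical names that stand in for an external system with keyed upserts and for a batch job that may fail partway through. The only point is that a retried run rewrites the same keys and therefore leaves no duplicates.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an external store that supports keyed upserts. */
class KeyedStore {
    private final Map<String, Integer> rows = new LinkedHashMap<>();

    /** Writing the same key twice leaves a single row, so retries are harmless. */
    void upsert(String key, int value) {
        rows.put(key, value);
    }

    Map<String, Integer> rows() {
        return rows;
    }
}

class IdempotentWriteDemo {
    /** Writes the first `limit` records; a limit below the input size simulates a crash. */
    static void runJob(KeyedStore store, List<String> input, int limit) {
        for (int i = 0; i < Math.min(limit, input.size()); i++) {
            String word = input.get(i);
            // The key is derived from the record itself, so it is identical in every run.
            store.upsert(word, word.length());
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("flink", "sink", "checkpoint");
        KeyedStore store = new KeyedStore();
        runJob(store, input, 2);            // first attempt fails after two records
        runJob(store, input, input.size()); // the retry rewrites them and finishes
        System.out.println(store.rows());   // three rows, one per key
    }
}
```

If the key changed between runs (for example, a random ID generated per attempt), the retry would add new rows instead of overwriting the old ones.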
In the case of streaming, we can achieve exactly-once guarantees by buffering the data between checkpoints and committing it to the external system only after we have taken a checkpoint. This guarantees that the changes are only materialized once we are sure that we can go back to a checkpoint where we've already seen all the elements which might have caused the sink output. You can take a look at the CassandraSink, where we do exactly this.
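As a rough illustration of the buffer-then-commit idea, here is a small simulation in plain Java. It is not the CassandraSink code and uses no Flink classes; CheckpointCommittingSink and its method names are hypothetical. It also assumes, as a simplification, that a failure discards everything that has not been committed yet and that the source then replays the elements after the last completed checkpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sink that only commits records once their checkpoint is complete. */
class CheckpointCommittingSink {
    // Stands in for the external system; only committed data ends up here.
    private final List<String> committed = new ArrayList<>();
    // Records received since the last checkpoint was taken.
    private List<String> current = new ArrayList<>();
    // Buffers that belong to checkpoints which are taken but not yet confirmed.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();

    void write(String record) {
        current.add(record);
    }

    /** Called when a checkpoint is taken: seal the current buffer under its ID. */
    void snapshot(long checkpointId) {
        pending.put(checkpointId, current);
        current = new ArrayList<>();
    }

    /** Called once the checkpoint is complete: now it is safe to materialize. */
    void notifyCheckpointComplete(long checkpointId) {
        NavigableMap<Long, List<String>> ready = pending.headMap(checkpointId, true);
        for (List<String> batch : ready.values()) {
            committed.addAll(batch);
        }
        ready.clear(); // the view is backed by `pending`, so this removes the entries
    }

    /** Simulated failure: uncommitted data is dropped and will be replayed. */
    void discardUncommitted() {
        current = new ArrayList<>();
        pending.clear();
    }

    List<String> committed() {
        return committed;
    }
}

class CheckpointCommitDemo {
    public static void main(String[] args) {
        CheckpointCommittingSink sink = new CheckpointCommittingSink();
        sink.write("a");
        sink.write("b");
        sink.snapshot(1L);
        sink.notifyCheckpointComplete(1L); // "a" and "b" become visible

        sink.write("c");
        sink.discardUncommitted();         // failure before checkpoint 2

        sink.write("c");                   // replayed after recovery
        sink.write("d");
        sink.snapshot(2L);
        sink.notifyCheckpointComplete(2L);
        System.out.println(sink.committed()); // "c" shows up only once
    }
}
```

The price of this approach is latency: nothing becomes visible in the external system until the checkpoint that covers it has completed.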
Cheers,
Till