Re: Fault tolerance guarantees of Elasticsearch sink in flink-elasticsearch2?

Posted by Tzu-Li (Gordon) Tai on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Fault-tolerance-guarantees-of-Elasticsearch-sink-in-flink-elasticsearch2-tp10982p11044.html

Hi Andrew,

Your observations are correct. Like you mentioned, the current problem circles around how we deal with the pending buffered requests with accordance to Flink’s checkpointing.
I’ve filed a JIRA for this, as well as some thoughts for the solution in the description: https://issues.apache.org/jira/browse/FLINK-5487. What do you think?

Thank you for bringing this up! We should probably fix this soon.
There’s already some on-going effort in fixing some other aspects of proper at-least-once support in the Elasticsearch sinks, so I believe this will be brought to attention very soon too.

Cheers,
Gordon




On January 11, 2017 at 3:49:06 PM, Andrew Roberts ([hidden email]) wrote:

I’m trying to understand the guarantees made by Flink’s Elasticsearch sink in terms of message delivery. according to (1), the ES sink offers at-least-once guarantees. This page doesn’t differentiate between flink-elasticsearch and flink-elasticsearch2, so I have to assume for the moment that they both offer that guarantee. However, a look at the code (2) shows that the invoke() method puts the record into a buffer, and then that buffer is flushed to elasticsearch some time later.