How to achieve Exactly-once guarantees in Cassandra Sink

Ranjit
Hi,
I am using the Cassandra Sink to store data in Cassandra. I saw the following lines of code in the invoke() method of the Cassandra sink:

if (exception != null) {
        throw new IOException("invoke() failed", exception);
}


This means that if a single insert to Cassandra fails for any reason, the sink throws an exception and the pipeline restarts.
Our pipeline is expected to handle 1 million inserts per second, so there is a high chance that at least one record fails to be written (or that a Cassandra node fails).
Can't we create back pressure and slow down the Flink pipeline in such a scenario?
Or is it expected that we always maintain enough Cassandra nodes to handle that amount of load?
Can you suggest how to achieve an exactly-once guarantee in this scenario?
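
To make the back-pressure idea concrete, here is a minimal sketch of what I have in mind. It is not the Cassandra sink's actual code or API: BoundedAsyncWriter, maxInFlight and asyncWrite are hypothetical names, and asyncWrite is only a stand-in for an asynchronous Cassandra insert. A semaphore caps the number of pending writes, so invoke() blocks (and would slow the upstream operators) once the cap is reached. A failure is still reported on the next invoke(), as in the snippet above.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical sketch, not Flink's CassandraSink: caps pending async writes.
class BoundedAsyncWriter<T> {
    private final int maxInFlight;
    private final Semaphore permits;
    // Stand-in for an asynchronous Cassandra insert (assumption).
    private final Function<T, CompletableFuture<Void>> asyncWrite;
    // First failure seen by a write callback, checked on the next invoke().
    private final AtomicReference<Throwable> failure = new AtomicReference<>();

    BoundedAsyncWriter(int maxInFlight, Function<T, CompletableFuture<Void>> asyncWrite) {
        this.maxInFlight = maxInFlight;
        this.permits = new Semaphore(maxInFlight);
        this.asyncWrite = asyncWrite;
    }

    public void invoke(T value) throws IOException, InterruptedException {
        Throwable error = failure.get();
        if (error != null) {
            throw new IOException("invoke() failed", error);
        }
        // Blocks the caller while maxInFlight writes are pending: this is the back pressure.
        permits.acquire();
        CompletableFuture<Void> pending;
        try {
            pending = asyncWrite.apply(value);
        } catch (RuntimeException e) {
            permits.release();
            throw new IOException("invoke() failed", e);
        }
        pending.whenComplete((ignored, err) -> {
            if (err != null) {
                failure.compareAndSet(null, err);
            }
            // Free the slot whether the write succeeded or failed.
            permits.release();
        });
    }

    // Number of writes started but not yet completed.
    public int inFlight() {
        return maxInFlight - permits.availablePermits();
    }
}
```

This only limits the write rate; it does not make the writes exactly-once by itself. After a restart, records since the last checkpoint would be replayed, so the inserts would also have to be idempotent (for example, writing the same primary key with the same values).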