Hi,
I am using the Cassandra sink to store data in Cassandra. I saw the lines below in the invoke() method of the Cassandra sink:
if (exception != null) {
    throw new IOException("invoke() failed", exception);
}
This means that if a single insert into Cassandra fails for any reason, the sink throws an exception and the whole pipeline is restarted.
In our pipeline we expect around 1 million inserts per second, so there is a high probability that some record will fail to write (for example during a Cassandra node failure).
Can't we instead create back pressure and slow down the Flink pipeline in such a scenario, for example by capping the number of in-flight writes as in the sketch below?
Or is it simply expected that we always maintain enough Cassandra nodes to handle that amount of load?
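To make clearer what I mean by slowing the pipeline down: I was thinking of something along the lines of the sketch below, where a Semaphore caps the number of in-flight async writes so that invoke() blocks (and Flink's normal back pressure kicks in) instead of the job failing. This is purely my own sketch, not the existing connector; the host, keyspace, table and column names and the cap of 1000 are made up.

import java.util.concurrent.Semaphore;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Hypothetical sink: blocks invoke() once too many async writes are in flight,
// so back pressure propagates upstream instead of the job restarting.
public class ThrottledCassandraSink extends RichSinkFunction<Tuple2<String, String>> {

    private static final int MAX_IN_FLIGHT = 1000; // made-up cap

    private transient Cluster cluster;
    private transient Session session;
    private transient PreparedStatement insert;
    private transient Semaphore permits;
    private transient volatile Throwable asyncError;

    @Override
    public void open(Configuration parameters) {
        cluster = Cluster.builder().addContactPoint("cassandra-host").build();
        session = cluster.connect("my_keyspace");
        insert = session.prepare("INSERT INTO events (id, payload) VALUES (?, ?)");
        permits = new Semaphore(MAX_IN_FLIGHT);
    }

    @Override
    public void invoke(Tuple2<String, String> event) throws Exception {
        if (asyncError != null) {
            throw new RuntimeException("async write failed", asyncError);
        }
        // Blocking here slows the whole pipeline down (back pressure)
        // instead of letting unbounded writes pile up on Cassandra.
        permits.acquire();
        ResultSetFuture future = session.executeAsync(insert.bind(event.f0, event.f1));
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            @Override
            public void onSuccess(ResultSet rs) {
                permits.release();
            }

            @Override
            public void onFailure(Throwable t) {
                // Could retry a few times here before giving up; for now just record the error.
                asyncError = t;
                permits.release();
            }
        }, MoreExecutors.directExecutor());
    }

    @Override
    public void close() {
        if (session != null) { session.close(); }
        if (cluster != null) { cluster.close(); }
    }
}

Blocking in invoke() seems like the natural way to let Flink's back pressure do the throttling while Cassandra is briefly overloaded, rather than failing and restarting. Is this a reasonable direction, or is there a built-in way to do it?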
Can you suggest how to achieve an exactly-once guarantee in this scenario?
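For context, the only approach I can think of is relying on Flink's checkpoints for at-least-once replay and making the Cassandra writes idempotent, so that a replayed record just overwrites the same row. A minimal sketch of what I mean, assuming each record has a deterministic primary key (keyspace, table, column names and values are made up):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class IdempotentWriteExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("cassandra-host").build();
             Session session = cluster.connect("my_keyspace")) {
            // In Cassandra an INSERT is an upsert: writing the same primary key
            // twice just overwrites the row, so replaying a record after a
            // Flink restart does not create a duplicate.
            PreparedStatement insert = session.prepare(
                    "INSERT INTO events (event_id, payload) VALUES (?, ?)");
            insert.setIdempotent(true); // also lets the driver retry safely
            session.execute(insert.bind("event-123", "payload"));
            session.execute(insert.bind("event-123", "payload")); // replay has no extra effect
        }
    }
}

Would checkpointing plus idempotent writes like this be considered effectively exactly-once, or is the write-ahead-log mode of the Cassandra connector the intended way to get that guarantee?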