http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/State-in-external-db-dynamodb-tp5939p5967.html
Hi Shannon!
Welcome to the Flink community!
You are right: sinks in general need to be idempotent if you want "exactly-once" semantics, because a failure can cause a replay of elements that were already written.
However, what you describe later, overwriting a key with a new value (or the same value again), is pretty much sufficient: when a duplicate write happens during a replay, the value for the key is simply overwritten with the same value again.
As long as all computation happens purely in Flink and you only write to the key/value store (rather than read from the k/v store, modify in Flink, and write back), you get the guarantee that, for example, counts/aggregates never contain duplicates. A rough sketch of such an idempotent sink is below.
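To make that concrete, here is a minimal sketch in Java. The KeyValueStore interface is a made-up placeholder for whatever client you actually use (DynamoDB in your case), not a real API, and the sink is only an illustration of the idea, not production code:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Hypothetical stand-in for a DynamoDB-style client; not a real API.
interface KeyValueStore extends AutoCloseable {
    void put(String key, long value);
}

public abstract class IdempotentKvSink extends RichSinkFunction<Tuple2<String, Long>> {

    private transient KeyValueStore store;

    // subclasses create the concrete client (DynamoDB, Cassandra, ...)
    protected abstract KeyValueStore createStore();

    @Override
    public void open(Configuration parameters) {
        store = createStore();
    }

    @Override
    public void invoke(Tuple2<String, Long> record) throws Exception {
        // an unconditional put overwrites any existing value for the key,
        // so a replay after a failure just writes the same value again
        store.put(record.f0, record.f1);
    }

    @Override
    public void close() throws Exception {
        if (store != null) {
            store.close();
        }
    }
}

The important property is that put() is an unconditional upsert: writing the same key/value pair twice leaves the store in exactly the same state as writing it once.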
If Flink needs to look up state from the database (because it is no longer in Flink), it gets a bit more tricky. I assume that is where you are going with "Subsequently, when an event is processed, we must be able to quickly load up any evicted state". In that case, there are two things you can do:
(1) Only write to the DB upon a checkpoint, at which point it is known that no replay of that data will occur any more. Values from partially successful writes will be overwritten with the correct value. I assume that is what you had in mind when referring to the state backend, because in some sense that is what such a state backend would do.
I think it is simpler to realize this in a custom sink than to develop a new state backend. Another Flink committer (Chesnay) has developed some nice tooling for that, which should be merged into Flink soon. A rough sketch of the pattern follows below.
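Here is a simplified sketch of that write-on-checkpoint pattern, reusing the hypothetical KeyValueStore interface from above and Flink's CheckpointedFunction and CheckpointListener interfaces. A complete implementation (like the tooling mentioned above) would also have to put the buffered writes into Flink's managed state inside snapshotState(), so they survive a failure; that part is omitted here:

import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.CheckpointListener;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public abstract class CheckpointAlignedKvSink extends RichSinkFunction<Tuple2<String, Long>>
        implements CheckpointedFunction, CheckpointListener {

    // writes buffered since the last checkpoint barrier
    private transient Map<String, Long> pending;
    // buffered writes per checkpoint id, flushed once the checkpoint completes
    private transient SortedMap<Long, Map<String, Long>> perCheckpoint;
    private transient KeyValueStore store;

    protected abstract KeyValueStore createStore();

    @Override
    public void initializeState(FunctionInitializationContext context) {
        pending = new HashMap<>();
        perCheckpoint = new TreeMap<>();
        // omitted: restore and checkpoint the buffers via managed state
    }

    @Override
    public void open(Configuration parameters) {
        store = createStore();
    }

    @Override
    public void invoke(Tuple2<String, Long> record) {
        pending.put(record.f0, record.f1); // buffer only, no DB write yet
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) {
        // tie the writes buffered so far to this checkpoint
        perCheckpoint.put(context.getCheckpointId(), pending);
        pending = new HashMap<>();
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) throws Exception {
        // the checkpoint is complete, so these records will not be replayed
        // any more; it is now safe to write them to the database
        SortedMap<Long, Map<String, Long>> committed =
                perCheckpoint.headMap(checkpointId + 1);
        for (Map<String, Long> writes : committed.values()) {
            for (Map.Entry<String, Long> kv : writes.entrySet()) {
                store.put(kv.getKey(), kv.getValue());
            }
        }
        committed.clear();
    }
}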
(2) You could attach version numbers to every write and increment the version upon each checkpoint. That way you can always refer to a consistent previous value, even if some writes were made but a failure occurred before the checkpoint completed. See the sketch below.
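A rough sketch of that versioning idea, again against a hypothetical store interface (the getOrNull lookup is made up as well). Readers only ever see values whose version was published by a completed checkpoint:

// Extends the hypothetical store with a read; again, not a real API.
interface VersionedStore extends KeyValueStore {
    Long getOrNull(String key); // null if the key does not exist
}

public class VersionedKvWriter {

    private static final String COMMITTED_MARKER = "__committed_version__";

    private final VersionedStore store;
    private long version; // one "epoch" per checkpoint interval

    public VersionedKvWriter(VersionedStore store) {
        this.store = store;
        Long committed = store.getOrNull(COMMITTED_MARKER);
        this.version = (committed == null) ? 0L : committed + 1;
    }

    public void write(String key, long value) {
        // never overwrite: each epoch writes under its own composite key
        store.put(key + "@" + version, value);
    }

    public void onCheckpointComplete() {
        // publish the current epoch as committed, then start a new one;
        // an epoch whose checkpoint failed is simply never published
        store.put(COMMITTED_MARKER, version);
        version++;
    }

    public Long read(String key) {
        Long committed = store.getOrNull(COMMITTED_MARKER);
        if (committed == null) {
            return null;
        }
        // walk back from the committed epoch to the latest value that is
        // part of a completed checkpoint, ignoring uncommitted writes
        for (long v = committed; v >= 0; v--) {
            Long value = store.getOrNull(key + "@" + v);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}

The linear walk backwards in read() is of course naive; with a real schema you would use a range query over the composite key, and you would eventually garbage-collect old versions.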
I hope these answers apply to your case. Let us know if some things are still unclear, or if I misunderstood your question!
Greetings,
Stephan