http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/How-to-perform-this-join-operation-tp6088p6676.html
Hi Elias,
Samza and Flink operate at slightly different abstraction layers here. The Samza KeyValueStore basically has the interface of a Java HashMap. When accessing keyed state the key is always explicit in the method access and it only allows a simple put and get per key. Flink State, such as ValueState, ListState and ReducingState is implicitly scoped to the key of the input element and it allows different "shapes" of state, for example the ListState uses the efficient merge operation to add to the state instead of a get-update-put cycle that would be required in Samza KeyValueStore.
In code, this is what Samza does:
class Operator {
KeyValueStore<K, V> store = ...
void process(KV<Key, Value> element) {
value = store.get(element.getKey())
...
store.put(element.getKey(), ...)
}
}
while this is what Flink does:
class Operator {
ValueState<V> state = ...
void internalProcess(KV<Key, Value> element) {
state.setKey(element.getKey())
process(element)
}
void process(KV<Key, Value> element) {
value = state.get()
...
state.update(...)
}
}
In Flink we are dealing with the keys internally, which makes it easier for us to implement things like automatic scaling with rebalancing of keyed state (Till is working on this right now). Underneath we have something similar to the KeyValueStore, if you want, you could write a custom operator that deals with these details directly and and handles managing of keys. The thing we don't support right now is iterating over all keys/state for the locally held keys. I'm changing this, however, in this PR:
https://github.com/apache/flink/pull/1957.
Then you can do everything that you can do with the Samza KeyValueStore plus a bit more because we have more specific types of state that exploit features such as merge instead of put-update-get.
I hope this clarifies things a bit. :-)
Cheers,
Aljoscha