Question about key group / key state & parallelism

Posted by bastien dine on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Question-about-key-group-key-state-parallelism-tp25071.html

Hello everyone,

I have a question regarding keyed state and the parallelism of a process operation.

The documentation says: "You can think of Keyed State as Operator State that has been partitioned, or sharded, with exactly one state-partition per key. Each keyed-state is logically bound to a unique composite of <parallel-operator-instance, key>, and since each key “belongs” to exactly one parallel instance of a keyed operator, we can think of this simply as <operator, key>."

If I have fewer parallel operator instances (say 5) than possible keys (say 10), does it mean that every instance will "manage" the state of 2 keys? (Are the keys spread evenly?)
Is the logical binding fixed? I mean, is the state for a given key always managed by the same instance, or does this depend on which instances are available at the moment?

"During execution each parallel instance of a keyed operator works with the keys for one or more Key Groups."
-> This is related: does "works with the keys" mean always the same keys?
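
To make the question concrete, here is a minimal sketch of the mapping I have in mind. It assumes each key is hashed into one of a fixed number of key groups (bounded by the max parallelism) and that each parallel instance owns a contiguous range of key groups. The function names, the value 128 for the number of key groups, and the CRC32 hash are my own stand-ins for illustration, not Flink's actual code.

```python
import zlib

MAX_PARALLELISM = 128  # assumed number of key groups (illustrative value)
PARALLELISM = 5        # number of parallel operator instances in my example


def key_group_for(key: str, max_parallelism: int = MAX_PARALLELISM) -> int:
    """Map a key to a key group. CRC32 is a stand-in hash, not Flink's."""
    return zlib.crc32(key.encode("utf-8")) % max_parallelism


def operator_index_for(key_group: int, parallelism: int,
                       max_parallelism: int = MAX_PARALLELISM) -> int:
    """Map a key group to the instance owning the range it falls into."""
    return key_group * parallelism // max_parallelism


def assign(keys, parallelism: int = PARALLELISM):
    """Group keys by the index of the instance that would own their state."""
    owners = {}
    for key in keys:
        index = operator_index_for(key_group_for(key), parallelism)
        owners.setdefault(index, []).append(key)
    return owners


if __name__ == "__main__":
    keys = [f"key-{i}" for i in range(10)]
    for index, owned in sorted(assign(keys).items()):
        print(f"instance {index}: {owned}")
```

Under these assumptions the mapping depends only on the key, the number of key groups, and the parallelism, so a key would always land on the same instance for as long as the parallelism stays the same. The key groups themselves are split almost evenly across instances, but 10 keys could still end up unevenly distributed (some instances might get 3 keys and others none), because the hash decides which group each key falls into.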

Best Regards,
Bastien

------------------

Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io