Sorry for the long delay. Many contributors are traveling due to Flink Forward.
Your use case should be well supported by Flink. Flink partitions the keys and distributes them across all parallel instances of an operator, and it can handle very large state (up to several TBs).
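To make the key distribution a bit more concrete, here is a rough sketch in plain Java (no Flink dependency). It assumes the usual two-step scheme: a key is hashed into one of a fixed number of key groups, and each parallel instance owns a contiguous range of key groups. The class name, the method names, the mixing function, and the values 128 and 4 are made up for illustration; Flink's real code uses its own hash function, so actual assignments will differ.

```java
// Simplified illustration of key -> parallel instance assignment.
// This is NOT Flink's implementation; names and the hash mixer are assumptions.
final class KeyPartitionSketch {

    // Stand-in bit mixer (assumption); Flink applies its own hash on top of hashCode().
    static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return h;
    }

    // Step 1: map a key to one of maxParallelism key groups.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(mix(key.hashCode()), maxParallelism);
    }

    // Step 2: map a key group to the index of the parallel instance that owns it.
    static int subtaskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Count how many of numKeys distinct keys land on each parallel instance.
    static long[] distribute(int numKeys, int maxParallelism, int parallelism) {
        long[] counts = new long[parallelism];
        for (int i = 0; i < numKeys; i++) {
            String key = "key-" + i;
            int group = keyGroupFor(key, maxParallelism);
            counts[subtaskFor(group, maxParallelism, parallelism)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 5M keys, 128 key groups, 4 parallel instances (all illustrative values).
        long[] counts = distribute(5_000_000, 128, 4);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("instance " + i + ": " + counts[i] + " keys");
        }
    }
}
```

The point of the sketch is that a given key always maps to the same instance, so the per-key data tracked in your RichFlatMap only has to live on the one instance that owns the key.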
Best, Fabian
TechnoMage <[hidden email]> wrote on Sat, Apr 7, 2018, 10:57:
I have a use case that I wonder if Flink handles well:
1) 5M+ keys in a KeyedStream
2) Using RichFlatMap to track data on each key
Will Flink spread one operator’s partitions over multiple machines/TaskManagers/JobManagers?