Hi,
Here is a question related to parallelism of keyed-process-function that is applied to the KeyedStream. For some code that looks like this
myStream.keyBy(...)
.process(new MyKeyedProcessFunction())
.process(<someOtherProcessFunction>).setParallelism(10)
On a Flink cluster with 5 TM nodes each with 10 task slots, and Job parallelism = 5, the 5 subtasks of
MyKeyedProcessFunction() do not get distributed across all the 5 TM nodes evenly. Those typically get assigned to one single TM node. However the 50 subtasks of <
someOtherProcessFunction> are always spread evenly across the 5 TMs.
Am I missing something? How can I get those MyKeyedProcessFunction() subtasks distributed across all TM nodes evenly?
Thanks
Arti