Parallel stream partitions

Posted by Nicholas Walton on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Parallel-stream-partitions-tp21576.html

Suppose I have a data stream of tuples <tick: Int, key: Int, value: Double>, with the ticks running 1, 2, 3, … for each separate key.

I understand that keyBy(2) will partition the stream so that every tuple in a partition has the same key. I now have a sequence of functions to apply to the streams, say f(), g() and h(), in that order.
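
To make the partitioning concrete, here is a minimal sketch in plain Java (no Flink dependency) of the idea behind key-based routing: a deterministic function of the key picks the parallel subtask, so every tuple with a given key lands on the same one, in arrival order. The class KeyRouting, the methods subtaskFor and route, and the plain modulo rule are illustrative assumptions only; Flink's real assignment goes through hashed key groups rather than a simple modulo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: not Flink code and not Flink's actual hash function.
class KeyRouting {

    // Hypothetical routing rule: a pure function of the key and the parallelism p.
    static int subtaskFor(int key, int parallelism) {
        return Math.floorMod(Integer.hashCode(key), parallelism);
    }

    // Groups the keys of incoming tuples by the subtask they are routed to,
    // preserving arrival order within each subtask.
    static Map<Integer, List<Integer>> route(int[] keys, int parallelism) {
        Map<Integer, List<Integer>> bySubtask = new TreeMap<>();
        for (int key : keys) {
            bySubtask
                .computeIfAbsent(subtaskFor(key, parallelism), s -> new ArrayList<>())
                .add(key);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 1, 3, 2, 1};
        // With p = 4 and only three distinct keys, at least one subtask receives nothing.
        System.out.println(route(keys, 4));
    }
}
```

Under this model, applying the same rule with the same p before each of f, g and h would send a given key to the same subtask index at every stage. Whether Flink behaves this way across stages is exactly what I am asking about.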

With parallelism set to 1, each partition-stream passes through f, then g, then h (f | g | h) in tick order.

I want to run each partition-stream in parallel, setting parallelism in the Web GUI. 

My question is: how do I ensure that each partition-stream passes through a fixed sequence (f | g | h)? If parallelism is p, I get p instances each of f, g and h, with no guarantee that each partition-stream flows through its own set of three instances in tick order, especially if p is greater than the largest value of key.

A typical use case would be to maintain a moving average over each key.
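
As a rough illustration of why tick order within a key matters for this use case, here is a minimal sketch of a per-key moving average in plain Java (no Flink dependency). The class KeyedMovingAverage, its update method and the windowSize parameter are hypothetical names chosen for this example; in a real job the two maps would presumably be replaced by per-key state inside one of the functions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: plain Java standing in for per-key state, not Flink API.
class KeyedMovingAverage {
    private final int windowSize;
    // One sliding window and one running sum per key.
    private final Map<Integer, Deque<Double>> windows = new HashMap<>();
    private final Map<Integer, Double> sums = new HashMap<>();

    KeyedMovingAverage(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    // Feeds one value for a key and returns the average over the last
    // windowSize values seen for that key (fewer while the window fills).
    double update(int key, double value) {
        Deque<Double> window = windows.computeIfAbsent(key, k -> new ArrayDeque<>());
        double sum = sums.getOrDefault(key, 0.0) + value;
        window.addLast(value);
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // drop the oldest value
        }
        sums.put(key, sum);
        return sum / window.size();
    }

    public static void main(String[] args) {
        KeyedMovingAverage avg = new KeyedMovingAverage(2);
        System.out.println(avg.update(1, 1.0));
        System.out.println(avg.update(2, 10.0));
        System.out.println(avg.update(1, 3.0));
        System.out.println(avg.update(1, 5.0));
    }
}
```

The result for a key depends on the order in which its values arrive, and the window for a key lives in exactly one place. That is why all tuples for a key need to reach the same instance, in tick order.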


I need to remove the crossover in the middle box, so that I get [1] -> [1] -> [1] and [2] -> [2] -> [2], instead of [1] -> [1] -> [1 or 2].

Nick