Shuffling between map and keyBy operator

Marchant, Hayden
I have a streaming application that has a keyBy operator followed by an operator working on the keyed values (a custom sum operator). If the map operator and the aggregate operator are running on the same TaskManager, will Flink always serialize and deserialize the tuples, or is there an optimization in this case due to 'locality'?

(I was planning on deploying my Flink Streaming application to a single 'big' node in the hope that I can reduce latency by saving on both network and serde.)


Thanks,
Hayden Marchant


Re: Shuffling between map and keyBy operator

Kurt Young
Hi Marchant,

I'm afraid that the serde cost still exists even if both operators run in the same TaskManager.

Best,
Kurt
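
For illustration only, the round trip discussed in this thread can be sketched in plain Java. This is not Flink code: the class `SerdeRoundTrip`, the record type `KeyedValue`, and the use of `DataOutputStream` are all assumptions standing in for whatever serializer Flink actually picks for the tuple type. The sketch only shows what "serde cost" means here: the record is written out as bytes and rebuilt as a new object on the receiving side, even though sender and receiver share one JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SerdeRoundTrip {

    /** Hypothetical stand-in for a (key, value) tuple flowing through the job. */
    public static final class KeyedValue {
        public final String key;
        public final long value;

        public KeyedValue(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Sender side: turn the record into bytes (assumed wire format, not Flink's). */
    public static byte[] serialize(KeyedValue kv) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(kv.key);
            out.writeLong(kv.value);
        }
        return bytes.toByteArray();
    }

    /** Receiver side: rebuild a fresh object from the bytes. */
    public static KeyedValue deserialize(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new KeyedValue(in.readUTF(), in.readLong());
        }
    }

    public static void main(String[] args) throws IOException {
        KeyedValue sent = new KeyedValue("sensor-1", 42L);
        KeyedValue received = deserialize(serialize(sent));

        // Same contents, but a different object: the copy is the cost being asked about.
        System.out.println(received.key + " " + received.value);
        System.out.println("distinct object: " + (received != sent));
    }
}
```

Timing `serialize` and `deserialize` in a loop over a representative record would give a rough idea of how much of the latency budget this step could take, though the numbers would not carry over directly to Flink's own serializers.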

On Tue, Sep 5, 2017 at 9:26 PM, Marchant, Hayden <[hidden email]> wrote:
I have a streaming application that has a keyBy operator followed by an operator working on the keyed values (a custom sum operator). If the map operator and the aggregate operator are running on the same TaskManager, will Flink always serialize and deserialize the tuples, or is there an optimization in this case due to 'locality'?

(I was planning on deploying my Flink Streaming application to a single 'big' node in the hope that I can reduce latency by saving on both network and serde.)


Thanks,
Hayden Marchant