reduceGroup

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

reduceGroup

Mary m
Hi 
If groupeby+reduceGroup is used, does each groupeby+reduceGroup take place on a single partition? If yes, if we have more groups than the partitions, what happens?

Cheers,
Mary
Reply | Threaded
Open this post in threaded view
|

Re: reduceGroup

Till Rohrmann
Hi Mary,

the groupBy + reduceGroup works across all partitions of a DataSet. This means that elements from each partition are grouped (creating potentially a new partitioning) and then for each group the reduceGroup function is executed.

Cheers,
Till

On Thu, Apr 20, 2017 at 5:14 PM, Mary m <[hidden email]> wrote:
Hi 
If groupeby+reduceGroup is used, does each groupeby+reduceGroup take place on a single partition? If yes, if we have more groups than the partitions, what happens?

Cheers,
Mary