Your understanding of a group by is correct. It is equivalent to a key
by. I agree it would be a great feature to keep the Source's
partitioning but unfortunately as of now it is not yet supported.
Best,
Dawid
On 18/03/2021 18:28, Aeden Jameson wrote:
> It's my understanding that a group by is also a key by under the hood.
> As a result that will cause a shuffle operation to happen. Our source
> is a Kafka topic that is keyed so that any give partition contains all
> the data that is needed for any given consuming TM. Is there a way
> using FlinkSQL to eliminate the shuffle operation? Or I'm missing
> details other details that would make such a change undesirable?
>
> Thank you,
> Aeden