"keyed" aggregation

Christophe Jolif
Hi all,

I'm sourcing from a Kafka topic, using the key of the Kafka message to key the stream, then doing some aggregation on the keyed stream.

Now I want to sink back to a different Kafka topic, re-using the same key. The problem is that my aggregation "lost" the key. Obviously I can make sure my aggregation function keeps the key, but that feels a bit strange, since carrying the key has nothing to do with the aggregation itself.

Is there a best practice for this? How should the key be carried from a Kafka source to a Kafka sink when there is an aggregation along the way?

Thanks,
--
Christophe

Re: "keyed" aggregation

Till Rohrmann
Hi Christophe,

If you don't have a way to recompute the key from the aggregation result, then you have to write an aggregation function that explicitly keeps it (e.g. one that produces a tuple whose first entry is the key and whose second entry is the aggregate value).
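
To make the tuple idea concrete, here is a minimal sketch in plain Java. It does not depend on Flink: the `Aggregator` interface is a local stand-in that only mirrors the method shape of Flink's `AggregateFunction<IN, ACC, OUT>`, and `Event`, `KeyedSum`, `KeyPreservingSum` and `aggregateByKey` are hypothetical names made up for illustration. It also assumes the aggregation is a simple sum, which the original question does not specify.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyedSumSketch {

    /** Local stand-in mirroring the method shape of Flink's AggregateFunction (assumption). */
    interface Aggregator<IN, ACC, OUT> {
        ACC createAccumulator();
        ACC add(IN value, ACC accumulator);
        OUT getResult(ACC accumulator);
        ACC merge(ACC a, ACC b);
    }

    /** Hypothetical input record: the Kafka message key plus a numeric payload. */
    static final class Event {
        final String key;
        final long amount;
        Event(String key, long amount) { this.key = key; this.amount = amount; }
    }

    /** The "tuple": first entry is the key, second entry is the aggregate value. */
    static final class KeyedSum {
        final String key;
        final long sum;
        KeyedSum(String key, long sum) { this.key = key; this.sum = sum; }
    }

    /** Sums the amounts and copies the key into the accumulator on every add. */
    static final class KeyPreservingSum implements Aggregator<Event, KeyedSum, KeyedSum> {
        public KeyedSum createAccumulator() { return new KeyedSum(null, 0L); }

        public KeyedSum add(Event value, KeyedSum acc) {
            return new KeyedSum(value.key, acc.sum + value.amount);
        }

        public KeyedSum getResult(KeyedSum acc) { return acc; }

        public KeyedSum merge(KeyedSum a, KeyedSum b) {
            // An empty accumulator has no key yet, so take whichever side has one.
            String key = a.key != null ? a.key : b.key;
            return new KeyedSum(key, a.sum + b.sum);
        }
    }

    /** Simulates keying the stream and aggregating: one accumulator per key. */
    static List<KeyedSum> aggregateByKey(List<Event> events) {
        KeyPreservingSum fn = new KeyPreservingSum();
        Map<String, KeyedSum> accumulators = new LinkedHashMap<>();
        for (Event e : events) {
            KeyedSum acc = accumulators.getOrDefault(e.key, fn.createAccumulator());
            accumulators.put(e.key, fn.add(e, acc));
        }
        List<KeyedSum> results = new ArrayList<>();
        for (KeyedSum acc : accumulators.values()) {
            results.add(fn.getResult(acc));
        }
        return results;
    }

    public static void main(String[] args) {
        List<KeyedSum> results = aggregateByKey(List.of(
                new Event("a", 1), new Event("b", 10), new Event("a", 2)));
        for (KeyedSum r : results) {
            // r.key is what the sink would use as the outgoing Kafka message key.
            System.out.println(r.key + " -> " + r.sum);
        }
    }
}
```

Because every result still carries its key, whatever writes to the output topic can read the key straight from the result instead of having to recompute it.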

Cheers,
Till

In reply to Christophe Jolif's message of Fri, Jan 5, 2018 at 5:51 PM.