Kafka KeyedStream source
Posted by
Niels Basjes on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Kafka-KeyedStream-source-tp10883.html
Hi,
In my scenario I have click stream data that I persist in Kafka.
I use the sessionId as the key, so Kafka puts every record with the same sessionId into the same Kafka partition. That way, all events of a visitor are already in a single Kafka partition, in a fixed order.
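A minimal sketch of this key-based partitioning, in plain Java with no Kafka dependency. The hash is a stand-in and not Kafka's actual partitioner, and the class name, method names, and sample events are hypothetical. It only illustrates the property relied on here: records with the same sessionId land in the same partition, in the order they were sent.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KeyedPartitioningSketch {

    // Stand-in for a key-based partitioner (assumption: NOT Kafka's real hash).
    // Any deterministic hash of the key bytes gives the same guarantee.
    static int partitionFor(String sessionId, int numPartitions) {
        int hash = Arrays.hashCode(sessionId.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions; // mask keeps the result non-negative
    }

    // Appends each {sessionId, payload} event to the log of the partition chosen
    // by its sessionId, preserving arrival order within every partition.
    static Map<Integer, List<String>> route(String[][] events, int numPartitions) {
        Map<Integer, List<String>> partitions = new HashMap<>();
        for (String[] event : events) {
            String sessionId = event[0];
            String payload = event[1];
            int partition = partitionFor(sessionId, numPartitions);
            partitions.computeIfAbsent(partition, p -> new ArrayList<>())
                      .add(sessionId + ":" + payload);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] events = {
            {"s1", "home"}, {"s2", "home"}, {"s1", "search"}, {"s1", "checkout"}
        };
        Map<Integer, List<String>> partitions = route(events, 4);
        partitions.forEach((p, log) -> System.out.println("partition " + p + " -> " + log));
    }
}
```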
When I read this data into Flink, I get a generic data stream, on top of which I have to do a keyBy before my processing can continue. Such a keyBy redistributes the data again to the downstream tasks that do the actual work.
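The redistribution can be illustrated with a small plain-Java simulation (no Flink or Kafka dependency). Both hash functions are hypothetical stand-ins, not the real Kafka partitioner or Flink's key assignment, and all names are made up for the example. The point is only that a keyBy picks the target task with its own function, independent of how the records were partitioned upstream, so a session read by source task i generally has to be sent to some other task j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeyByShuffleSketch {

    // Hypothetical stand-in for the upstream (Kafka-side) assignment:
    // which source task reads the records of this session.
    static int sourceIndex(String sessionId, int parallelism) {
        return Math.floorMod(sessionId.hashCode(), parallelism);
    }

    // Hypothetical stand-in for the assignment a keyBy applies:
    // a different, independent mixing of the same key.
    static int keyedIndex(String sessionId, int parallelism) {
        int h = sessionId.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return Math.floorMod(h, parallelism);
    }

    // Returns the sessions whose records would have to cross from one
    // task to another because the two assignments disagree.
    static List<String> sessionsThatMove(List<String> sessionIds, int parallelism) {
        List<String> moved = new ArrayList<>();
        for (String id : sessionIds) {
            if (sourceIndex(id, parallelism) != keyedIndex(id, parallelism)) {
                moved.add(id);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> sessions = Arrays.asList("s1", "s2", "s3", "s4", "s5", "s6");
        System.out.println("sessions crossing tasks: " + sessionsThatMove(sessions, 4));
    }
}
```

Each session still ends up on exactly one downstream task, so the keyed processing is correct; the cost is the extra hop, even though the data was already grouped by sessionId in Kafka.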
Is it possible to create an adapted version of the Kafka source that immediately produces a keyed data stream?
--
Best regards / Met vriendelijke groeten,
Niels Basjes