I think so. You can use sliding windows and run a WordCount-like job that counts the occurrences of each key in each window. A filter afterwards can then drop those elements where the count is lower than a given threshold.
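To make the idea concrete, here is a minimal sketch of "count per key per sliding window, then filter" in plain Java. It only simulates the logic and does not use the Flink API. The class and method names (SlidingCountSketch, Event, countPerWindow, filterByCount) are hypothetical, and the 1-minute slide and the windows aligned to timestamp 0 are assumptions, since the thread only specifies the 5-minute size and the threshold of 100.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongPredicate;

// Plain-Java simulation of the approach; NOT Flink code. All names are hypothetical.
class SlidingCountSketch {

    // A minimal event: the key ('id' in the Pojo) and an event timestamp in milliseconds.
    static final class Event {
        final String id;
        final long timestampMs;

        Event(String id, long timestampMs) {
            this.id = id;
            this.timestampMs = timestampMs;
        }
    }

    // Counts events per id and per sliding window. The result maps
    // id -> (window start in ms -> number of events in that window).
    // A window covers [start, start + sizeMs); starts are multiples of slideMs.
    static Map<String, Map<Long, Long>> countPerWindow(List<Event> events, long sizeMs, long slideMs) {
        Map<String, Map<Long, Long>> counts = new TreeMap<>();
        for (Event e : events) {
            // Start of the latest window that contains this timestamp.
            long lastStart = e.timestampMs - Math.floorMod(e.timestampMs, slideMs);
            // Walk back over every earlier window that still contains the timestamp.
            for (long start = lastStart; start > e.timestampMs - sizeMs; start -= slideMs) {
                counts.computeIfAbsent(e.id, k -> new TreeMap<>()).merge(start, 1L, Long::sum);
            }
        }
        return counts;
    }

    // Keeps only the (id, window) entries whose count satisfies the predicate.
    static Map<String, Map<Long, Long>> filterByCount(Map<String, Map<Long, Long>> counts, LongPredicate keep) {
        Map<String, Map<Long, Long>> result = new TreeMap<>();
        for (Map.Entry<String, Map<Long, Long>> perId : counts.entrySet()) {
            for (Map.Entry<Long, Long> window : perId.getValue().entrySet()) {
                if (keep.test(window.getValue())) {
                    result.computeIfAbsent(perId.getKey(), k -> new TreeMap<>())
                          .put(window.getKey(), window.getValue());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        // 150 events for one id, one per second, plus a single event for another id.
        for (int i = 0; i < 150; i++) {
            events.add(new Event("noisy", i * 1_000L));
        }
        events.add(new Event("quiet", 5_000L));

        long sizeMs = 5 * 60_000L;  // 5-minute window, as in the question
        long slideMs = 60_000L;     // 1-minute slide (assumed)
        Map<String, Map<Long, Long>> counts = countPerWindow(events, sizeMs, slideMs);

        // Keep windows with at most 100 events; flip the comparison for the opposite filter.
        System.out.println(filterByCount(counts, c -> c <= 100));
    }
}
```

Note that each event falls into several overlapping windows (size divided by slide), so the threshold is evaluated per key and per window, not per individual event. Which side of the threshold to keep is just the predicate passed to the filter step.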
Cheers,
Aljoscha
On Mon, 12 Dec 2016 at 22:30 Meghashyam Sandeep V <[hidden email]> wrote:
Hi There,
I have a streaming job with Kafka as its source and Cassandra as its sink. I have a use case where I don't want to write some events to Cassandra when there are more than 100 events for a given 'id' (a field in my Pojo) within 5 minutes. Is this a good use case for SlidingWindows? Can I get the sliding count for each key and then decide whether to send it to the sink or not?
Thanks,