counting elements in datastream

counting elements in datastream

subashbasnet
Hello all, 

Does anyone have an idea of how to count the elements in the current instance of a DataStream? Is it possible?

DataStream<Tuple2<Point, Grid>> pointsWithGridCoordinates;



Regards,
Subash Basnet
Re: counting elements in datastream

Sameer Wadkar
Use count windows: emit a result every 1000 elements (say) and sum those results. Alternatively, count without windows by keeping a running total; the disadvantage is that this emits a new, updated result for every incoming element (not a good thing if your volume is high).
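To make the trade-off concrete, here is a minimal plain-Java sketch that models the two approaches outside of Flink. It is not Flink API code; the class and method names (CountSketch, countWindowPartials, runningCounts) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Flink-free model of the two counting styles.
final class CountSketch {

    // Count-window style: emit one partial count each time
    // `windowSize` elements have arrived.
    static List<Long> countWindowPartials(int totalElements, int windowSize) {
        List<Long> emitted = new ArrayList<>();
        long inWindow = 0;
        for (int i = 0; i < totalElements; i++) {
            inWindow++;
            if (inWindow == windowSize) {
                emitted.add(inWindow);
                inWindow = 0;
            }
        }
        // Elements sitting in a trailing, unfilled window are not emitted.
        return emitted;
    }

    // Window-free style: emit an updated running total for every element.
    static List<Long> runningCounts(int totalElements) {
        List<Long> emitted = new ArrayList<>();
        long total = 0;
        for (int i = 0; i < totalElements; i++) {
            total++;
            emitted.add(total);
        }
        return emitted;
    }

    // Client-side summation of the partial counts.
    static long sum(List<Long> partials) {
        long s = 0;
        for (long p : partials) {
            s += p;
        }
        return s;
    }
}
```

With 2500 elements and a window size of 1000, the count-window model emits only two results, and their sum covers just the 2000 elements in completed windows. The running-total model is always up to date but emits 2500 results.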


Or use tumbling time windows on processing time: https://github.com/sameeraxiomine/flinkinaction/blob/master/flinkinactionjava/src/main/java/com/manning/fia/c04/TimeWindowExample.java. The advantage over count windows is that you get a count every few seconds (as configured), which you can then add up on the client side.
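As a rough illustration of what tumbling windows do, the plain-Java sketch below buckets arrival timestamps into fixed, non-overlapping windows and counts each bucket. It is a model only, not the code from the linked example and not Flink API; TumblingCountSketch and its methods are hypothetical names, and non-negative millisecond timestamps are assumed.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Flink-free model of counting per tumbling window.
final class TumblingCountSketch {

    // Maps each window start time to the number of elements that arrived in it.
    // Assumes non-negative timestamps and windowMillis > 0.
    static Map<Long, Long> countPerWindow(long[] arrivalMillis, long windowMillis) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long t : arrivalMillis) {
            long windowStart = t - (t % windowMillis);
            counts.merge(windowStart, 1L, Long::sum);
        }
        return counts;
    }

    // Client-side step: add up the per-window counts.
    static long total(Map<Long, Long> counts) {
        long s = 0;
        for (long c : counts.values()) {
            s += c;
        }
        return s;
    }
}
```

Unlike the count-window case, a window here closes because time has passed, so a slow stream still produces a (possibly small) count for each interval in which elements arrived.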

Since you do not need a keyBy operation, you can do this directly on the DataStream instance. Without a keyBy, however, you get a separate count for each partition of the stream, and you will need to add those counts up.
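The per-partition effect can be sketched in plain Java as follows. This is a model, not Flink code: the names are hypothetical, and round-robin distribution of elements across partitions is an assumption made purely for illustration.

```java
import java.util.Arrays;

// Hypothetical, Flink-free model of per-partition counting.
final class PartitionCountSketch {

    // Each partition keeps its own local count.
    // Round-robin assignment is an illustrative assumption.
    static long[] countPerPartition(int totalElements, int parallelism) {
        long[] counts = new long[parallelism];
        for (int i = 0; i < totalElements; i++) {
            counts[i % parallelism]++;
        }
        return counts;
    }

    // The global count is the sum of the local counts.
    static long total(long[] counts) {
        return Arrays.stream(counts).sum();
    }
}
```

No single partition sees the full stream, so none of the local counts is the answer on its own; only their sum is.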





(In reply to subash basnet's message of Thu, Aug 18, 2016 at 5:54 AM.)