Re: Data point goes missing within iteration

Posted by Biplob Biswas on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Data-point-goes-missing-within-iteration-tp7776p7951.html

Hi,

Sorry for the late reply; I was trying different things in my code. What I observed seems very weird to me.

After some experimentation, I found that when I increase the number of centroids, the number of data points forwarded decreases, and when I lower the number of centroids, the number of data points forwarded increases.
In my code, the number of centroids is given by num_of_mc.

I have reduced my code to the essential part where the problem occurs. Please find the code at the link below.


To be precise, in the map1 function of the coflatmap I am supposed to receive all the points, but some of the points that should arrive there are missing.

I know that I won't get any output for the first n points, which, as you can see from my code, just fill up the array of centroids, so I am not collecting them.
But apart from those, I should get every other point, since I collect the point in all the other cases (I have removed 2 of the cases).
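
To make the expected behaviour concrete, here is a minimal plain-Java sketch of what I described above. It is not my actual Flink code: the class CentroidFillSketch, the inner Map1Sketch, and the helper countForwarded are made-up names for illustration, and I am assuming that n equals num_of_mc and that there is a single instance of the function. Under those assumptions, the number of forwarded points should be the total number of points minus num_of_mc.

```java
import java.util.ArrayList;
import java.util.List;

public class CentroidFillSketch {

    // Hypothetical stand-in for the map1 side of the coflatmap (no Flink dependency).
    static class Map1Sketch {
        private final double[][] centroids; // one slot per centroid (num_of_mc slots)
        private int filled = 0;             // how many centroid slots are taken so far

        Map1Sketch(int numOfMc) {
            this.centroids = new double[numOfMc][];
        }

        // The first num_of_mc points only fill the centroid array and are not collected.
        // Every later point is collected (added to 'out').
        void map1(double[] point, List<double[]> out) {
            if (filled < centroids.length) {
                centroids[filled++] = point;
                return;
            }
            out.add(point);
        }
    }

    // Feeds totalPoints dummy points through one Map1Sketch instance
    // and returns how many of them were collected.
    static int countForwarded(int totalPoints, int numOfMc) {
        Map1Sketch fn = new Map1Sketch(numOfMc);
        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < totalPoints; i++) {
            fn.map1(new double[] {i, i}, out);
        }
        return out.size();
    }

    public static void main(String[] args) {
        System.out.println(countForwarded(100, 5));  // prints 95
        System.out.println(countForwarded(100, 20)); // prints 80
    }
}
```

So with one instance, a higher num_of_mc only removes exactly that many points from the output; any loss beyond that is what I cannot explain.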

Even when I print the count of points right as each point enters the map1 function, the count drops if I set the number of centroids higher.


I don't have any information about the lost points, and I still don't know why this is happening.


Here's the link to the part of the code where the problem arises:
http://pastebin.com/PVnDJeAa

I hope this cleaned-up code, without the extra stuff, makes the problem clear now.