Posted by
Biplob Biswas on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Data-point-goes-missing-within-iteration-tp7776p7951.html
Hi,
Sorry for the late reply; I was trying different things in my code. What I observed seems very weird to me.
After some experimentation, I found that when I increase the number of centroids, the number of data points forwarded decreases, and when I lower the number of centroids, the number of data points forwarded increases.
In my code, the number of centroids is given by num_of_mc.
I have reduced my code to the essential part where the problem is happening. Please find the code in the link below.
To be precise, in the map1 function of the coflatmap I am supposed to receive all the points, but not all of the points that should be coming in actually arrive.
I know that I won't get any output for the first n points, which, as you can see from my code, just fill up the array of centroids, so I am not collecting them.
Apart from that, I should get every other point, since I collect the point in all the other cases (I have removed two of the cases).
Even when I print the count of points right as each point enters the map1 function, the count goes down if I choose a higher number of centroids.
I don't have any information about the lost points, and I still don't know why this is happening.
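To make the count I expect concrete, here is a minimal plain-Java sketch of the fill-then-forward logic described above. It is not the actual job and has no Flink dependency: the class `CentroidFillSketch`, the nested `Instance`, and the `instances` parameter are hypothetical names used only for illustration. The `instances` parameter models the assumption (not verified for this job) that the centroid array is a field of the operator, so each parallel instance would fill its own copy before forwarding anything.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the map1 logic; all names here are hypothetical.
class CentroidFillSketch {

    // Mimics one operator instance: the first numOfMc points only fill the
    // centroid array and are not emitted; every later point is forwarded.
    static final class Instance {
        private final double[] centroids;
        private int filled = 0;
        private final List<Double> forwarded = new ArrayList<>();

        Instance(int numOfMc) {
            this.centroids = new double[numOfMc];
        }

        void map1(double point) {
            if (filled < centroids.length) {
                centroids[filled++] = point; // used as an initial centroid, not collected
            } else {
                forwarded.add(point);        // collected in all other cases
            }
        }

        int forwardedCount() {
            return forwarded.size();
        }
    }

    // Distributes totalPoints round-robin over the given number of instances
    // (an assumed distribution) and returns how many points were forwarded in total.
    static int forwardedCount(int totalPoints, int numOfMc, int instances) {
        Instance[] ops = new Instance[instances];
        for (int i = 0; i < instances; i++) {
            ops[i] = new Instance(numOfMc);
        }
        for (int i = 0; i < totalPoints; i++) {
            ops[i % instances].map1(i);
        }
        int total = 0;
        for (Instance op : ops) {
            total += op.forwardedCount();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(forwardedCount(1000, 10, 1)); // prints 990
        System.out.println(forwardedCount(1000, 10, 4)); // prints 960
        System.out.println(forwardedCount(1000, 50, 4)); // prints 800
    }
}
```

With a single instance, only the first num_of_mc points are missing, which is the count I expect. If the array were filled once per parallel instance, the number of missing points would grow with num_of_mc times the parallelism, which would at least match the direction of what I am seeing; whether that is what happens in my job is an open question.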
Here's the link to the part of the code where the problem arises:
http://pastebin.com/PVnDJeAa
I hope this cleaned-up code, without the extra stuff, makes the problem clear now.