Multi-dimensional [more than 2] input for KMeans Clustering in Apache Flink

subashbasnet
Hello all, 

Currently, I find that only two-dimensional input is possible for KMeans clustering in Flink.

Is there already an implementation, or what would be the approach to implementing
input with more than two dimensions for KMeans in Flink?
Or is there any other clustering method implemented in Flink that takes data with more than two dimensions as input?


Best Regards,
Subash Basnet

Re: Multi-dimensional [more than 2] input for KMeans Clustering in Apache Flink

Fabian Hueske-2
Hi Subash,

The KMeans implementation in Flink is meant to be a simple toy example and should not be used for serious analysis tasks.
It shows how the DataSet API works by implementing a well-known algorithm.

Nonetheless, the example can be easily extended to work for three or more dimensions.
You would need to adapt the methods to compute the distance and the location of the new center.
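
To illustrate the two adaptations mentioned above, here is a minimal sketch in plain Java with no Flink dependency. The class `PointND` and the helper `nearestCenter` are hypothetical names introduced only for this illustration and are not part of Flink. The sketch assumes the point stores its coordinates in an array, so that the distance computation and the computation of the new center (sum of the assigned points divided by their count) work for any number of dimensions.

```java
import java.util.Arrays;

/** Hypothetical N-dimensional point for illustration; not a Flink class. */
class PointND {
    final double[] coords;

    PointND(double... coords) {
        this.coords = coords.clone(); // defensive copy of the coordinates
    }

    /** Component-wise sum; used to accumulate the points assigned to one center. */
    PointND add(PointND other) {
        requireSameDimension(other);
        double[] sum = new double[coords.length];
        for (int i = 0; i < coords.length; i++) {
            sum[i] = coords[i] + other.coords[i];
        }
        return new PointND(sum);
    }

    /** Divides every coordinate by count; turns an accumulated sum into the new center. */
    PointND div(long count) {
        double[] mean = new double[coords.length];
        for (int i = 0; i < coords.length; i++) {
            mean[i] = coords[i] / count;
        }
        return new PointND(mean);
    }

    /** Euclidean distance over all dimensions. */
    double euclideanDistance(PointND other) {
        requireSameDimension(other);
        double sumOfSquares = 0.0;
        for (int i = 0; i < coords.length; i++) {
            double d = coords[i] - other.coords[i];
            sumOfSquares += d * d;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Index of the center closest to p, or -1 if centers is empty. */
    static int nearestCenter(PointND p, PointND[] centers) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double d = p.euclideanDistance(centers[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    private void requireSameDimension(PointND other) {
        if (other.coords.length != coords.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
    }

    public static void main(String[] args) {
        PointND a = new PointND(0.0, 0.0, 0.0);
        PointND b = new PointND(1.0, 2.0, 2.0);
        System.out.println(a.euclideanDistance(b));                  // 3.0
        System.out.println(Arrays.toString(a.add(b).div(2).coords)); // [0.5, 1.0, 1.0]
    }
}
```

How such a type would be wired into the example's DataSet program (input parsing, serialization, the iteration itself) is not covered here and would need to be checked against the example's source.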

Best, Fabian

2016-03-01 17:07 GMT+01:00 subash basnet <[hidden email]>:
Hello all, 

Currently, I find that only two-dimensional input is possible for KMeans clustering in Flink.

Is there already an implementation, or what would be the approach to implementing
input with more than two dimensions for KMeans in Flink?
Or is there any other clustering method implemented in Flink that takes data with more than two dimensions as input?


Best Regards,
Subash Basnet

Re: Multi-dimensional [more than 2] input for KMeans Clustering in Apache Flink

subashbasnet
Hello Fabian, 

Thanks! Is KMeans the only clustering implementation that currently exists in Flink?


Best Regards,
Subash Basnet

On Tue, Mar 1, 2016 at 5:22 PM, Fabian Hueske <[hidden email]> wrote:
Hi Subash,

The KMeans implementation in Flink is meant to be a simple toy example and should not be used for serious analysis tasks.
It shows how the DataSet API works by implementing a well-known algorithm.

Nonetheless, the example can be easily extended to work for three or more dimensions.
You would need to adapt the methods to compute the distance and the location of the new center.

Best, Fabian