(DEPRECATED) Apache Flink User Mailing List archive.

flink ml - k-means

Pa Rö

flink ml - k-means

Hi Flink community,

I am currently writing my master's thesis in the field of machine learning. My main task is to evaluate different k-means variants for large data sets (big data). I would like to test Flink ML against Apache Mahout and Apache Hadoop MapReduce in terms of scalability and performance (time and space). What is the current state of clustering in Flink ML, especially K-Means? Will there be a release that includes it in the near future?

Best regards,
Paul

Alexander Alexandrov

Re: flink ml - k-means

Yes, I expect to have one in the next few weeks (the code is actually there, but we need to port it to the Flink ML API). I suggest following the JIRA issue over the next few weeks to check when this is done:

https://issues.apache.org/jira/browse/FLINK-1731

Regards,

Alexander

PS. Bear in mind that we will start with a vanilla implementation of K-Means. For a thorough evaluation you might want to also check variants like K-Means++.
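
For readers comparing variants: K-Means++ differs from vanilla K-Means only in how the initial centers are chosen. The following is a minimal plain-Java sketch of that seeding step, not code from Flink or from FLINK-1731. The class name `KMeansPlusPlusSeeding` and its methods are hypothetical names chosen for illustration, and points are assumed to be `double[]` arrays of equal length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Flink: k-means++ seeding for points of any dimension. */
final class KMeansPlusPlusSeeding {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Picks k centers: the first uniformly at random, the rest with probability proportional to D(x)^2. */
    static List<double[]> chooseCenters(List<double[]> points, int k, Random rnd) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> centers = new ArrayList<>();
        centers.add(points.get(rnd.nextInt(points.size())));

        // minDist[i] = squared distance from point i to its nearest center chosen so far
        double[] minDist = new double[points.size()];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);

        while (centers.size() < k) {
            // Only the newest center can shrink a point's nearest-center distance.
            double[] newest = centers.get(centers.size() - 1);
            double total = 0.0;
            for (int i = 0; i < points.size(); i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points.get(i), newest));
                total += minDist[i];
            }
            // Sample an index with probability minDist[i] / total.
            double threshold = rnd.nextDouble() * total;
            int chosen = points.size() - 1; // fallback if rounding or total == 0 prevents a hit
            double cumulative = 0.0;
            for (int i = 0; i < points.size(); i++) {
                cumulative += minDist[i];
                if (cumulative > threshold) {
                    chosen = i;
                    break;
                }
            }
            centers.add(points.get(chosen));
        }
        return centers;
    }
}
```

The returned centers could then be used as the initial centroids of any standard K-Means iteration.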

2015-04-24 15:08 GMT+02:00 Pa Rö <[hidden email]>:

Hi Flink community,

I am currently writing my master's thesis in the field of machine learning. My main task is to evaluate different k-means variants for large data sets (big data). I would like to test Flink ML against Apache Mahout and Apache Hadoop MapReduce in terms of scalability and performance (time and space). What is the current state of clustering in Flink ML, especially K-Means? Will there be a release that includes it in the near future?

Best regards,
Paul

Till Rohrmann

Re: flink ml - k-means

Hi Paul,

If you can't wait, a vanilla implementation is already included in the Flink examples. You should find it under flink/flink-examples.

But we will try to add more clustering algorithms in the near future.

Cheers,
Till

On Apr 26, 2015 11:14 PM, "Alexander Alexandrov" <[hidden email]> wrote:

Yes, I expect to have one in the next few weeks (the code is actually there, but we need to port it to the Flink ML API). I suggest following the JIRA issue over the next few weeks to check when this is done:

https://issues.apache.org/jira/browse/FLINK-1731

Regards,
Alexander

PS. Bear in mind that we will start with a vanilla implementation of K-Means. For a thorough evaluation you might want to also check variants like K-Means++.

Pa Rö

Re: flink ml - k-means

Hi Alexander and Till,

Thanks for the information; I look forward to the release.
I'm curious how well Flink ML performs against Mahout and Spark ML.

Best regards,

Paul

2015-04-27 9:23 GMT+02:00 Till Rohrmann <[hidden email]>:

Hi Paul,

If you can't wait, a vanilla implementation is already included in the Flink examples. You should find it under flink/flink-examples.

But we will try to add more clustering algorithms in the near future.

Cheers,
Till

Pa Rö

Re: flink ml - k-means

Hi,

I now want to implement k-means with Flink.

Do you know of a release date for Flink ML k-means?

Best regards,

Paul

2015-04-27 9:36 GMT+02:00 Pa Rö <[hidden email]>:

Hi Alexander and Till,

Thanks for the information; I look forward to the release.
I'm curious how well Flink ML performs against Mahout and Spark ML.

Best regards,
Paul

Stephan Ewen

Re: flink ml - k-means

Paul!

Can you use the KMeans example? The code is for three-dimensional points, but you should be able to generalize it easily.

That would be the fastest way to go, without waiting for any release dates.

Stephan
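
The generalization Stephan mentions amounts to replacing fixed coordinate fields with an array and looping over its length. A minimal plain-Java sketch of one assign-and-update step under that assumption follows. It does not use any Flink API and is not the code of the Flink example; the class name `NDimKMeansStep` and its methods are hypothetical.

```java
/** Hypothetical sketch, not the Flink example: one k-means step for points of any dimension. */
final class NDimKMeansStep {

    /** Index of the centroid closest to the point (squared Euclidean distance). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Assigns every point to its nearest centroid and returns the recomputed centroids. */
    static double[][] step(double[][] points, double[][] centroids) {
        int k = centroids.length;
        int dim = centroids[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];

        // Assignment: accumulate coordinate sums and counts per cluster.
        for (double[] p : points) {
            int c = nearest(p, centroids);
            counts[c]++;
            for (int d = 0; d < dim; d++) {
                sums[c][d] += p[d];
            }
        }

        // Update: each new centroid is the mean of the points assigned to it.
        double[][] updated = new double[k][];
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                updated[c] = centroids[c].clone(); // an empty cluster keeps its old centroid
            } else {
                updated[c] = new double[dim];
                for (int d = 0; d < dim; d++) {
                    updated[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return updated;
    }
}
```

Calling `step` repeatedly, until the centroids stop changing or a fixed number of iterations is reached, gives the vanilla algorithm for any dimension.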

On Mon, May 11, 2015 at 2:46 PM, Pa Rö <[hidden email]> wrote:

Hi,

I now want to implement k-means with Flink.
Do you know of a release date for Flink ML k-means?

Best regards,
Paul

Pa Rö

Re: flink ml - k-means

Okay :)

I am now using the example code from here:
https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java

2015-05-11 21:56 GMT+02:00 Stephan Ewen <[hidden email]>:

Paul!

Can you use the KMeans example? The code is for three-dimensional points, but you should be able to generalize it easily.
That would be the fastest way to go, without waiting for any release dates.

Stephan
