Flink ML Use cases


Abhishek Singh
I was looking forward to using Flink ML for my project where I think I can use SVM.

I have been able to run a batch job using Flink ML and have trained and tested a model on my data.

Now I want to do the following:
1. Apply the trained model to a stream of events from Kafka (using DataStreams). For this, I want to know whether Flink ML can be used with DataStreams.

2. Persist the model: I may want to save the trained model for future use.

Can these two use cases be achieved using Apache Flink?

Regards,
Abhishek Kumar Singh
Search Engineer
Mob :+91 7709735480 



Re: Flink ML Use cases

Sameer Wadkar
If you can save the model as a PMML file, you can apply it to a stream using one of the Java PMML libraries.
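
To illustrate the pattern suggested here (load an exported model once, then score each incoming event), the following is a minimal sketch in plain Python. It is not Flink code and does not use a real PMML library. The XML below is a hypothetical, stripped-down PMML-style fragment for a linear model, and the function names `load_linear_model` and `score_stream`, the feature names, and the coefficient values are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical, stripped-down PMML-style fragment for a linear model.
# A real PMML file also declares an XML namespace and carries a Header,
# DataDictionary, MiningSchema, and so on; those are omitted here.
PMML_SNIPPET = """
<PMML>
  <RegressionModel functionName="regression">
    <RegressionTable intercept="-1.0">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def load_linear_model(pmml_text: str) -> Tuple[float, Dict[str, float]]:
    """Read the intercept and per-feature coefficients from the fragment."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//RegressionTable")
    intercept = float(table.get("intercept", "0"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def score_stream(
    events: Iterable[Dict[str, float]],
    intercept: float,
    coefficients: Dict[str, float],
) -> Iterator[Tuple[Dict[str, float], int]]:
    """Yield (event, label) pairs; the label is the sign of the linear margin."""
    for event in events:
        margin = intercept + sum(
            coeff * event.get(name, 0.0) for name, coeff in coefficients.items()
        )
        yield event, (1 if margin >= 0 else -1)


# The model is loaded once, outside the per-event loop.
intercept, coefficients = load_linear_model(PMML_SNIPPET)

# A plain list stands in for the stream of events read from Kafka.
events = [{"x1": 1.0, "x2": 0.0}, {"x1": 0.0, "x2": 4.0}]
labels = [label for _, label in score_stream(events, intercept, coefficients)]
```

In a Flink job, the loading step would typically run once per operator instance (for example in the `open()` method of a rich function) rather than once per event, and a full PMML evaluator library would replace the hand-written parsing shown above.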


Re: Flink ML Use cases

Rong Rong
Hi Abhishek,

Based on your description, I think this FLIP proposal [1] seems to be a perfect fit for your use case.
You can also check out the GitHub repo by Boris (CCed) for the PMML implementation [2]. The proposal is still under development [3]; you are more than welcome to test it out and share your feedback.

Thanks,
Rong



Re: Flink ML Use cases

Abhishek Singh

Thanks a lot Rong and Sameer.

Looks like this is what I wanted.

I will try the above projects.  

Regards,
Abhishek Kumar Singh
Search Engineer
Mob :+91 7709735480 



Re: Flink ML Use cases

Abhishek Singh

Thanks again for the above resources.

I went through the project and also ran the example on my system to get a grasp of the architecture.

However, this project does not use Flink ML at all.

Also, after researching Flink ML further, I found that it does not let us persist the model, which is why I am not able to reuse a model trained with Flink ML.

It looks like Flink ML cannot really be used for real-life use cases, as it neither lets us persist the trained model nor helps us apply the trained model to a DataStream.

Please correct me if I am wrong.




Regards,
Abhishek Kumar Singh
Search Engine Engineer
Mob :+91 7709735480 



Re: Flink ML Use cases

Fabian Hueske-2
Hi Abhishek,

Your observation is correct. Right now, the Flink ML module is in a half-baked state and is only supported in batch mode.
It is not integrated with the DataStream API. FLIP-23 proposes a feature that allows evaluating an externally trained model (stored as PMML) on a stream of data.

There is another effort to implement a new machine learning API / environment based on the Table API. This will be supported for batch and streaming sources.
However, this effort has just started and the feature is not available yet.

Best, Fabian
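
Since the thread concludes that the batch Flink ML module offers no built-in model persistence, one possible workaround for a linear model such as a linear SVM is to export the learned weight vector and reload it wherever predictions are needed. The following is a minimal sketch in plain Python, not Flink code. It assumes the weights have already been collected from the training job; the file name, the JSON layout, and the weight values are hypothetical placeholders.

```python
import json
import os
import tempfile
from typing import List


def save_weights(weights: List[float], path: str) -> None:
    """Persist the weight vector as a small JSON document."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"model": "linear-svm", "weights": weights}, f)


def load_weights(path: str) -> List[float]:
    """Reload the weight vector written by save_weights."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["weights"]


def predict(weights: List[float], features: List[float]) -> int:
    """Linear decision rule: the label is the sign of the dot product."""
    margin = sum(w * x for w, x in zip(weights, features))
    return 1 if margin >= 0 else -1


# Placeholder values, not taken from a real training run.
trained_weights = [0.5, -1.5, 2.0]

# Round-trip the model through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "svm_weights.json")
    save_weights(trained_weights, model_path)
    restored = load_weights(model_path)

prediction_pos = predict(restored, [1.0, 0.0, 1.0])
prediction_neg = predict(restored, [0.0, 1.0, 0.0])
```

This only covers models whose prediction step is a simple function of a few exported parameters; for anything more complex, an exchange format such as PMML, as discussed above, is the more general route.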


Re: Flink ML Use cases

Abhishek Singh

Thanks for the confirmation, Fabian.


Regards,
Abhishek Kumar Singh
Search Engine Engineer
Mob :+91 7709735480 

