I was looking forward to using Flink ML for my project, where I think I can use SVM. I have been able to run a batch job using Flink ML to train and test on my data. Now I want to do the following: 1. Apply the trained model to a stream of events from Kafka (using the DataStream API). For this, I want to know whether Flink ML can be used with DataStreams. 2. Persist the model: I may want to save the trained model for future use. Can these two use cases be achieved using Apache Flink? Regards, Abhishek Kumar Singh, Search Engineer Mob: +91 7709735480 ... |
If you can save the model as a PMML file, you can apply it to a stream using one of the Java PMML libraries.
|
Hi Abhishek, Based on your description, I think this FLIP proposal [1] seems to fit your use case well. You can also check out the GitHub repo by Boris (CCed) for the PMML implementation [2]. The proposal is still under development [3]; you are more than welcome to test it out and share your feedback. Thanks, Rong On Tue, May 14, 2019 at 4:44 PM Sameer Wadkar <[hidden email]> wrote:
|
Thanks a lot, Rong and Sameer. Looks like this is what I wanted. I will try the above projects. Regards, Abhishek Kumar Singh On Wed, May 15, 2019 at 8:00 AM Rong Rong <[hidden email]> wrote:
|
Thanks again for the above resources. I went through the project and ran the example on my system to get a grasp of the architecture. However, this project does not use Flink ML at all. After researching Flink ML further, I also found that it does not let us persist the model, which is why I cannot reuse a model trained with Flink ML. It looks like Flink ML cannot really be used for real-life use cases, as it neither lets us persist the trained model nor helps us apply the trained model to a DataStream. Please correct me if I am wrong. Regards, Abhishek Kumar Singh On Wed, May 15, 2019 at 11:25 AM Abhishek Singh <[hidden email]> wrote:
|
Hi Abhishek, Your observation is correct. Right now, the Flink ML module is in a half-baked state and is only supported in batch mode. It is not integrated with the DataStream API. FLIP-23 proposes a feature that allows evaluating an externally trained model (stored as PMML) on a stream of data. There is another effort to implement a new machine learning API / environment based on the Table API, which will support both batch and streaming sources. However, this effort has just started and the feature is not available yet. Best, Fabian On Sun, May 19, 2019 at 11:54 AM Abhishek Singh <[hidden email]> wrote:
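For readers with the same two use cases, here is a minimal sketch of the general idea behind the replies above: train elsewhere, persist the model in some external form, then load it and score events one at a time. It is not Flink ML or FLIP-23 code and it does not use PMML. It assumes the trained model is a linear SVM whose weight vector (and, optionally, a bias term) can be extracted after batch training; the class `LinearSvmModel`, its one-line text file format, and all method names are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, NOT part of Flink or Flink ML.
// Assumption: the trained model is linear, so it is fully described by weights and a bias.
final class LinearSvmModel {
    final double[] weights;
    final double bias; // use 0.0 if the trained model has no bias term

    LinearSvmModel(double[] weights, double bias) {
        this.weights = weights.clone();
        this.bias = bias;
    }

    // Persist the model as a single line: bias first, then the weights, comma separated.
    void save(Path path) throws IOException {
        String line = bias + "," + Arrays.stream(weights)
                .mapToObj(Double::toString)
                .collect(Collectors.joining(","));
        Files.write(path, Arrays.asList(line));
    }

    // Reload a model previously written by save().
    static LinearSvmModel load(Path path) throws IOException {
        String[] parts = Files.readAllLines(path).get(0).split(",");
        double bias = Double.parseDouble(parts[0]);
        double[] weights = new double[parts.length - 1];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = Double.parseDouble(parts[i + 1]);
        }
        return new LinearSvmModel(weights, bias);
    }

    // Decision value w . x + b for one event's feature vector.
    double decisionValue(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException(
                    "expected " + weights.length + " features, got " + features.length);
        }
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    // Class label: +1 if the decision value is non-negative, otherwise -1.
    int predict(double[] features) {
        return decisionValue(features) >= 0 ? 1 : -1;
    }
}
```

In a DataStream job, one plausible arrangement is to call `LinearSvmModel.load(...)` once per task, for example in the `open()` method of a `RichMapFunction`, and then call `predict(...)` for each incoming event. A PMML file evaluated with a Java PMML library, as suggested earlier in the thread, plays the same role for models that are not simple linear ones.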
|
Thanks for the confirmation, Fabian. Regards, Abhishek Kumar Singh On Sat, May 25, 2019 at 8:55 PM Fabian Hueske <[hidden email]> wrote:
|