Hi, I’m wondering if there is a way to use FlinkML and make predictions continuously for test data coming from a DataStream.
I know FlinkML only supports the DataSet API (batch) at the moment, but is there a way to convert a DataStream into DataSets? I’m thinking of something like
(0. fit model in batch mode)
1. window the DataStream
2. convert the windowed stream to DataSets
3. use the FlinkML methods to make predictions

BR, Hanna
|
Hello Mäki,

I think what you would like to do is train a model in batch mode and use the Flink streaming API to serve that model and make predictions. While we don't currently have an integrated way to do that in FlinkML, I definitely think it's possible. I know Marton Balassi has been working on something like this for the ALS algorithm, but I can't find the code right now on mobile.

The general idea is to keep your model as state and use it to make predictions on a stream of incoming data.

Model serving is definitely something we'll be working on in the future; I'll have a master's student working on exactly that next semester.

-- Sent from a mobile device. May contain autocorrect errors.

On Dec 21, 2016 5:24 PM, "Mäki Hanna" <[hidden email]> wrote:
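To make the "keep your model as state" idea from the reply above more concrete, here is a minimal, library-free Java sketch. It is not the ALS code mentioned in the thread and it does not use any Flink API. The names `LinearModel`, `ModelServer`, `onModel` and `onData` are hypothetical, and the linear model is just a stand-in for whatever was fitted in batch mode. In a real Flink job, one plausible mapping (an assumption, not something confirmed in this thread) would be to connect a model stream with the data stream and put the two handlers into the `flatMap1`/`flatMap2` methods of a `CoFlatMapFunction`.

```java
import java.util.ArrayList;
import java.util.List;

public class ModelServingSketch {

    /** Hypothetical stand-in for a batch-trained model: dot(weights, features) + bias. */
    public static final class LinearModel {
        private final double[] weights;
        private final double bias;

        public LinearModel(double[] weights, double bias) {
            this.weights = weights.clone();
            this.bias = bias;
        }

        public double predict(double[] features) {
            if (features.length != weights.length) {
                throw new IllegalArgumentException("feature/weight length mismatch");
            }
            double sum = bias;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        }
    }

    /** Hypothetical operator that keeps the latest model as its state. */
    public static final class ModelServer {
        private LinearModel model; // state: null until the first model arrives

        /** Called for each element of the (assumed) model stream: replace the state. */
        public void onModel(LinearModel newModel) {
            this.model = newModel;
        }

        /** Called for each element of the data stream: emit a prediction if a model exists. */
        public void onData(double[] features, List<Double> out) {
            if (model != null) {
                out.add(model.predict(features));
            }
            // In this sketch, points arriving before any model are simply dropped.
        }
    }

    public static void main(String[] args) {
        ModelServer server = new ModelServer();
        List<Double> predictions = new ArrayList<>();

        server.onData(new double[] {1.0, 1.0}, predictions);             // no model yet: dropped
        server.onModel(new LinearModel(new double[] {2.0, -1.0}, 0.5));  // model fitted in batch
        server.onData(new double[] {1.0, 3.0}, predictions);             // 0.5 + 2.0 - 3.0
        server.onModel(new LinearModel(new double[] {1.0, 1.0}, 0.0));   // retrained model replaces state
        server.onData(new double[] {1.0, 3.0}, predictions);             // 0.0 + 1.0 + 3.0

        System.out.println(predictions);
    }
}
```

Dropping points that arrive before the first model is only one possible choice; buffering them until a model is available would be another, at the cost of extra state.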
|
I'm interested in that code you mentioned too, I hope you can find it. Regards, Matt
|
Thanks for mentioning it, Theo. Look at these examples:

https://github.com/streamline-eu/ML-Pipelines/commit/314e3d940f1f1ac7b762ba96067e13d806476f57

On Wed, Dec 21, 2016 at 9:38 PM, <[hidden email]> wrote:
|