Hello all, I want to perform linear regression using FlinkML's MultipleLinearRegression() function on streaming data. This function takes a DataSet as an input and I cannot create a DataSet inside the MapFunction of a DataStream. How can I use this function on my DataStream? |
Hey Piyush, Would you like to train or predict on the streaming data? Best, Marton On Wed, May 11, 2016 at 3:44 PM, Piyush Shrivastava <[hidden email]> wrote:
|
Hi Márton, I want to train and get the residuals. On Wednesday, 11 May 2016 7:19 PM, Márton Balassi <[hidden email]> wrote:
|
I am not currently aware of any support for streaming learners in FlinkML; you would need to implement that yourself at this point. As for streaming predictors for batch learners, I have some preview code that you might like. [1] On Wed, May 11, 2016 at 3:52 PM, Piyush Shrivastava <[hidden email]> wrote:
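To give a rough idea of what "implement that yourself" could look like, here is a minimal sketch of an online linear regression that updates its weights one record at a time with stochastic gradient descent and returns the residual for each record. It is plain Java with no Flink dependency; the class name `OnlineLinearRegression`, its methods, and the learning rate are all hypothetical and not part of FlinkML, and this is not how MultipleLinearRegression is implemented.

```java
// Hypothetical helper, NOT part of FlinkML: online least squares via SGD.
class OnlineLinearRegression {
    private final double[] weights;     // one weight per feature, starts at 0
    private double intercept;           // bias term, starts at 0
    private final double learningRate;  // assumed fixed step size

    OnlineLinearRegression(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.learningRate = learningRate;
    }

    /** Prediction with the current weights. */
    double predict(double[] x) {
        double y = intercept;
        for (int i = 0; i < weights.length; i++) {
            y += weights[i] * x[i];
        }
        return y;
    }

    /**
     * Computes the residual (label minus prediction) with the current model,
     * then takes one gradient step on the squared error for this record.
     * Returns the residual as it was before the step.
     */
    double update(double[] x, double label) {
        double residual = label - predict(x);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += learningRate * residual * x[i];
        }
        intercept += learningRate * residual;
        return residual;
    }

    double[] getWeights() { return weights.clone(); }

    double getIntercept() { return intercept; }
}
```

In a streaming job, something like `update` would be called once per incoming record from inside a map function on the DataStream, emitting the residual downstream. That part is left out here because it depends on how you handle parallelism and operator state, and a fixed learning rate like this usually needs tuning or feature scaling on real data.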
|
Actually, model portability and persistence are a serious limitation to the practical use of FlinkML in streaming. If you know what you're doing, you can write a blunt serializer for your model, write it to a file, and rebuild the model stream-side from the deserialized information. I tried it for an SVM model and there were no obstacles. It's ugly but it works. 2016-05-11 16:18 GMT+02:00 Márton Balassi <[hidden email]>:
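For illustration, a blunt serializer of this kind might look roughly like the sketch below, shown for a linear model (weights plus intercept) rather than the SVM mentioned above. It is plain Java with no Flink dependency; the class `LinearModelIO`, its methods, and the file layout are assumptions made up for this example, not a FlinkML format or API.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, NOT part of FlinkML: dump and reload linear model parameters.
class LinearModelIO {

    /** Assumed layout: weight count (int), each weight (double), intercept (double). */
    static void save(Path file, double[] weights, double intercept) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(weights.length);
            for (double w : weights) {
                out.writeDouble(w);
            }
            out.writeDouble(intercept);
        }
    }

    /** Returns the weights followed by the intercept as the last element. */
    static double[] load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int n = in.readInt();
            double[] params = new double[n + 1];
            for (int i = 0; i <= n; i++) {
                params[i] = in.readDouble();
            }
            return params;
        }
    }

    /** Rebuilt model: dot product of the weights with x, plus the intercept. */
    static double predict(double[] params, double[] x) {
        double y = params[params.length - 1];
        for (int i = 0; i < x.length; i++) {
            y += params[i] * x[i];
        }
        return y;
    }
}
```

On the batch side you would collect the trained weights and call `save`; on the streaming side, one option is to call `load` once in the `open()` method of a rich function and then use `predict` per record. The file has to be reachable from every task manager for that to work.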
|