Hi all,
Sorry, this is me again with another question. Maybe I did not search deeply enough, but it seems the FlinkML API is still purely batch. Reading https://cwiki.apache.org/confluence/display/FLINK/FlinkML%3A+Vision+and+Roadmap, there was apparently an intention to "exploit the streaming nature of Flink, and provide functionality designed specifically for data streams", but from my external point of view I don't see much happening here. Is there work in progress towards that? I would personally see two use-cases around streaming: first, updating an existing model that was built in batch; second, triggering predictions from a stream job rather than a batch job. Are these things in the works, or maybe already feasible even though the API looks purely batch-branded? Thanks, -- Christophe
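The first use-case mentioned above (refining a batch-trained model with streaming data) can be sketched in plain, dependency-free Java. This is only an illustration of the pattern, not any Flink or FlinkML API: the class name, the hard-coded weights, and the single-step SGD update are all assumptions made up for the example.

```java
// Hypothetical sketch: a linear model whose weights came from a batch
// job, refined incrementally as labeled events arrive on a stream.
// Nothing here is a Flink API; it only illustrates the idea.
public class OnlineModelUpdate {
    // Pretend these coefficients were produced by an offline batch job.
    static double[] weights = {0.5, -0.2};
    static final double LEARNING_RATE = 0.01;

    // Score one event with the current model.
    static double predict(double[] features) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    // One stochastic-gradient step: nudge the weights toward the label.
    static void update(double[] features, double label) {
        double error = label - predict(features);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += LEARNING_RATE * error * features[i];
        }
    }

    public static void main(String[] args) {
        double before = predict(new double[]{1.0, 1.0});
        // A stream of labeled events refines the batch model in place.
        update(new double[]{1.0, 1.0}, 1.0);
        update(new double[]{1.0, 1.0}, 1.0);
        double after = predict(new double[]{1.0, 1.0});
        System.out.println(before + " -> " + after);
    }
}
```

In a real Flink job the mutable weights would live in keyed or operator state so they survive failures, rather than in a static field.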
Hi Christophe, it is true that FlinkML only targets batch workloads, and there has not been any development for a long time. If you dig through the mailing list thread, you'll find a link to a Google doc that discusses other possible directions. Best, Fabian 2018-02-05 16:43 GMT+01:00 Christophe Jolif <[hidden email]>:
Fabian, I suspect I am not the only one who would love to apply machine learning as part of a Flink pipeline. While waiting for FLIP-23, what are the "best" practices today? Thanks again for your help, -- Christophe On Mon, Feb 5, 2018 at 6:01 PM, Fabian Hueske <[hidden email]> wrote:
That's correct. It's not possible to persist data in memory across jobs in Flink's batch API. Best, Fabian 2018-02-05 18:28 GMT+01:00 Christophe Jolif <[hidden email]>:
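Since jobs cannot share in-memory data, the common workaround for the second use-case is to export the batch-trained model (e.g. as a file of coefficients) and load it inside the streaming job, typically once per task in a `RichMapFunction`'s `open()` method, then apply it in `map()`. Below is a dependency-free sketch of that scoring pattern; `java.util.stream` stands in for a `DataStream` so the example runs anywhere, and the class name and weights are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: score a stream of events with a model trained
// offline. In an actual Flink job the coefficients would be loaded in
// RichMapFunction.open() and applied in map(); java.util.stream is
// used here only so the example is self-contained.
public class StreamScoring {
    // Pretend these coefficients were exported by a batch training job.
    static final double[] WEIGHTS = {2.0, -1.0};

    // Apply the frozen model to one event.
    static double score(double[] features) {
        double s = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            s += WEIGHTS[i] * features[i];
        }
        return s;
    }

    public static void main(String[] args) {
        List<Double> scores = Stream.of(
                new double[]{1.0, 0.0},
                new double[]{0.0, 1.0},
                new double[]{1.0, 1.0})
            .map(StreamScoring::score)   // the "map" step of the stream job
            .collect(Collectors.toList());
        System.out.println(scores);      // prints [2.0, -1.0, 1.0]
    }
}
```

The key design point is that the model is read-only inside the streaming job, so no state needs to be shared across jobs; updating the model means restarting or reconfiguring the streaming job with new coefficients.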
|