Hi mates ! I'm in the beginning of the road of building a recommendation pipeline on top of Flink. I'm going to register a list of UDF python functions on job startups where each UDF is an ML model. Over time new model versions appear in the ML registry and I would like to update my UDF functions on the fly without need to restart the whole job. Could you tell me, whether it's possible or not ? Maybe the community can give advice on how such tasks can be solved using Flink and what other approaches exist. Thanks a lot for your help and advice ! |
Hi Rinat, Which API are you using? If you use datastream API, the common way to simulate side inputs (which is what you need) is to use a broadcast. There is an example on SO [1]. On Sat, Oct 10, 2020 at 7:12 PM Sharipov, Rinat <[hidden email]> wrote:
-- Arvid Heise | Senior Java Developer Follow us @VervericaData -- Join Flink Forward - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbHRegistered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng |
Hi Arvid, thx for your reply. We are already using the approach with control streams to propagate business rules through our data-pipeline. Because all our models are powered by Python, I'm going to use Table API and register UDF functions, where each UDF is a separate model. So my question is - can I update the UDF function on the fly without a job restart ? Because new model versions become available on a daily basis and we should use them as soon as possible. Thx ! пн, 12 окт. 2020 г. в 11:32, Arvid Heise <[hidden email]>:
|
Hi Rinat, Do you want to replace the UDFs with new ones on the fly or just want to update the model which could be seen as instance variables inside the UDF? For the former case, it's not supported AFAIK. For the latter case, I think you could just update the model in the UDF periodically or according to some custom strategy. It's the behavior of the UDF. Regards, Dian
|
Hi Dian, thx for your reply ! I was wondering to replace UDF on the fly from Flink, of course I'm pretty sure that it's possible to implement update logic directly in Python, thx for idea Regards, Rinat пн, 12 окт. 2020 г. в 14:20, Dian Fu <[hidden email]>:
|
Free forum by Nabble | Edit this page |