Dynamic statefun topologies

Dynamic statefun topologies

Frédérique Mittelstaedt
Hi!

Thanks for all the great work on both Flink and StateFun. I saw this recent email thread (https://lists.apache.org/thread.html/re984157869f5efd136cda9d679889e6ba2f132213ae7afff715783e2%40%3Cuser.flink.apache.org%3E) and we’re looking at a similar problem.

Right now, StateFun requires you to specify a YAML config at start-up that sets up all the bindings. If you want to change that config, you effectively need to restart your Flink instance.
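
For concreteness, the bindings we mean are the function, ingress, and egress specs in module.yaml, along these lines (the endpoint, topics, and type names below are made up):

    version: "2.0"
    module:
      meta:
        type: remote
      spec:
        functions:
          - function:
              meta:
                kind: http
                type: example/greeter
              spec:
                endpoint: http://functions:8000/statefun
                states:
                  - seen_count
        ingresses:
          - ingress:
              meta:
                type: statefun.kafka.io/routable-protobuf-ingress
                id: example/names
              spec:
                address: kafka:9092
                consumerGroupId: greeter-group
                topics:
                  - topic: names
                    typeUrl: com.googleapis/example.GreetRequest
                    targets:
                      - example/greeter
        egresses:
          - egress:
              meta:
                type: statefun.kafka.io/generic-egress
                id: example/greets
              spec:
                address: kafka:9092
                deliverySemantic:
                  type: at-least-once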

We’re looking to use Flink StateFun with dynamic topologies, i.e. based on a config change in another system, we want to create, update, or delete bindings.

From the email thread, it looks like there’s going to be support for dynamic function dispatch by name patterns, which is pretty cool, but it sounds like you still need to redeploy if you add a new ingress or egress. Is that correct?

Are there plans to support such a dynamic use case? Or is there already a way to achieve this that I’m not aware of?

For now, we’re considering generating the YAML dynamically and, whenever a change is necessary, restarting Flink with the new config. We can create a savepoint before teardown and resume from it after the restart, but that adds quite a bit of complexity and potential points of failure. Would this be the recommended approach for achieving this kind of dynamic config right now?
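
Concretely, the cycle we have in mind looks roughly like this with the Flink CLI (the job id, paths, and jar name below are placeholders):

    # Stop the running job, taking a savepoint first; the CLI prints the savepoint path.
    ./bin/flink stop --savepointPath s3://bucket/savepoints <job-id>

    # ... regenerate module.yaml / rebuild the artifact with the new bindings ...

    # Resubmit the job, restoring its state from the savepoint.
    ./bin/flink run -s s3://bucket/savepoints/savepoint-abc123 statefun-job.jar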

Alternatively, I also saw that you can deploy jars to the Flink cluster, but the code samples all seem to be for JVM functions. Is it possible to submit remote function jobs as jars to Flink? If so, how is that done, and do you have a link to an example?

Thanks a lot for your help & all the best,
Frédérique

Re: Dynamic statefun topologies

Igal Shilman
Hi Frédérique!

Thank you for your kind words! Let me try to answer your questions:

From the email thread, it looks like there’s going to be support for dynamic function dispatch by name patterns, which is pretty cool, but it sounds like you still need to redeploy if you add a new ingress or egress. Is that correct?
This is correct.

Are there plans to support such a dynamic use case? Or is there already a way to achieve this that I’m not aware of?
There are no plans to support that at the moment, but we would be very happy to learn more about your use case and possibly re-prioritize this.

For now, we’re considering generating the YAML dynamically and, whenever a change is necessary, restarting Flink with the new config. We can create a savepoint before teardown and resume from it after the restart, but that adds quite a bit of complexity and potential points of failure. Would this be the recommended approach for achieving this kind of dynamic config right now?
Your suggestion can definitely work, and I think it really depends on what you are trying to achieve. For example: are you trying to add or remove the Kafka topics you consume from? Are you trying to add completely different ingresses? Or do the topics you consume from stay the same, while you would like to change the routing dynamically?

Alternatively, I also saw that you can deploy jars to the Flink cluster, but the code samples all seem to be for JVM functions. Is it possible to submit remote function jobs as jars to Flink? If so, how is that done, and do you have a link to an example?
This is possible with remote functions as well, as long as the module.yaml can be found on the classpath, or alternatively by using the DataStream integration [1].
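
To give a rough idea (this sketch uses the StateFun 2.2 DataStream SDK; the endpoint and names are placeholders), binding a remote function from a regular Flink job submitted as a jar looks something like this:

    import java.net.URI;
    import java.time.Duration;
    import org.apache.flink.statefun.flink.datastream.RequestReplyFunctionBuilder;
    import org.apache.flink.statefun.flink.datastream.RoutableMessage;
    import org.apache.flink.statefun.flink.datastream.RoutableMessageBuilder;
    import org.apache.flink.statefun.flink.datastream.StatefulFunctionDataStreamBuilder;
    import org.apache.flink.statefun.flink.datastream.StatefulFunctionEgressStreams;
    import org.apache.flink.statefun.sdk.FunctionType;
    import org.apache.flink.statefun.sdk.io.EgressIdentifier;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RemoteFunctionJob {

      // The remote function to invoke, served behind an HTTP endpoint.
      static final FunctionType GREETER = new FunctionType("example", "greeter");

      // A typed egress that the job reads back as a plain DataStream.
      static final EgressIdentifier<String> GREETINGS =
          new EgressIdentifier<>("example", "greets", String.class);

      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Wrap each incoming element as a message addressed to the remote function.
        DataStream<RoutableMessage> names =
            env.fromElements("Frédérique", "Igal")
                .map(name ->
                    RoutableMessageBuilder.builder()
                        .withTargetAddress(GREETER, name)
                        .withMessageBody(name)
                        .build());

        StatefulFunctionEgressStreams out =
            StatefulFunctionDataStreamBuilder.builder("example")
                .withDataStreamAsIngress(names)
                .withRequestReplyRemoteFunction(
                    RequestReplyFunctionBuilder.requestReplyFunctionBuilder(
                            GREETER, URI.create("http://functions:8000/statefun"))
                        .withMaxRequestDuration(Duration.ofSeconds(15))
                        .withMaxNumBatchRequests(500))
                .withEgressId(GREETINGS)
                .build(env);

        // Read the function's responses back as an ordinary stream.
        out.getDataStreamForEgressId(GREETINGS).print();

        env.execute("remote-function-example");
      }
    }

Such a jar can then be submitted through the regular Flink job submission mechanisms.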


Kind regards,
Igal.

