Understanding Flink statefun deployment

Understanding Flink statefun deployment

slinkydeveloper
Hi everybody,
I'm quite new to Flink and Flink Statefun, and I'm trying to understand the
deployment techniques on k8s.
I'd like to understand whether it's feasible to deploy a statefun project
with the different functions split across separate deployments (so that
some functions are remote and some are embedded), all connected to the
same master. The idea is that I can scale the deployments independently
using the Kubernetes HPA, and these instances cooperate automatically
through the same master. For example, given a flow like kafka -> fn a ->
fn b -> kafka:

* Remote function A (plus ingress) in deployment fn-a, where the function
process is deployed as another container in the same pod
* Embedded function B (plus egress) in deployment fn-b
* Master deployment in flink-master
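
Sketched as a Kubernetes manifest, the fn-a piece of that layout would look
roughly like this (only a sketch of what I have in mind; all deployment and
image names are made up):

```yaml
# Sketch of the fn-a deployment from the flow above -- names are illustrative.
# The worker and the remote function process share a pod, so the worker can
# reach the function over localhost; the HPA scales the whole deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fn-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fn-a
  template:
    metadata:
      labels:
        app: fn-a
    spec:
      containers:
        - name: worker              # Flink worker, also hosting the Kafka ingress
          image: my-statefun-worker:latest
        - name: fn-a                # remote function process as a sidecar
          image: my-fn-a:latest
```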

Does that make sense at all in Flink architecture? If it's feasible, do you
have any example?

FG

--
Francesco Guardiani
Website: https://slinkydeveloper.com/
Twitter: https://twitter.com/SlinkyGuardiani

Github: https://github.com/slinkydeveloper

Re: Understanding Flink statefun deployment

Igal Shilman
Hi Francesco,

It is absolutely possible to deploy some functions as embedded and some as remote, and scale them independently, while they are all technically part of the same
stateful functions application (I think that's what you meant by "sharing the same master").

One possible way to do it in k8s is to have separate deployments:
1) Embedded functions would be bundled with the Docker image that starts the Flink cluster (the flink-statefun Docker image).
2) Remote functions would be packaged in a separate image, deployed as a separate Kubernetes deployment, and reachable via a k8s service.

For the second part, check out the demo at the end of the keynote [1] and also [2][3][4]
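
As a rough sketch of point (2) — all names, images, and ports here are made up:

```yaml
# Remote functions behind a k8s Service -- all names are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: remote-functions
spec:
  replicas: 3                    # scale this independently of the Flink cluster
  selector:
    matchLabels:
      app: remote-functions
  template:
    metadata:
      labels:
        app: remote-functions
    spec:
      containers:
        - name: functions
          image: my-remote-functions:latest   # serves the statefun HTTP endpoint
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: remote-functions
spec:
  selector:
    app: remote-functions
  ports:
    - port: 8000
```

The Flink-side module.yaml would then point the runtime at that service, e.g. an endpoint like http://remote-functions.default.svc:8000/statefun.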


Good luck,
Igal.



 

On Tue, Jun 9, 2020 at 10:50 AM Francesco Guardiani <[hidden email]> wrote:


Re: Understanding Flink statefun deployment

slinkydeveloper
Hi Igal, thanks for your help.
If I understood correctly, the Flink deployments (not the functions) need
to use the same image, right? Which means that the Flink master and all
workers still need to use the same image, which includes the module.yaml and
the jar with the embedded modules of the full project, right?
I was looking for something different: scaling the workers independently,
together with the functions. I'm experimenting with that here:
https://github.com/slinkydeveloper/playing-with-statefun

In this project I'm trying to deploy separately:

* ingress
* egress
* "mapper" function
* "greeter" function

They're all "embedded functions", and I wish to deploy each piece in a
separate deployment.
In the next iteration of the experiment I wish to create "remote functions",
deploying them in the same pod as the workers, so that the worker talks to
the function image over localhost.

Hence my question: is it possible to deploy one worker per function/group of
functions and compose my application out of multiple heterogeneous worker
images? If not, does it make sense to do this at all, or does it just not
fit the statefun architecture?

FG



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Understanding Flink statefun deployment

Seth Wiesman
Hi Francesco,

No, that architecture is not possible. I'm not sure if you've used Flink's DataStream API, but embedded functions are, under the hood, very much like lightweight process functions. If you have a single DataStream application with two process functions, you cannot scale their workers independently, because they share the same workers. The solution in that case is to separate them into separate jobs which communicate through a message bus such as Kafka.

What you are describing aligns much more with remote function deployments. The functions are deployed as stateless services and can scale independently; you could, for instance, put functions behind a load balancer or use a FaaS. The statefun runtime and workers then only manage message passing and state storage. Even if one function becomes heavy, its compute can be trivially scaled. Note that while there is only a remote Python SDK at the moment, the runtime only sees a single HTTP endpoint, which could be implemented in any language. In future releases the community hopes to quickly add more language SDKs, such as Java, Node.js, Rust, and Golang.
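
To illustrate, a remote function is just an HTTP endpoint declared in module.yaml; a sketch along these lines (the function type, endpoint URL, and state name are made up):

```yaml
# Sketch of a remote function declaration in module.yaml
# (function type, endpoint URL, and state name are illustrative).
version: "1.0"
module:
  meta:
    type: remote
  spec:
    functions:
      - function:
          meta:
            kind: http
            type: example/greeter        # namespace/name the runtime routes messages to
          spec:
            endpoint: http://greeter.default.svc:8000/statefun
            states:
              - seen_count               # state kept by the statefun workers,
                                         # shipped to the function with each call
```

The runtime only needs this endpoint; how many replicas serve it, and in which language, is invisible to Flink.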

Seth

On Thu, Jun 11, 2020 at 1:30 AM slinkydeveloper <[hidden email]> wrote: