Using Stateful Functions within a Flink pipeline

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Using Stateful Functions within a Flink pipeline

Annemarie Burger
I was wondering if it is possible to use a Stateful Function within a Flink
pipeline. I know they work with different API's, so I was wondering if it is
possible to have a DataStream as ingress for a Stateful Function.

Some context: I'm working on a streaming graph analytics system, and want to
save the state of the graph within a window. Stateful functions could then
allow me to process these distributed graph states by making use of the SF
messaging.

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

Oytun Tez
Hi Annemarie,

Unfortunately this is not possible at the moment, but DataStream as in/egress is in the plans as much as I know.



 --

MotaWord
Oytun Tez
M O T A W O R D CTO & Co-Founder
[hidden email]
     


On Wed, Apr 22, 2020 at 10:26 AM Annemarie Burger <[hidden email]> wrote:
I was wondering if it is possible to use a Stateful Function within a Flink
pipeline. I know they work with different API's, so I was wondering if it is
possible to have a DataStream as ingress for a Stateful Function.

Some context: I'm working on a streaming graph analytics system, and want to
save the state of the graph within a window. Stateful functions could then
allow me to process these distributed graph states by making use of the SF
messaging.

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

Igal Shilman
In reply to this post by Annemarie Burger
Hi Annemarie,
There are plans to make stateful functions more easily embeddable within a Flink Job, 
perhaps skipping ingress/egress routing abstracting all together and basically exposing the 
core Flink job that is the heart of stateful functions. 
Although these plans are not concrete yet I believe this would be brought to discussion with the community in the upcoming weeks.

Currently, you can split your pipeline to a preprocessing Flink job, and a stateful functions job.

Good luck,
Igal.

On Wed, Apr 22, 2020 at 4:26 PM Annemarie Burger <[hidden email]> wrote:
I was wondering if it is possible to use a Stateful Function within a Flink
pipeline. I know they work with different API's, so I was wondering if it is
possible to have a DataStream as ingress for a Stateful Function.

Some context: I'm working on a streaming graph analytics system, and want to
save the state of the graph within a window. Stateful functions could then
allow me to process these distributed graph states by making use of the SF
messaging.

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

Oytun Tez
@Igal, this sounds more comprehensive (better) than just opening DataStreams: "basically exposing the core Flink job that is the heart of stateful functions. "

Great!



 --

MotaWord
Oytun Tez
M O T A W O R D CTO & Co-Founder
[hidden email]
     


On Wed, Apr 22, 2020 at 10:56 AM Igal Shilman <[hidden email]> wrote:
Hi Annemarie,
There are plans to make stateful functions more easily embeddable within a Flink Job, 
perhaps skipping ingress/egress routing abstracting all together and basically exposing the 
core Flink job that is the heart of stateful functions. 
Although these plans are not concrete yet I believe this would be brought to discussion with the community in the upcoming weeks.

Currently, you can split your pipeline to a preprocessing Flink job, and a stateful functions job.

Good luck,
Igal.

On Wed, Apr 22, 2020 at 4:26 PM Annemarie Burger <[hidden email]> wrote:
I was wondering if it is possible to use a Stateful Function within a Flink
pipeline. I know they work with different API's, so I was wondering if it is
possible to have a DataStream as ingress for a Stateful Function.

Some context: I'm working on a streaming graph analytics system, and want to
save the state of the graph within a window. Stateful functions could then
allow me to process these distributed graph states by making use of the SF
messaging.

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

m@xi
In reply to this post by Igal Shilman
Dear Igal,

Can you elaborate more on your proposed solution of splitting the pipeline?

If possible, providing some skeleton pseudocode would be awesome!

Thanks in advance.

Best,
Max



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

Igal Shilman
Hi Max,

Sorry for not being clearer earlier, by splitting the pipeline I mean: having a pre-processing job that does whatever transformations necessary with the DataStream and outputs to Kafka / Kinesis,
and then having a separate StateFun deployment that consumes from that transformed Kafka / Kinesis [1] topic. 


Good luck,
Igal.

On Sun, Apr 26, 2020 at 6:43 PM m@xi <[hidden email]> wrote:
Dear Igal,

Can you elaborate more on your proposed solution of splitting the pipeline?

If possible, providing some skeleton pseudocode would be awesome!

Thanks in advance.

Best,
Max



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Using Stateful Functions within a Flink pipeline

Annemarie Burger
Hi Igal,

Thanks for your responses.
Regarding "having a pre-processing job that does whatever transformations
necessary with the DataStream and outputs to Kafka / Kinesis, and then
having a separate StateFun deployment that consumes from that transformed
Kafka / Kinesis topic."
I was wondering how to do this when using windows. I would like to call the
Stateful Function job on each separate window  of the stream. This because
the stateful function only needs the information from within the window for
its processing. Do you have any idea how to achieve this?

Best,
Annemarie



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/