Distributed Incremental Streaming Graph Analytics: State Accessing/Message Passing Options

Annemarie Burger
I'm working on a system to process streaming graphs in Flink. I am trying to
maintain the state of the graph within a time window, so I can then run
graph algorithms on it. The goal is to do this with incremental updates, so
the state does not have to be fully recomputed for each window. I figured
keying on source vertex and then storing the adjacent edges in the
ProcessWindowFunction state could be a potential way to achieve this.
However, for this scenario, I am looking for a proper way to access this
streaming graph state in a distributed manner from downstream operators
(other than those maintaining the state). So, essentially: how can I access
state that is stored on a different node from the one doing the processing?
I also read about Stateful Functions, which I believe could be another
potential way to store the windowed graph state. What do you believe is the
better, more efficient option? Also, are there any other options I should
consider?
Thanks!
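
To make the keying idea above concrete, here is a minimal sketch in plain Java of adjacency lists keyed by source vertex, with edges added and expired incrementally rather than rebuilt per window. It deliberately does not use the Flink API: `WindowedAdjacency`, `addEdge`, `advanceTo`, and `outDegree` are hypothetical names, the `HashMap` merely stands in for keyed state, and it assumes timestamps arrive in order for each source vertex.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for keyed state: adjacency lists keyed by source vertex. */
class WindowedAdjacency {
    /** One outgoing edge together with the time at which it arrived. */
    static final class Edge {
        final long target;
        final long timestamp;
        Edge(long target, long timestamp) { this.target = target; this.timestamp = timestamp; }
    }

    private final long windowSize;
    private final Map<Long, Deque<Edge>> edgesBySource = new HashMap<>();
    private long edgeCount = 0;

    WindowedAdjacency(long windowSize) { this.windowSize = windowSize; }

    /** Incremental insert: only the entry of the source vertex is touched. */
    void addEdge(long source, long target, long timestamp) {
        edgesBySource.computeIfAbsent(source, k -> new ArrayDeque<>())
                     .addLast(new Edge(target, timestamp));
        edgeCount++;
    }

    /** Incremental eviction: drop edges with timestamp <= now - windowSize. */
    void advanceTo(long now) {
        long cutoff = now - windowSize;
        Iterator<Map.Entry<Long, Deque<Edge>>> it = edgesBySource.entrySet().iterator();
        while (it.hasNext()) {
            Deque<Edge> edges = it.next().getValue();
            // Oldest edges sit at the head because of the in-order assumption.
            while (!edges.isEmpty() && edges.peekFirst().timestamp <= cutoff) {
                edges.removeFirst();
                edgeCount--;
            }
            if (edges.isEmpty()) it.remove();
        }
    }

    int outDegree(long source) {
        Deque<Edge> edges = edgesBySource.get(source);
        return edges == null ? 0 : edges.size();
    }

    long edgeCount() { return edgeCount; }

    public static void main(String[] args) {
        WindowedAdjacency graph = new WindowedAdjacency(10);
        graph.addEdge(1, 2, 0);
        graph.addEdge(1, 3, 5);
        graph.advanceTo(12); // expires only the edge that arrived at time 0
        System.out.println(graph.outDegree(1) + " outgoing edge(s) left for vertex 1");
    }
}
```

In this sketch `advanceTo` scans every vertex for simplicity; a real job would more likely expire edges per key, for example with timers.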



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Distributed Incremental Streaming Graph Analytics: State Accessing/Message Passing Options

rmetzger0
Hey,

I would recommend using Stateful Functions for that use case.

> how to access state that is stored in another node than the one doing the processing.

This is not possible in an efficient and clean way in Flink. There are hacks (using queryable state), but I would not recommend them.
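
To illustrate the message-passing alternative, here is a rough sketch in plain Java of the general idea: each vertex owns its own state, and other vertices reach it only by sending it a message. It is not the Stateful Functions SDK. `VertexMessaging`, `Message`, `VertexState`, and the single in-memory mailbox are all hypothetical stand-ins, and the example algorithm (insert-only connected components via minimum-label propagation) was picked only to show incremental updates.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Hypothetical single-process simulation of addressable, stateful vertices. */
class VertexMessaging {
    enum Kind { EDGE, LABEL }

    /** A message addressed to exactly one vertex. */
    static final class Message {
        final long to;
        final Kind kind;
        final long payload; // neighbour id for EDGE, component label for LABEL
        Message(long to, Kind kind, long payload) {
            this.to = to; this.kind = kind; this.payload = payload;
        }
    }

    /** State owned by one vertex; nobody else reads it directly. */
    static final class VertexState {
        final Set<Long> neighbors = new HashSet<>();
        long label;
        VertexState(long id) { this.label = id; }
    }

    private final Map<Long, VertexState> states = new HashMap<>();
    private final Queue<Message> mailbox = new ArrayDeque<>();

    /** Ingest one undirected edge and process all messages it triggers. */
    void addEdge(long u, long v) {
        mailbox.add(new Message(u, Kind.EDGE, v));
        mailbox.add(new Message(v, Kind.EDGE, u));
        drain();
    }

    private void drain() {
        Message m;
        while ((m = mailbox.poll()) != null) {
            VertexState s = states.computeIfAbsent(m.to, id -> new VertexState(id));
            if (m.kind == Kind.EDGE) {
                // Remember the neighbour, then tell it our current label.
                s.neighbors.add(m.payload);
                mailbox.add(new Message(m.payload, Kind.LABEL, s.label));
            } else if (m.payload < s.label) {
                // A smaller label arrived: adopt it and notify all neighbours.
                s.label = m.payload;
                for (long n : s.neighbors) {
                    mailbox.add(new Message(n, Kind.LABEL, s.label));
                }
            }
        }
    }

    /** Component label of a vertex; unseen vertices are their own component. */
    long labelOf(long v) {
        VertexState s = states.get(v);
        return s == null ? v : s.label;
    }

    public static void main(String[] args) {
        VertexMessaging graph = new VertexMessaging();
        graph.addEdge(1, 2);
        graph.addEdge(3, 4);
        graph.addEdge(2, 3); // joins the two components incrementally
        System.out.println("label of vertex 4: " + graph.labelOf(4));
    }
}
```

Each new edge triggers only the messages needed to update the affected labels, so nothing is recomputed from scratch. Handling edge removal when a window expires would need more than this sketch shows.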


On Mon, Apr 20, 2020 at 10:30 PM burgeraw <[hidden email]> wrote:
> I'm working on a system to process streaming graphs in Flink. [...]