Re: Question about Writing Incremental Graph Algorithms using Apache Flink Gelly

Posted by kaansancak on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Question-about-Writing-Incremental-Graph-Algorithms-using-Apache-Flink-Gelly-tp34253p34369.html

If the vertex type is POJO what happens during the union of the graph? Is there a persistent approach, or can we define a function handle such occasions?

Would there be a performance difference between two cases:

1)
Graph graph = … // From edges list

graph = graph.runScatterGatherIteration();

Graph secondGraph = … // From second edge list

graph = graph.union(secondGraph).runScatterGatherIteration()

2)

Graph graph = … // From edges list

graph = graph.runScatterGatherIteration();

graph.addEdges(second_edge_list)

graph = graph.runScatterGatherIteration();


Before starting the second scatter-gather, I want to set/reset some fields of the vertex value of the vertices that are effected by edge additions/deletions (or union). It would be good to have a callback function that touches the end-points of the edges that are added/deleted.

Best
Kaan



On Apr 15, 2020, at 11:07 AM, Till Rohrmann <[hidden email]> wrote:

Hi Kaan,

I think what you are proposing is something like this:

Graph<Long, Double, Double> graph = ... // get first batch

Graph<Long, Double, Double> graphAfterFirstSG = graph.runScatterGatherIteration();

Graph<Long, Double, Double> secondBatch = ... // get second batch

// Adjust the result of SG iteration with secondBatch

Graph<Long, Double, Double> updatedGraph = graphAfterFirstSG.union/difference(secondBatch));

updatedGraph.runScatterGatherIteration();

Then I believe this should work.

Cheers,
Till

On Wed, Apr 15, 2020 at 1:14 AM Kaan Sancak <[hidden email]> wrote:
Thanks for the useful information! It seems like a good and fun idea to experiment. I will definitely give it a try.

I have a very close upcoming deadline and I have already implemented the Scatter-Gather iteration algorithm.

I have another question on whether we can chain Scatter-Gather or Vertex-Centric iterations.
Let’s say that we have an initial batch/dataset, we run a Scatter-Gather and obtain graph.
Using another batch we added/deleted vertices to the graph we obtained. 
Now we run another Scatter-Gather on the modified graph.

This is no streaming but a naive way to simulate batch updates that are happening concurrently.
Do you think it is a feasible way to do this way? 

Best
Kaan

On Apr 13, 2020, at 11:16 PM, Tzu-Li (Gordon) Tai <[hidden email]> wrote:

Hi,

As you mentioned, Gelly Graph's are backed by Flink DataSets, and therefore
work primarily on static graphs. I don't think it'll be possible to
implement incremental algorithms described in your SO question.

Have you tried looking at Stateful Functions, a recent new API added to
Flink?
It supports arbitrary messaging between functions, which may allow you to
build what you have in mind.
Take a look at Seth's an Igal's comments here [1], where there seems to be a
similar incremental graph-processing use case for sessionization.

Cheers,
Gordon

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Complex-graph-based-sessionization-potential-use-for-stateful-functions-td34000.html#a34017



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/