Hi Kaan,
For the first issue, I think the two implementation should have difference and the first should be slower, but I think which one to use should be depend on your algorithm if it could compute incrementally only with the changed edges. However, as far as I know I think most graph algorithm does not satisfy this property, therefore I think you might have to use the first one.
For the second issue, I think you might use Graph.getVertices() and graph.getEdges() to get the underlying vertices and edges dataset of the graph, then you could do any operations with the two datasets, like join the vertices dataset with the second edge list, and finally create a new Graph with new Graph(updated_vertices, edges, env).
Best,
Yun
------------------------------------------------------------------
Send Time:2020 Apr. 16 (Thu.) 17:17
Subject:Re: Question about Writing Incremental Graph Algorithms using Apache Flink Gelly
If the vertex type is POJO what happens during the union of the graph? Is there a persistent approach, or can we define a function handle such occasions?Would there be a performance difference between two cases:
1)
Graph graph = … // From edges list
graph = graph.runScatterGatherIteration();
Graph secondGraph = … // From second edge list
graph = graph.union(secondGraph).runScatterGatherIteration()
2)
Graph graph = … // From edges list
graph = graph.runScatterGatherIteration();
graph.addEdges(second_edge_list)
graph = graph.runScatterGatherIteration();
Before starting the second scatter-gather, I want to set/reset some fields of the vertex value of the vertices that are effected by edge additions/deletions (or union). It would be good to have a callback function that touches the end-points of the edges that are added/deleted.
Best
Kaan
Hi Kaan,
I think what you are proposing is something like this:
Graph<Long, Double, Double> graph = ... // get first batch
Graph<Long, Double, Double> graphAfterFirstSG = graph.runScatterGatherIteration();
Graph<Long, Double, Double> secondBatch = ... // get second batch
// Adjust the result of SG iteration with secondBatch
Graph<Long, Double, Double> updatedGraph = graphAfterFirstSG.union/difference(secondBatch));
updatedGraph.runScatterGatherIteration();
Then I believe this should work.
Cheers,
Till
Thanks for the useful information! It seems like a good and fun idea to experiment. I will definitely give it a try.
I have a very close upcoming deadline and I have already implemented the Scatter-Gather iteration algorithm.
I have another question on whether we can chain Scatter-Gather or Vertex-Centric iterations.
Let’s say that we have an initial batch/dataset, we run a Scatter-Gather and obtain graph.
Using another batch we added/deleted vertices to the graph we obtained.
Now we run another Scatter-Gather on the modified graph.
This is no streaming but a naive way to simulate batch updates that are happening concurrently.
Do you think it is a feasible way to do this way?
Best
Kaan
On Apr 13, 2020, at 11:16 PM, Tzu-Li (Gordon) Tai <
[hidden email]> wrote: