Re: Spargel pagerank with sinks

Posted by Stephan Ewen on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Spargel-pagerank-with-sinks-tp92p93.html

Hi!

By "sinks" in the graph, do you mean vertices with no out-links?

There might be a simple trick: add a self-edge to each vertex (that is, put an entry on the diagonal of the adjacency matrix).

I have not thought through the implications 100%.
@ssc Can you elaborate on this? 
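
To make the self-edge idea concrete, here is a minimal plain-Java sketch (not Spargel code). The class name SelfLoops and the helper withSelfLoops are made up for illustration, and the graph is assumed to be given as adjacency lists indexed by vertex id. It only shows the graph transformation; whether the resulting ranks are what you want is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SelfLoops {

    /** Returns a copy of the adjacency lists in which every vertex also links to itself. */
    static List<List<Integer>> withSelfLoops(List<List<Integer>> outLinks) {
        List<List<Integer>> result = new ArrayList<>();
        for (int v = 0; v < outLinks.size(); v++) {
            List<Integer> targets = new ArrayList<>(outLinks.get(v));
            if (!targets.contains(v)) {
                targets.add(v); // the diagonal entry of the adjacency matrix
            }
            result.add(targets);
        }
        return result;
    }

    public static void main(String[] args) {
        // Vertex 2 is a sink: it has no out-links.
        List<List<Integer>> graph = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(2),
                Collections.<Integer>emptyList());
        System.out.println(withSelfLoops(graph)); // prints [[1, 2, 0], [2, 1], [2]]
    }
}
```

After the transformation every vertex has at least one out-link, so no rank leaves the graph through a sink.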


What would always work is to gather statistics about how much probability accumulates in the sinks and redistribute it across the other nodes.

The iteration aggregators allow you to do this. You can sum up the sinks' probability in the messaging function (whenever a vertex has no outgoing edge)
and re-add it to the non-sink nodes (by accessing the aggregate from the previous iteration).

Have a look at the method "registerAggregator()" on the "VertexCentricIteration", and the methods "getIterationAggregator()" and "getPreviousIterationAggregate()" on the VertexUpdateFunction and the MessagingFunction.
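
As a rough illustration of the redistribution logic, here is a plain-Java sketch that deliberately does not use the Spargel API. The class SinkAwarePageRank, the adjacency-array input, and the damping parameter are all assumptions made for this example. A local variable stands in for the aggregator, and the sketch reads it within the same pass, whereas a real aggregator would only be visible in the following superstep.

```java
import java.util.Arrays;

final class SinkAwarePageRank {

    /**
     * PageRank over adjacency arrays. Rank held by sinks is summed in one
     * variable (standing in for an iteration aggregator) and spread evenly
     * over the non-sink vertices.
     */
    static double[] pageRank(int[][] outLinks, double damping, int iterations) {
        int n = outLinks.length;
        int nonSinks = 0;
        for (int[] targets : outLinks) {
            if (targets.length > 0) nonSinks++;
        }

        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int it = 0; it < iterations; it++) {
            double[] incoming = new double[n];
            double sinkMass = 0.0; // plays the role of the aggregator

            // "Messaging" step: send rank along out-links, or aggregate it for sinks.
            for (int v = 0; v < n; v++) {
                if (outLinks[v].length == 0) {
                    sinkMass += rank[v];
                } else {
                    double share = rank[v] / outLinks[v].length;
                    for (int target : outLinks[v]) incoming[target] += share;
                }
            }

            // "Update" step: each non-sink gets an equal part of the sink mass.
            // If every vertex is a sink, nobody can receive it and it is dropped.
            double bonus = nonSinks == 0 ? 0.0 : sinkMass / nonSinks;
            double[] next = new double[n];
            for (int v = 0; v < n; v++) {
                double extra = outLinks[v].length > 0 ? bonus : 0.0;
                next[v] = (1.0 - damping) / n + damping * (incoming[v] + extra);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] graph = { {1, 2}, {2}, {} }; // vertex 2 is a sink
        double[] rank = pageRank(graph, 0.85, 50);
        System.out.println(Arrays.toString(rank));
        System.out.println("sum = " + Arrays.stream(rank).sum());
    }
}
```

As long as the graph has at least one non-sink vertex, the ranks keep summing to 1, because everything the sinks hold is handed back on each pass. Spreading the mass over all vertices instead of only the non-sinks would be an equally reasonable variant.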


Stephan

On Thu, Sep 18, 2014 at 5:01 PM, Attila Bernáth <[hidden email]> wrote:
Dear All,

I wonder how to write the pagerank program in the spargel API if there
might be sinks in the graph.

What is the nicest way to solve this?

Thank you for your answer.

Attila