Dear All,
I wonder how to write the pagerank program in the spargel API if there might be sinks in the graph. What is the nicest way to solve this? Thank you for your answer. Attila |
Hi! With "sinks" in the graph, you mean vertices with no out-links? There might be a simple trick, by adding to each vertex an edge-to-self (put an entry in the diagonal of the adjacency matrix). I have not thought through the implications 100%. @ssc Can you elaborate on this? What would always work is that you gather statistics about how much probability is accumulated in the sinks and redistribute it across the other nodes. The iteration aggregators allow you to do this. They can sum up the probability in the message sender function (when there is no outgoing edge), and re-add it to the non-sink nodes (by accessing the aggregate from the previous iteration). Have a look at the function "registerAggregator()" on the "VertexCentricIteration", and the Functions "getIterationAggregator()" and "getPreviousIterationAggregate()" on the VertexUpdateFunction and the MessagingFunction. Stephan On Thu, Sep 18, 2014 at 5:01 PM, Attila Bernáth <[hidden email]> wrote: Dear All, |
Dear Stephan,
Thank you for your answer. > With "sinks" in the graph, you mean vertices with no out-links? Yes, I do. > There might be a simple trick, by adding to each vertex an edge-to-self (put > an entry in the diagonal of the adjacency matrix). > > I have not thought through the implications 100%. > @ssc Can you elaborate on this? I don't think that this works. > > > What would always work is that you gather statistics about how much > probability is accumulated in the sinks and redistribute it across the other > nodes. > > The iteration aggregators allow you to do this. They can sum up the > probability in the message sender function (when there is no outgoing edge), > and re-add it to the non-sink nodes (by accessing the aggregate from the > previous iteration). thought that the aggregate from the previous iteration is of no use. I will try this. Thanks. Attila > > Have a look at the function "registerAggregator()" on the > "VertexCentricIteration", and the Functions "getIterationAggregator()" and > "getPreviousIterationAggregate()" on the VertexUpdateFunction and the > MessagingFunction. > > > Stephan > > On Thu, Sep 18, 2014 at 5:01 PM, Attila Bernáth <[hidden email]> > wrote: >> >> Dear All, >> >> I wonder how to write the pagerank program in the spargel API if there >> might be sinks in the graph. >> >> What is the nicest way to solve this? >> >> Thank you for your answer. >> >> Attila > > |
Hi Attila,
The common solution for handling vertices with out outlinks is to aggregate the probability mass that is assigned to them and to uniformly redistribute this among all vertices in every iteration. You can have a look at Giraph's RandomWalkVertex which uses this approach: https://github.com/apache/giraph/blob/release-1.0/giraph-examples/src/main/java/org/apache/giraph/examples/RandomWalkVertex.java Best, Sebastian On 09/19/2014 10:36 AM, Attila Bernáth wrote: > Dear Stephan, > > Thank you for your answer. > >> With "sinks" in the graph, you mean vertices with no out-links? > Yes, I do. > >> There might be a simple trick, by adding to each vertex an edge-to-self (put >> an entry in the diagonal of the adjacency matrix). >> >> I have not thought through the implications 100%. >> @ssc Can you elaborate on this? > I don't think that this works. > > >> >> >> What would always work is that you gather statistics about how much >> probability is accumulated in the sinks and redistribute it across the other >> nodes. >> >> The iteration aggregators allow you to do this. They can sum up the >> probability in the message sender function (when there is no outgoing edge), >> and re-add it to the non-sink nodes (by accessing the aggregate from the >> previous iteration). > Thank you for the idea. I was trying to use an aggregator, but I > thought that the aggregate from the previous iteration is of no use. > > I will try this. > > Thanks. > > Attila > > >> >> Have a look at the function "registerAggregator()" on the >> "VertexCentricIteration", and the Functions "getIterationAggregator()" and >> "getPreviousIterationAggregate()" on the VertexUpdateFunction and the >> MessagingFunction. >> >> >> Stephan >> >> On Thu, Sep 18, 2014 at 5:01 PM, Attila Bernáth <[hidden email]> >> wrote: >>> >>> Dear All, >>> >>> I wonder how to write the pagerank program in the spargel API if there >>> might be sinks in the graph. >>> >>> What is the nicest way to solve this? >>> >>> Thank you for your answer. >>> >>> Attila >> >> |
Free forum by Nabble | Edit this page |