33 segments problem with configuration set


otherwise777
Hello Community,

I'm trying to write a function that determines the betweenness of the vertices in a graph. I'm using Gelly for this, together with a custom shortest-path function.
This is my input graph: http://prntscr.com/d7y51y

What I've done is call collect() on the vertex values and loop over the resulting list to determine the shortest paths from each of those nodes to the rest of the nodes. Inside the loop I use a union to put everything into one big DataSet called "collectionDataSet".
Here's the code: http://paste.thezomg.com/19919/79295357/

After the 9th iteration I get the error: Too few memory segments provided. Hash Table needs at least 33 memory segments.
I had this problem before, and it was fixed by increasing the value of TASK_MANAGER_NETWORK_NUM_BUFFERS_KEY. It is currently at 16000 after I increased it a couple of times, but the error keeps popping up after the 9th iteration.
When I don't use the union, the error doesn't occur.

The full stack trace can be found here: http://paste.thezomg.com/19921/79296084/

I tried different approaches, such as using a join or applying a reduce right after the union, but the result was the same.
Are there other settings I need to adjust? And why exactly is this happening?
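
For readers following along: the per-vertex loop described in this post can be sketched in plain Java, outside Flink. This is not the code from the paste link above. It assumes an unweighted graph stored as adjacency lists, uses a breadth-first search in place of the custom shortest-path function, and the class and method names are hypothetical. The list `collected` only plays the role of "collectionDataSet".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class AllSourcesShortestPaths {

    /** Returns {source, target, hops} triples for every reachable pair of distinct vertices. */
    static List<int[]> allDistances(List<List<Integer>> adj) {
        List<int[]> collected = new ArrayList<>(); // stands in for "collectionDataSet"
        for (int s = 0; s < adj.size(); s++) {     // one shortest-path run per source vertex
            int[] dist = new int[adj.size()];
            Arrays.fill(dist, -1);                 // -1 marks "not reached yet"
            dist[s] = 0;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {             // breadth-first search from s
                int v = queue.poll();
                for (int w : adj.get(v)) {
                    if (dist[w] < 0) {
                        dist[w] = dist[v] + 1;
                        queue.add(w);
                    }
                }
            }
            for (int t = 0; t < dist.length; t++) { // the "union" step: append this run's results
                if (t != s && dist[t] >= 0) {
                    collected.add(new int[] {s, t, dist[t]});
                }
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2
        List<List<Integer>> adj = List.of(List.of(1), List.of(0, 2), List.of(1));
        System.out.println(allDistances(adj).size()); // prints 6
    }
}
```

The sketch makes the cost visible: one full traversal per vertex, with the combined result growing on every pass.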

Re: 33 segments problem with configuration set

otherwise777
Some additional information I just noticed: it crashes on this line of code:
collectionDataSet.print();

I tried placing it inside the loop, and now it crashes at the 7th iteration.
Re: 33 segments problem with configuration set

Vasiliki Kalavri
Dear Wouter,

First of all, as I already noted in another thread, betweenness centrality is an extremely demanding algorithm, and a distributed data engine such as Flink is probably not the best system to implement it in. On top of that, the message-passing model for graph computations would generate an enormous number of messages, which translates to high memory requirements.

Now, looking at your code, there is a nested loop, which Flink cannot handle. That is, you have a for-loop, and inside it you're running a Flink iteration. Neither for-loops nor nested loops are currently supported in Flink, so I believe you will have to rethink the logic of your algorithm.

Cheers,
-Vasia.
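
As a single-machine point of comparison for the advice in this reply, here is a minimal sketch of betweenness centrality in plain Java. It uses Brandes' algorithm, which is not mentioned in the thread and is swapped in here purely for illustration. It assumes an unweighted, undirected graph given as adjacency lists, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class BrandesSketch {

    /** Betweenness centrality for an unweighted, undirected graph (adjacency lists). */
    static double[] betweenness(List<List<Integer>> adj) {
        int n = adj.size();
        double[] centrality = new double[n];
        for (int s = 0; s < n; s++) {
            int[] dist = new int[n];          // hop distance from s, -1 if unreached
            double[] sigma = new double[n];   // number of shortest paths from s
            double[] delta = new double[n];   // dependency of s on each vertex
            List<List<Integer>> preds = new ArrayList<>();
            for (int v = 0; v < n; v++) {
                dist[v] = -1;
                preds.add(new ArrayList<>());
            }
            dist[s] = 0;
            sigma[s] = 1.0;
            Deque<Integer> queue = new ArrayDeque<>();
            Deque<Integer> stack = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {        // BFS that also counts shortest paths
                int v = queue.poll();
                stack.push(v);
                for (int w : adj.get(v)) {
                    if (dist[w] < 0) {
                        dist[w] = dist[v] + 1;
                        queue.add(w);
                    }
                    if (dist[w] == dist[v] + 1) { // v lies on a shortest path to w
                        sigma[w] += sigma[v];
                        preds.get(w).add(v);
                    }
                }
            }
            while (!stack.isEmpty()) {        // accumulate dependencies, farthest vertices first
                int w = stack.pop();
                for (int v : preds.get(w)) {
                    delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
                }
                if (w != s) {
                    centrality[w] += delta[w];
                }
            }
        }
        for (int v = 0; v < n; v++) {
            centrality[v] /= 2.0;             // undirected: each pair was counted twice
        }
        return centrality;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2 - 3
        List<List<Integer>> adj = List.of(List.of(1), List.of(0, 2), List.of(1, 3), List.of(2));
        System.out.println(Arrays.toString(betweenness(adj))); // prints [0.0, 2.0, 2.0, 0.0]
    }
}
```

Even this compact version runs one traversal per source vertex, which gives a sense of why the computation is considered demanding on larger graphs.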

On 16 November 2016 at 15:10, otherwise777 <[hidden email]> wrote:
Some additional information i just realized, it crashes on this line of code:
collectionDataSet.print();

I tried placing it inside of the loop, it crashes at the 7th iteration now

Re: 33 segments problem with configuration set

otherwise777
Hello Vasia,

Thank you for your fast reply.

I am aware that determining the betweenness is very demanding. However, I still want to give it a try in Flink, at least to a certain extent. Not using Flink is currently not an option, since my project is partly about Flink.

I will rethink my logic. I guess it's back to the drawing board.