RuntimeException Gelly API: Memory ran out. Compaction failed.


RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
Hi,

I'm getting the following RuntimeException for an adaptation of the
SingleSourceShortestPaths example using the Gelly API (see attachment).
It has been adapted for unweighted graphs whose vertices carry Long values.

As an input graph I'm using the social network graph (~200MB unpacked)
from here: https://snap.stanford.edu/data/higgs-twitter.html

For the small SSSPDataUnweighted graph (also attached) it terminates and
computes the distances correctly.
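
Since the attachment is not inlined here, here is a rough sketch of what the unweighted adaptation boils down to. Only the class names (VertexDistanceUpdater, MinDistanceMessenger) are taken from the stack trace below; the signatures follow the Gelly SSSP example of that time and may differ in other versions:

// Sketch only; nested inside the SingleSourceShortestPathsUnweighted job class, as in the trace.
// Assumed imports: org.apache.flink.graph.spargel.{MessagingFunction, VertexUpdateFunction, MessageIterator}
// and org.apache.flink.types.NullValue.

// In the unweighted case the messenger propagates distance + 1 instead of distance + edge weight.
public static final class MinDistanceMessenger
        extends MessagingFunction<Long, Long, Long, NullValue> {
    @Override
    public void sendMessages(Long vertexKey, Long vertexValue) {
        sendMessageToAllNeighbors(vertexValue + 1);
    }
}

public static final class VertexDistanceUpdater
        extends VertexUpdateFunction<Long, Long, Long> {
    @Override
    public void updateVertex(Long vertexKey, Long vertexValue, MessageIterator<Long> inMessages) {
        long minDistance = Long.MAX_VALUE;
        for (long msg : inMessages) {
            minDistance = Math.min(minDistance, msg);
        }
        if (minDistance < vertexValue) {
            setNewVertexValue(minDistance);
        }
    }
}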


03/16/2015 17:18:23    IterationHead(WorksetIteration (Vertex-centric iteration (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 | org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4) switched to FAILED
java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow segments: 176 bucketSize: 217 Overall memory: 20316160 Partition memory: 7208960 Message: Index: 8, Size: 7
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
    at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
    at java.lang.Thread.run(Thread.java:745)


Best,
Mihail

Attachment: SingleSourceShortestPathsExampleUnweighted.java (5K)
Attachment: SingleSourceShortestPaths.java (3K)
Attachment: SingleSourceShortestPathsDataUnweighted.java (2K)

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
And the correct SSSPUnweighted attached.


Attachment: SingleSourceShortestPathsUnweighted.java (4K)

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Robert Waury
Hi,

can you tell me how much memory your job has and how many workers you are running?

From the trace it seems the internal hash table allocated only 7 MB for the graph data and therefore runs out of memory pretty quickly.

Skewed data could also be an issue, but with a minimum of 5 pages and a maximum of 8 per partition, the data seems to be distributed fairly evenly across the partitions.
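
For reference, the numbers in the trace are consistent with that reading (assuming the default 32 KiB memory segment size, which is an assumption on my part):

    Partition memory: 7208960 bytes          ≈ 6.9 MB for the solution-set hash table
    7208960 bytes / 32768 bytes per segment  = 220 pages, over numPartitions: 32 ≈ 7 pages each
    which fits the reported minPartition: 5 and maxPartition: 8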

Cheers,
Robert




Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
Hi Robert,

thank you for your reply.

I'm starting the job from the Scala IDE. So only one JobManager and one TaskManager in the same JVM.
I've doubled the memory in the eclipse.ini settings but I still get the Exception.

-vmargs
-Xmx2048m
-Xms100m
-XX:MaxPermSize=512m

Best,
Mihail





Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Robert Waury
Hi,

I managed to reproduce the behavior and as far as I can tell it seems to be a problem with the memory allocation.

I have filed a bug report in JIRA to get the attention of somebody who knows the runtime better than I do.

https://issues.apache.org/jira/browse/FLINK-1734


Cheers,
Robert






Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
Hi,

great! Thanks!

I really need this bug fixed: I'm laying the groundwork for my Diplom thesis and must be sure that the Gelly API is reliable and can handle large datasets as intended.

Cheers,
Mihail







Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Stephan Ewen
This job probably suffers from overly conservative memory assignment, giving the solution set too little memory.

Can you try to make the solution set "unmanaged", excluding it from Flink's memory management? That may help with the problem.
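
At the DataSet level this corresponds to the delta iteration's "unmanaged solution set" flag. A minimal sketch of the underlying call, assuming "vertices" is the DataSet<Vertex<Long, Long>> used as the initial solution set and "maxIterations" is defined; how this flag is reached from the Gelly code depends on the version:

DeltaIteration<Vertex<Long, Long>, Vertex<Long, Long>> iteration =
        vertices.iterateDelta(vertices, maxIterations, 0);
// keep the solution set as a plain heap hash map instead of Flink's managed memory,
// so the CompactingHashTable (and its compaction) is bypassed
iteration.setSolutionSetUnManaged(true);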











Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
In reply to this post by Mihail Vieru
Hi Mihail, Robert,

I've tried reproducing this, but I couldn't.
I'm using the same twitter input graph from SNAP that you link to and also Scala IDE.
The job finishes without a problem (both the SSSP example from Gelly and the unweighted version).

The only thing I changed to run your version was creating the graph from the edge set only, i.e. like this:

Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
        new MapFunction<Long, Long>() {
            public Long map(Long value) {
                return Long.MAX_VALUE;
            }
        }, env);
 
Since the twitter input is an edge list, how do you generate the vertex dataset in your case?

Thanks,
-Vasia.








Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Robert Waury

Hi Vasia,

How much memory does your job use?

I think the problem is, as Stephan says, an overly conservative allocation, but that it will work if you throw enough memory at it.

Or did your setup succeed with an amount of memory comparable to Mihail's and mine?

My main point is that it shouldn't take 10x more memory than the input size for such a job.

Cheers,
Robert








Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
Hi Robert,

my setup has even less memory than your setup, ~900MB in total.

When using the local environment (running the job through your IDE), the available memory is split equally between the JobManager and the TaskManager. Then, the default amount of memory kept for network buffers is subtracted from the TaskManager's part.
Finally, the TaskManager is assigned 70% (by default) of what is left. 
In my case, this was 255MB.
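
In formula form (rough; the exact defaults may differ between versions):

    managed TaskManager memory ≈ (total JVM heap / 2 − network buffer memory) × 0.7

which, with ~900MB in total, lands in the region of the 255MB above.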

So, I'm guessing that either the options you're passing to Eclipse are not properly read (I haven't tried it myself) or that there's something wrong in the way you're generating the graph. That's why I asked how you produce the vertex dataset.
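
(For what it's worth: eclipse.ini only sizes the JVM of the IDE itself; a job started from a Run Configuration runs in a separate JVM, so an -Xmx for the local Flink environment would have to go into that Run Configuration's VM arguments.)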

Cheers,
V.











Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
In reply to this post by Vasiliki Kalavri
Hi Vasia,

I have used a simple job (attached) to generate a file which looks like this:

0 0
1 1
2 2
...
456629 456629
456630 456630

I need the vertices to be generated from a file for my future work.
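
The attached GenerateVerticesOneToN.java is not inlined here; a minimal sketch of such a generator (class name, output-path handling and the upper bound are placeholders, not the actual attachment) could look like:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class GenerateVerticesSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // emit one "<id> <id>" line per vertex id 0..456630
        env.generateSequence(0, 456630)
           .map(new MapFunction<Long, Tuple2<Long, Long>>() {
               @Override
               public Tuple2<Long, Long> map(Long id) {
                   return new Tuple2<Long, Long>(id, id);
               }
           })
           .writeAsCsv(args[0], "\n", " ");  // row delimiter: newline, field delimiter: space
        env.execute("generate vertex file");
    }
}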

Cheers,
Mihail










Attachment: GenerateVerticesOneToN.java (2K)

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
Hi Mihail,

I used your code to generate the vertex file, then gave this and the edge list as input to your SSSP implementation and still couldn't reproduce the exception. I'm using the same local setup as I describe above.
I'm not aware of any recent changes that might be relevant, but, just in case, are you using the latest master?

Cheers,
V.










Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
Hi Vasia,

yes, I am using the latest master. I just did a pull again and the problem persists. Perhaps Robert could confirm as well.

I've set the solution set to unmanaged in SSSPUnweighted as Stephan proposed and the job finishes. So I am able to proceed using this workaround.

An odd thing occurs now though. The distances aren't computed correctly for the SNAP graph and remain at the value set in InitVerticesMapper(). For the small graph in SSSPDataUnweighted they are OK. I'm currently investigating this behavior.

Cheers,
Mihail











Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
hmm, I'm starting to run out of ideas...
What's your source ID parameter? I ran mine with 0. 
About the result, you call both createVertexCentricIteration() and runVertexCentricIteration() on the initialized graph, right?












Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
I'm also using 0 as sourceID. The exact program arguments:

0 /home/vieru/dev/flink-experiments/data/social_network.edgelist /home/vieru/dev/flink-experiments/data/social_network.verticeslist /home/vieru/dev/flink-experiments/sssp-output-higgstwitter 10

And yes, I call both methods on the initialized Graph mappedInput. I don't understand why the distances are computed correctly for the small graph (also read from files) but not for the larger one.
The messages appear to be wrong in the latter case.

On 18.03.2015 21:55, Vasiliki Kalavri wrote:
hmm, I'm starting to run out of ideas...
What's your source ID parameter? I ran mine with 0. 
About the result, you call both createVertexCentricIteration() and runVertexCentricIteration() on the initialized graph, right?

On 18 March 2015 at 22:33, Mihail Vieru <[hidden email]> wrote:
Hi Vasia,

yes, I am using the latest master. I just did a pull again and the problem persists. Perhaps Robert could confirm as well.

I've set the solution set to unmanaged in SSSPUnweighted as Stephan proposed and the job finishes. So I am able to proceed using this workaround.

An odd thing occurs now though. The distances aren't computed correctly for the SNAP graph and remain the one set in InitVerticesMapper(). For the small graph in SSSPDataUnweighted they are OK. I'm currently investigating this behavior.

Cheers,
Mihail


On 18.03.2015 20:55, Vasiliki Kalavri wrote:
Hi Mihail,

I used your code to generate the vertex file, then gave this and the edge list as input to your SSSP implementation and still couldn't reproduce the exception. I'm using the same local setup as I describe above.
I'm not aware of any recent changes that might be relevant, but, just in case, are you using the latest master?

Cheers,
V.

On 18 March 2015 at 19:21, Mihail Vieru <[hidden email]> wrote:
Hi Vasia,

I have used a simple job (attached) to generate a file which looks like this:

0 0
1 1
2 2
...
456629 456629
456630 456630

I need the vertices to be generated from a file for my future work.

Cheers,
Mihail



On 18.03.2015 17:04, Vasiliki Kalavri wrote:
Hi Mihail, Robert,

I've tried reproducing this, but I couldn't.
I'm using the same twitter input graph from SNAP that you link to and also Scala IDE.
The job finishes without a problem (both the SSSP example from Gelly and the unweighted version).

The only thing I changed to run your version was creating the graph from the edge set only, i.e. like this:

Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
new MapFunction<Long, Long>() {
public Long map(Long value) {
return Long.MAX_VALUE;
}
}, env);
 
Since the twitter input is an edge list, how do you generate the vertex dataset in your case?

Thanks,
-Vasia.

On 18 March 2015 at 16:54, Mihail Vieru <[hidden email]> wrote:
Hi,

great! Thanks!

I really need this bug fixed because I'm laying the groundwork for my Diplom thesis and I need to be sure that the Gelly API is reliable and can handle large datasets as intended.

Cheers,
Mihail


On 18.03.2015 15:40, Robert Waury wrote:
Hi,

I managed to reproduce the behavior and as far as I can tell it seems to be a problem with the memory allocation.

I have filed a bug report in JIRA to get the attention of somebody who knows the runtime better than I do.

https://issues.apache.org/jira/browse/FLINK-1734


Cheers,
Robert

On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru <[hidden email]> wrote:
Hi Robert,

thank you for your reply.

I'm starting the job from the Scala IDE. So only one JobManager and one TaskManager in the same JVM.
I've doubled the memory in the eclipse.ini settings but I still get the Exception.

-vmargs
-Xmx2048m
-Xms100m
-XX:MaxPermSize=512m

Best,
Mihail
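
(One detail worth double-checking here, as a general Eclipse point rather than anything established in this thread: the -Xmx in eclipse.ini only sizes the IDE's own JVM. A job started from a Run Configuration takes its heap from that configuration's "VM arguments" field, e.g.

    -Xmx2048m

under Run Configurations -> Arguments.)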


On 17.03.2015 10:11, Robert Waury wrote:
Hi,

can you tell me how much memory your job has and how many workers you are running?

From the trace it seems the internal hash table allocated only 7 MB for the graph data and therefore runs out of memory pretty quickly.

Skewed data could also be an issue, but with a minimum of 5 pages and a maximum of 8 the data seems to be distributed fairly evenly across the different partitions.

Cheers,
Robert
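
To put the numbers from that exception message into perspective (a rough check, assuming Flink's default 32 KB memory segments):

    7,208,960 B partition memory / 32,768 B per segment = 220 segments
    220 segments / 32 partitions ~= 6.9 segments each (consistent with minPartition 5 / maxPartition 8)
    20,316,160 B overall ~= 19.4 MB of managed memory for the whole hash table

So the solution-set hash table really is working with only about 19 MB here, which is why it has so little room before compaction fails.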

On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru <[hidden email]> wrote:
And the correct SSSPUnweighted attached.


On 17.03.2015 01:23, Mihail Vieru wrote:
Hi,

I'm getting the following RuntimeException for an adaptation of the SingleSourceShortestPaths example using the Gelly API (see attachment). It's been adapted for unweighted graphs having vertices with Long values.

As an input graph I'm using the social network graph (~200MB unpacked) from here: https://snap.stanford.edu/data/higgs-twitter.html

For the small SSSPDataUnweighted graph (also attached) it terminates and computes the distances correctly.


03/16/2015 17:18:23    IterationHead(WorksetIteration (Vertex-centric iteration (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 | org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4) switched to FAILED
java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow segments: 176 bucketSize: 217 Overall memory: 20316160 Partition memory: 7208960 Message: Index: 8, Size: 7
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
    at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
    at java.lang.Thread.run(Thread.java:745)


Best,
Mihail

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
Well, one thing I notice is that your vertices and edges args are flipped. Might be the source of error :-)


Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Vasiliki Kalavri
haha, yes, actually I just confirmed!
If I flip my args, I get the error you mention in the first e-mail: you're building the graph with the edge list as the vertex list, and that is far too big a dataset for your memory settings (~15M edges vs. the actual ~400K vertices).

I hope that clears everything up :-)

Cheers,
V.
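
For completeness, a minimal sketch of wiring the two files up in the intended order (the paths, the space-separated format and the anonymous mappers are assumptions, not the attached code):

    // Vertices file has "id value" lines, edge list has "src dst" lines.
    // Assumes the usual imports: org.apache.flink.api.java.{ExecutionEnvironment, DataSet},
    // org.apache.flink.api.common.functions.MapFunction,
    // org.apache.flink.graph.{Graph, Vertex, Edge}, org.apache.flink.types.NullValue.
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    DataSet<Vertex<Long, Long>> vertices = env.readTextFile("/path/to/social_network.verticeslist")
        .map(new MapFunction<String, Vertex<Long, Long>>() {
            @Override
            public Vertex<Long, Long> map(String line) {
                String[] t = line.split("\\s+");
                return new Vertex<Long, Long>(Long.parseLong(t[0]), Long.parseLong(t[1]));
            }
        });

    DataSet<Edge<Long, NullValue>> edges = env.readTextFile("/path/to/social_network.edgelist")
        .map(new MapFunction<String, Edge<Long, NullValue>>() {
            @Override
            public Edge<Long, NullValue> map(String line) {
                String[] t = line.split("\\s+");
                return new Edge<Long, NullValue>(Long.parseLong(t[0]), Long.parseLong(t[1]), NullValue.getInstance());
            }
        });

    // vertices first, then edges -- swapping the two inputs feeds ~15M edge records
    // into the vertex side, which is what made the solution-set hash table run out of memory
    Graph<Long, Long, NullValue> graph = Graph.fromDataSet(vertices, edges, env);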


Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

Mihail Vieru
noooo waaaaay... that was it!? :))) Big thanks! :)
The result is also correct now.

Cheers,
M.
