Gelly ran out of memory

Gelly ran out of memory

Flavio Pompermaier
Hi to all,

I tried to run my Gelly job on Flink 0.9-SNAPSHOT and got an EOFException, so I switched to 0.10-SNAPSHOT, and now I get the following error:

Caused by: java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 73 maxPartition: 80 number of overflow segments: 0 bucketSize: 570 Overall memory: 102367232 Partition memory: 81100800 Message: null
at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertRecordIntoPartition(CompactingHashTable.java:465)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:414)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:325)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:211)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:272)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)

I'm probably doing something wrong, but I can't figure out how to estimate the required memory for my Gelly job.
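For a rough back-of-envelope check, assuming Flink's usual behavior of taking a fixed fraction of the TaskManager heap as managed memory and splitting it evenly across task slots, the managed memory available per slot can be estimated like this (the heap size, fraction, and slot count below are illustrative assumptions, not values from this job):

```java
// Back-of-envelope estimate of Flink managed memory per task slot.
// The heap size, fraction, and slot count in main() are assumptions
// for illustration only, not values taken from this thread.
public class ManagedMemoryEstimate {

    static long managedBytesPerSlot(long taskManagerHeapBytes,
                                    double memoryFraction,
                                    int numSlots) {
        // Managed memory = heap * fraction, divided evenly across slots.
        return (long) (taskManagerHeapBytes * memoryFraction) / numSlots;
    }

    public static void main(String[] args) {
        long heap = 512L * 1024 * 1024;  // 512 MB TaskManager heap (assumed)
        double fraction = 0.7;           // assumed taskmanager.memory.fraction
        int slots = 4;                   // assumed number of task slots
        System.out.println(managedBytesPerSlot(heap, fraction, slots));
    }
}
```

If the estimate comes out below what the job's solution set needs, either the TaskManager heap or the memory fraction has to grow.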

Best,
Flavio

Re: Gelly ran out of memory

Andra Lungu
Hi Flavio,

These kinds of exceptions generally arise when you run out of `user` memory. You can try increasing it a bit.
In your flink-conf.yaml, try adding:

# The fraction of memory allocated to Flink's managed (system) memory
taskmanager.memory.fraction: 0.4

This gives 0.6 of the available memory to the user and 0.4 to the system.

Tell me if that helped.
Andra



Re: Gelly ran out of memory

Stephan Ewen
Actually, you ran out of "Flink managed memory", not user memory. A user-memory shortage manifests as a Java OutOfMemoryError.

At this point, delta iterations cannot spill. They additionally suffer a bit from memory fragmentation.
A possible workaround is to use the option "setSolutionSetUnmanaged(true)" on the iteration. That will at least eliminate the fragmentation issue.
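As a hedged sketch of where that option is set (class and method names as in the 0.10-era DataSet and Gelly APIs; verify against your version, since they may differ):

```java
// Sketch: disabling managed memory for the delta iteration's solution set.
// Types, variable names, and the Gelly configuration method below are
// assumptions based on the 0.10-era APIs, for illustration only.

// Option 1: on a raw DataSet API delta iteration
DeltaIteration<Tuple2<Long, Double>, Tuple2<Long, Double>> iteration =
        initialSolutionSet.iterateDelta(initialWorkset, maxIterations, 0);
iteration.setSolutionSetUnmanaged(true);

// Option 2: through Gelly's vertex-centric iteration configuration
VertexCentricConfiguration parameters = new VertexCentricConfiguration();
parameters.setSolutionSetUnmanagedMemory(true);
Graph<Long, Double, Double> result = graph.runVertexCentricIteration(
        updateFunction, messagingFunction, maxIterations, parameters);
```

With the solution set unmanaged, it lives on the regular Java heap instead of in Flink's preallocated memory segments, so the TaskManager heap must be sized accordingly.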

Stephan





Re: Gelly ran out of memory

Flavio Pompermaier
Using Stephan's advice got things going! Then I hit another exception about a non-existing vertex, but that's another story :)

Thanks to all for the support!




Re: Gelly ran out of memory

Henry Saputra
Hi Stephan, this looks like a bug to me. Shouldn't the memory manager switch to out-of-managed memory when it runs out of managed space?

- Henry


Re: Gelly ran out of memory

Stephan Ewen
One of the ideas behind memory management is to recognize when memory runs out. Simply using the regular Java heap would mean that an operator consumes more than it should and eventually causes an OOM error again.

The only way to react when memory runs out is to reduce memory usage, for example by spilling or compacting. The solution set cannot spill at this point, so if compaction does not help, it gives up.



Re: Gelly ran out of memory

Henry Saputra
Ah yes, I agree about the purpose of memory management. What I was wondering is whether an operator could explicitly spill to disk when it fails to get the required memory from a memory segment?




Re: Gelly ran out of memory

Stephan Ewen
The operator needs to implement spilling, which the solution set (alone among operators) currently doesn't.
