Hi everyone,
I am currently looking into how Flink can coexist and interoperate with other frameworks in a cluster, such as plain single-machine processes or Spark. Tachyon seems to be a nice solution for exchanging data between them.
However, I think it is a problem that Flink's taskmanagers allocate their managed memory upfront - in contrast to Spark, as far as I know. If I want a taskmanager to yield its main memory, so that another process can use that memory, is there any other option besides shutting that taskmanager down? Would it be beneficial to use YARN? Thanks for your help!
Cheers, Sebastian
|
Hi Sebastian,
There is no way to return memory from a Flink process except shutting the process down. I think YARN could help in your setup. In a YARN setup, you can flexibly start and stop Flink sessions with different configurations (memory, TMs, slots) or run a single job. When running a single job, Flink will allocate resources and free them after the job is done.
Best, Fabian
2015-12-09 9:46 GMT+01:00 Kruse, Sebastian <[hidden email]>:
|
@Sebastian: Getting memory back from the JVM is always tricky, completely independent of whether managed memory is pre-allocated or lazily allocated. But here is something that may work:
- Start Flink in streaming mode. That will make it allocate managed memory lazily.
- Set the managed memory to off-heap memory. That way the JVM heap is small.
The off-heap memory is deallocated and returned when it is no longer used. This releases memory much better than the JVM shrinking the heap.
On Wed, Dec 9, 2015 at 10:06 AM, Fabian Hueske <[hidden email]> wrote:
|
Streaming mode with on-heap memory won't help because the JVM allocates all memory but doesn't convert it to managed memory internally, right? Is off-heap memory actually freed after it has been allocated as managed memory? Does this happen after a job finishes?
2015-12-09 10:44 GMT+01:00 Stephan Ewen <[hidden email]>:
|
Off-heap memory is freed when the memory-consuming operators release it. The Java process then releases that memory on the next GC, as far as I know.
On Wed, Dec 9, 2015 at 11:01 AM, Fabian Hueske <[hidden email]> wrote:
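The behaviour described in this reply can be observed with a small plain-Java sketch. It does not use any Flink API: `OffHeapReleaseSketch` and `directMemoryUsed` are hypothetical names chosen for illustration, and a direct `ByteBuffer` merely stands in for an off-heap memory segment. `System.gc()` is only a hint, so the second reading may or may not drop, depending on the JVM and collector.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class OffHeapReleaseSketch {

    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if that pool is not reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = directMemoryUsed();

        // Stand-in for an off-heap memory segment (assumption: 64 MB, arbitrary size).
        ByteBuffer segment = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        segment.put(0, (byte) 1);
        System.out.println("direct bytes while held: " + (directMemoryUsed() - before));

        // Drop the only reference; the native memory is freed once the buffer object is collected.
        segment = null;
        System.gc();          // a hint only, the JVM is free to ignore it
        Thread.sleep(200);    // give the cleanup machinery a moment to run
        System.out.println("direct bytes after GC hint: " + (directMemoryUsed() - before));
    }
}
```

The point of the sketch is that the native memory is tied to the lifetime of the buffer object, so it can only go back to the operating system after the holder has dropped its reference and a collection has run.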
|
Thanks for your answers. So the problem with on-heap memory would be that the JVM does not shrink its already allocated heap, even if it is largely unused? Regarding streaming mode: if I run Flink in that mode, can I still submit batch jobs? Because that's what I want to do.
Thanks, Sebastian
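As a rough way to check the on-heap question on a given JVM, the following plain-Java sketch compares the heap the process has committed with the heap actually in use, before and after releasing a batch of arrays. `HeapFootprintSketch` and its methods are hypothetical names, and the chunk sizes are arbitrary. Whether the committed size drops after the GC hint depends on the collector and on flags such as `-XX:MinHeapFreeRatio` and `-XX:MaxHeapFreeRatio`, so no particular output should be expected.

```java
public class HeapFootprintSketch {

    /** Heap currently committed by the JVM (reserved from the OS), in bytes. */
    static long committedHeap() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Heap currently occupied by objects (live or not yet collected), in bytes. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static void report(String label) {
        System.out.println(label + ": committed=" + committedHeap() + " used=" + usedHeap());
    }

    public static void main(String[] args) throws InterruptedException {
        report("start");

        // Fill part of the heap: 8 chunks of 4 MB (arbitrary sizes for illustration).
        byte[][] chunks = new byte[8][];
        for (int i = 0; i < chunks.length; i++) {
            chunks[i] = new byte[4 * 1024 * 1024];
        }
        report("after allocation");

        // Release everything and ask for a collection; "used" should fall,
        // while "committed" may well stay where it is.
        chunks = null;
        System.gc();
        Thread.sleep(200);
        report("after GC hint");
    }
}
```

If the committed figure stays high after the last report, that is the effect discussed in this thread: the memory is free inside the JVM but not available to other processes.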
|
Yes, streaming mode supports batch jobs as well. The difference is that in streaming mode, managed memory is lazily allocated. This is because the streaming runtime does not use managed memory but only heap memory.
2015-12-09 11:55 GMT+01:00 Kruse, Sebastian <[hidden email]>:
|
BTW, for 1.0, this is consolidated into one single mode...
On Wed, Dec 9, 2015 at 1:45 PM, Fabian Hueske <[hidden email]> wrote:
|