(DEPRECATED) Apache Flink User Mailing List archive.

Re: One TaskManager per node or multiple TaskManager per node

Posted by Ethan Li on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/One-TaskManager-per-node-or-multiple-TaskManager-per-node-tp25509p25558.html

It makes sense. Thank you very much, Jamie!

On Jan 15, 2019, at 12:48 PM, Jamie Grier <[hidden email]> wrote:

Ethan, it depends on what you mean by easy ;) It just depends a lot on what infra tools you already have in place. On bare metal it's probably safe to say there is no "easy" way. You need a lot of automation to make it easy.

Bastien, IMO, #1 applies to batch jobs as well.

On Tue, Jan 15, 2019 at 6:27 AM bastien dine <[hidden email]> wrote:
Hello Jamie,

Does #1 apply to batch jobs too ?

Regards,

------------------

Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io

Le lun. 14 janv. 2019 à 20:39, Jamie Grier <[hidden email]> a écrit :
There are a lot of different ways to deploy Flink. It would be easier to answer your question with a little more context about your use case but in general I would advocate the following:

1) Don't run a "permanent" Flink cluster and then submit jobs to it. Instead what you should do is run an "ephemeral" cluster per job if possible. This keeps jobs completely isolated from each other which helps a lot with understanding performance, debugging, looking at logs, etc.
2) Given that you can do #1 and you are running on bare metal (as opposed to in containers) then run one TM per physical machine.

There are many ways to accomplish the above depending on your deployment infrastructure (YARN, K8S, bare metal, VMs, etc) so it's hard to give detailed input but general you'll have the best luck if you don't run multiple jobs in the same TM/JVM.

In terms of the TM memory usage you can set that up by configuring it in the flink-conf.yaml file. The config key you are looking or is taskmanager.heap.size: https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#taskmanager-heap-size

On Mon, Jan 14, 2019 at 8:05 AM Ethan Li <[hidden email]> wrote:
Hello,

I am setting up a standalone flink cluster and I am wondering what’s the best way to distribute TaskManagers. Do we usually launch one TaskManager (with many slots) per node or multiple TaskManagers per node (with smaller number of slots per tm) ? Also with one TaskManager per node, I am seeing that TM launches with only 30GB JVM heap by default while the node has 180 GB. Why is it not launching with more memory since there is a lot available?

Thank you very much!

- Ethan