Re: One TaskManager per node or multiple TaskManager per node
Posted by Ethan Li
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/One-TaskManager-per-node-or-multiple-TaskManager-per-node-tp25509p25558.html
It makes sense. Thank you very much, Jamie!
Ethan, it depends on what you mean by easy ;) It just depends a lot on what infra tools you already have in place. On bare metal it's probably safe to say there is no "easy" way. You need a lot of automation to make it easy.
Bastien, IMO, #1 applies to batch jobs as well.
Hello Jamie,
Does #1 apply to batch jobs too?
Regards,
------------------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
There are a lot of different ways to deploy Flink. It would be easier to answer your question with a little more context about your use case, but in general I would advocate the following:
1) Don't run a "permanent" Flink cluster and then submit jobs to it. Instead what you should do is run an "ephemeral" cluster per job if possible. This keeps jobs completely isolated from each other which helps a lot with understanding performance, debugging, looking at logs, etc.
2) Given that you can do #1 and you are running on bare metal (as opposed to in containers), run one TM per physical machine.
There are many ways to accomplish the above depending on your deployment infrastructure (YARN, K8S, bare metal, VMs, etc.), so it's hard to give detailed input, but in general you'll have the best luck if you don't run multiple jobs in the same TM/JVM.
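As a rough illustration of the "one TM per physical machine" suggestion, the sketch below renders the two flink-conf.yaml entries that would typically be set for it. None of the specifics come from this thread: the helper name one_tm_per_node_conf is hypothetical, setting slots equal to CPU cores is only a common rule of thumb, and the heap key taskmanager.heap.size is the one used by older Flink releases (roughly 1.6 to 1.9), while 1.10 and later configure memory through the taskmanager.memory.* options instead. The 16 cores and 64 GB are placeholder values.

```python
def one_tm_per_node_conf(cpu_cores: int, heap_gb: int) -> str:
    """Render flink-conf.yaml lines for a single TaskManager on one machine.

    Assumptions (not from the thread): one slot per CPU core, and the
    pre-1.10 heap key `taskmanager.heap.size`, written in megabytes.
    """
    if cpu_cores < 1 or heap_gb < 1:
        raise ValueError("cpu_cores and heap_gb must be positive")
    lines = [
        # One TaskManager per machine, so it offers all slots for that machine.
        f"taskmanager.numberOfTaskSlots: {cpu_cores}",
        # Heap is set explicitly rather than relying on the default.
        f"taskmanager.heap.size: {heap_gb * 1024}m",
    ]
    return "\n".join(lines)


# Placeholder machine size, for illustration only.
print(one_tm_per_node_conf(cpu_cores=16, heap_gb=64))
```

How much of a machine's memory to hand to the heap depends on the state backend, network buffers, and other off-heap usage, so the heap value here should be read as a parameter to tune rather than a recommendation.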
Hello,
I am setting up a standalone Flink cluster and I am wondering what's the best way to distribute TaskManagers. Do we usually launch one TaskManager (with many slots) per node, or multiple TaskManagers per node (with a smaller number of slots per TM)? Also, with one TaskManager per node, I am seeing that the TM launches with only 30 GB of JVM heap by default while the node has 180 GB. Why is it not launching with more memory, since there is a lot available?
Thank you very much!
- Ethan