HA Standalone Cluster configuration

Edward

Questions about standalone cluster configuration:

  1. Is it considered bad practice to have standby JobManagers co-located on the same machines as TaskManagers?
  2. Is it considered bad practice to have ZooKeeper installed on the same machines as the JobManager leader and standby machines? (The docs say "In production setups, it is recommended to manage your own ZooKeeper installation.", but I'm assuming it's still okay to co-locate ZK with the JobManager?)
  3. In another thread, I read that the rule of thumb is taskmanager.numberOfTaskSlots = number of cores. Doesn't this ignore cases where threads spend a high proportion of their time idle (e.g. waiting on an I/O call)? If the total number of task slots limits my degree of parallelism, but most parallel copies of a subtask are idle at any given time, it seems that I would want the number of task slots to be some multiple of the number of cores.
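
To make the arithmetic behind question 3 concrete, here is a rough sketch in Python. The helper name estimate_task_slots and the wait_fraction parameter are made up for illustration. The formula is a generic thread-sizing heuristic (cores divided by the fraction of time a thread is actually busy), not something taken from the Flink docs, and it assumes that slot threads blocked on I/O leave their core free for other slots.

```python
import math


def estimate_task_slots(cores: int, wait_fraction: float) -> int:
    """Hypothetical helper: estimate a slot count for a given I/O wait share.

    wait_fraction is the assumed share of time a slot's thread spends
    blocked (for example on an I/O call), in the range [0, 1).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= wait_fraction < 1.0:
        raise ValueError("wait_fraction must be in [0, 1)")
    # If a thread is busy only (1 - wait_fraction) of the time, roughly
    # cores / (1 - wait_fraction) threads are needed to keep every core busy.
    # Rounding first avoids floating-point noise pushing ceil() one too high.
    raw = round(cores / (1.0 - wait_fraction), 9)
    # Never go below the plain "slots = cores" rule of thumb.
    return max(cores, math.ceil(raw))


# Example: an 8-core TaskManager with different assumed wait shares.
for wait in (0.0, 0.5, 0.75):
    slots = estimate_task_slots(8, wait)
    print(f"wait fraction {wait}: taskmanager.numberOfTaskSlots = {slots}")
```

With no waiting this reduces to slots = cores; at a 50% wait share it doubles the slot count. Treat the result as a starting point for measurement only, since more slots also means each slot gets a smaller share of the TaskManager's memory.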

Thanks,
Edward

Re: HA Standalone Cluster configuration

Stefan Richter
Hi,

I think:

1. This should not be a problem if the machine has enough capacity to run both.
2. This is not truly harmful if you have more than one ZooKeeper node, but if the JM's machine goes down, it takes one ZK node down with it. That is no problem if the remaining ZK nodes can still take over to recover your JobManager. Running the JM on a different machine than the ZK nodes, however, can leave you with one more ZK node when the JM machine goes down, which is exactly when having ZK available for recovery matters.
3. Yes, that is why it is only called a "rule of thumb". You can always tune the number of slots for the specifics of your job, one of which is whether it is I/O-heavy or compute-heavy.
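
Point 2 can be illustrated with a little quorum arithmetic. The sketch below is only a toy model in Python: the function names are invented for this example, and it relies on a single fact, namely that a ZooKeeper ensemble stays available only while a strict majority of its nodes is up. It compares how many further ZK node failures can be tolerated after the JM machine dies, depending on whether that machine also hosted a ZK node.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of a ZooKeeper ensemble."""
    return ensemble_size // 2 + 1


def remaining_failure_budget(ensemble_size: int, jm_colocated_with_zk: bool) -> int:
    """Hypothetical helper: ZK failures still tolerable after the JM machine dies.

    A negative result means the quorum is already lost, so ZK cannot be
    used to recover the JobManager.
    """
    # A co-located JM machine takes one ZK node down with it.
    alive = ensemble_size - (1 if jm_colocated_with_zk else 0)
    return alive - quorum_size(ensemble_size)


for size in (1, 3, 5):
    together = remaining_failure_budget(size, jm_colocated_with_zk=True)
    apart = remaining_failure_budget(size, jm_colocated_with_zk=False)
    print(f"{size} ZK node(s): co-located budget = {together}, separate budget = {apart}")
```

For a three-node ensemble, co-location leaves a budget of 0 after the JM machine fails (recovery still works, but no further ZK node may fail), while a separate JM machine leaves a budget of 1. With a single ZK node on the JM machine, the budget is negative and recovery through ZK is not possible.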

Best,
Stefan

In reply to the message from Edward Buck <[hidden email]>, 21.06.2017, 18:39.