Rules of Thumb for Setting Parallelism

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Rules of Thumb for Setting Parallelism

Rex Fenley
Hello,

I'm running a Job on AWS EMR with the TableAPI that does a long series of Joins, GroupBys, and Aggregates and I'd like to know how to best tune parallelism.

In my case, I have 8 EMR core nodes setup each with 4vCores and 8Gib of memory. There's a job we have to run that has ~30 table operators. Given this, how should I calculate what to set the systems parallelism to?

I also plan on running a second job on the same system, but just with 6 operators. Will this change the calculation for parallelism at all?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US

Reply | Threaded
Open this post in threaded view
|

Re: Rules of Thumb for Setting Parallelism

Till Rohrmann
Hi Rex,

as a rule of thumb I recommend configuring your TMs with as many slots as they have cores. So in your case your cluster would have 32 slots. Then depending on the workload of your jobs you should distribute them across both jobs (so that the total adds up to 32). A high number of operators does not necessarily mean that it needs more slots since operators can share the same slot. It mostly depends on the workload of your job. If the job should be too slow, then you would have to increase the cluster resources.

Cheers,
Till

On Fri, Nov 6, 2020 at 12:21 AM Rex Fenley <[hidden email]> wrote:
Hello,

I'm running a Job on AWS EMR with the TableAPI that does a long series of Joins, GroupBys, and Aggregates and I'd like to know how to best tune parallelism.

In my case, I have 8 EMR core nodes setup each with 4vCores and 8Gib of memory. There's a job we have to run that has ~30 table operators. Given this, how should I calculate what to set the systems parallelism to?

I also plan on running a second job on the same system, but just with 6 operators. Will this change the calculation for parallelism at all?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US

Reply | Threaded
Open this post in threaded view
|

Re: Rules of Thumb for Setting Parallelism

Rex Fenley
Great, thanks!

So just to confirm, configure # of task slots to # of core nodes x # of vCPUs?

I'm not sure what you mean by "distribute them across both jobs (so that the total adds up to 32)". Is it configurable how many task slots a job can receive, so in this case I'd provide ~30/36 * 32 task slots for one job and ~6/36 * 32 for another job, but even them out to sum to 32 task slots?

Thanks

On Fri, Nov 6, 2020 at 10:01 AM Till Rohrmann <[hidden email]> wrote:
Hi Rex,

as a rule of thumb I recommend configuring your TMs with as many slots as they have cores. So in your case your cluster would have 32 slots. Then depending on the workload of your jobs you should distribute them across both jobs (so that the total adds up to 32). A high number of operators does not necessarily mean that it needs more slots since operators can share the same slot. It mostly depends on the workload of your job. If the job should be too slow, then you would have to increase the cluster resources.

Cheers,
Till

On Fri, Nov 6, 2020 at 12:21 AM Rex Fenley <[hidden email]> wrote:
Hello,

I'm running a Job on AWS EMR with the TableAPI that does a long series of Joins, GroupBys, and Aggregates and I'd like to know how to best tune parallelism.

In my case, I have 8 EMR core nodes setup each with 4vCores and 8Gib of memory. There's a job we have to run that has ~30 table operators. Given this, how should I calculate what to set the systems parallelism to?

I also plan on running a second job on the same system, but just with 6 operators. Will this change the calculation for parallelism at all?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US



--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US

Reply | Threaded
Open this post in threaded view
|

Re: Rules of Thumb for Setting Parallelism

Till Rohrmann
Hi Rex,

You should configure the number of slots per TaskManager to be the number of cores of a machine/node. In total you will then have a cluster with #slots = #cores per machine x #machines.

If you have a cluster with 4 nodes and 8 slots each, then you have a total of 32 slots. Now if you have a job A which you start with a parallelism of 20, then you have 12 slots left. Hence, you could make use of these 12 slots by starting a job B with a parallelism 12.

Cheers,
Till

On Fri, Nov 6, 2020 at 7:20 PM Rex Fenley <[hidden email]> wrote:
Great, thanks!

So just to confirm, configure # of task slots to # of core nodes x # of vCPUs?

I'm not sure what you mean by "distribute them across both jobs (so that the total adds up to 32)". Is it configurable how many task slots a job can receive, so in this case I'd provide ~30/36 * 32 task slots for one job and ~6/36 * 32 for another job, but even them out to sum to 32 task slots?

Thanks

On Fri, Nov 6, 2020 at 10:01 AM Till Rohrmann <[hidden email]> wrote:
Hi Rex,

as a rule of thumb I recommend configuring your TMs with as many slots as they have cores. So in your case your cluster would have 32 slots. Then depending on the workload of your jobs you should distribute them across both jobs (so that the total adds up to 32). A high number of operators does not necessarily mean that it needs more slots since operators can share the same slot. It mostly depends on the workload of your job. If the job should be too slow, then you would have to increase the cluster resources.

Cheers,
Till

On Fri, Nov 6, 2020 at 12:21 AM Rex Fenley <[hidden email]> wrote:
Hello,

I'm running a Job on AWS EMR with the TableAPI that does a long series of Joins, GroupBys, and Aggregates and I'd like to know how to best tune parallelism.

In my case, I have 8 EMR core nodes setup each with 4vCores and 8Gib of memory. There's a job we have to run that has ~30 table operators. Given this, how should I calculate what to set the systems parallelism to?

I also plan on running a second job on the same system, but just with 6 operators. Will this change the calculation for parallelism at all?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US



--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US

Reply | Threaded
Open this post in threaded view
|

Re: Rules of Thumb for Setting Parallelism

Rex Fenley
Awesome, thanks!

On Sat, Nov 7, 2020 at 6:43 AM Till Rohrmann <[hidden email]> wrote:
Hi Rex,

You should configure the number of slots per TaskManager to be the number of cores of a machine/node. In total you will then have a cluster with #slots = #cores per machine x #machines.

If you have a cluster with 4 nodes and 8 slots each, then you have a total of 32 slots. Now if you have a job A which you start with a parallelism of 20, then you have 12 slots left. Hence, you could make use of these 12 slots by starting a job B with a parallelism 12.

Cheers,
Till

On Fri, Nov 6, 2020 at 7:20 PM Rex Fenley <[hidden email]> wrote:
Great, thanks!

So just to confirm, configure # of task slots to # of core nodes x # of vCPUs?

I'm not sure what you mean by "distribute them across both jobs (so that the total adds up to 32)". Is it configurable how many task slots a job can receive, so in this case I'd provide ~30/36 * 32 task slots for one job and ~6/36 * 32 for another job, but even them out to sum to 32 task slots?

Thanks

On Fri, Nov 6, 2020 at 10:01 AM Till Rohrmann <[hidden email]> wrote:
Hi Rex,

as a rule of thumb I recommend configuring your TMs with as many slots as they have cores. So in your case your cluster would have 32 slots. Then depending on the workload of your jobs you should distribute them across both jobs (so that the total adds up to 32). A high number of operators does not necessarily mean that it needs more slots since operators can share the same slot. It mostly depends on the workload of your job. If the job should be too slow, then you would have to increase the cluster resources.

Cheers,
Till

On Fri, Nov 6, 2020 at 12:21 AM Rex Fenley <[hidden email]> wrote:
Hello,

I'm running a Job on AWS EMR with the TableAPI that does a long series of Joins, GroupBys, and Aggregates and I'd like to know how to best tune parallelism.

In my case, I have 8 EMR core nodes setup each with 4vCores and 8Gib of memory. There's a job we have to run that has ~30 table operators. Given this, how should I calculate what to set the systems parallelism to?

I also plan on running a second job on the same system, but just with 6 operators. Will this change the calculation for parallelism at all?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US



--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US



--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US