Hello, I have a job that's a series of Joins, GroupBys, and Aggs, and it's bottlenecked in one of the joins. The join's inputs are ~300 million rows on the left and ~200 million rows on the right, all with unique keys. I'm seeing this in the plan for the bottlenecked Join:
Join(joinType=[InnerJoin], where=[(user_id = id0)], select=[id,
group_id, user_id, uuid, owner, id0, deleted_at],
leftInputSpec=[HasUniqueKey], rightInputSpec=[JoinKeyContainsUniqueKey])
My first question is: what is the difference between leftInputSpec=[HasUniqueKey] and rightInputSpec=[JoinKeyContainsUniqueKey]? Is the left side not using the join key for hashing the join, but instead using its pk id, which would perform poorly? Is there anything else about this that stands out? Thanks!
-- Rex Fenley | Software Engineer - Mobile and Backend, Remind.com
Hi Rex,
"HasUniqueKey" means that the left input has a unique key. "JoinKeyContainsUniqueKey" means that the join key of the right side contains the unique key of this relation. Hence, it looks normal to me.
Cheers, Till
On Fri, Nov 6, 2020 at 7:29 PM Rex Fenley <[hidden email]> wrote:
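To make the distinction in Till's answer concrete, here is a minimal sketch of how the three input specs relate a side's unique keys to its join key. It is an illustration only, not Flink's planner code: the function `input_spec` is hypothetical, and the assumption that the right table's unique key is the column shown as `id0` in the plan is inferred from the plan output rather than stated in the thread.

```python
def input_spec(unique_keys, join_key):
    """Classify one join input by how its unique keys relate to the join key.

    unique_keys: iterable of column-name sets, one per unique key of the input.
    join_key:    set of column names this input is joined on.
    (Hypothetical helper for illustration; not part of any Flink API.)
    """
    join_key = frozenset(join_key)
    # Ignore empty entries so that "no key" is never treated as a key.
    keys = [frozenset(k) for k in unique_keys if k]

    if not keys:
        return "NoUniqueKey"
    # Some unique key is fully covered by the join key columns.
    if any(k <= join_key for k in keys):
        return "JoinKeyContainsUniqueKey"
    # The input has a unique key, but the join key does not include it.
    return "HasUniqueKey"


# Left side of the plan: unique key is `id`, but the join is on `user_id`.
left = input_spec(unique_keys=[{"id"}], join_key={"user_id"})

# Right side (assumed): unique key is the column shown as `id0`, which is also the join key.
right = input_spec(unique_keys=[{"id0"}], join_key={"id0"})

print(left, right)
```

Under these assumptions the sketch reproduces the two specs from the plan: the left side has a unique key that is not part of the join key, while the right side is joined on its unique key.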
Thank you for the clarification.
On Sat, Nov 7, 2020 at 7:37 AM Till Rohrmann <[hidden email]> wrote: