Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables |
Date | |
Msg-id | CAFjFpRfneFG3H+F6BaiXemMrKF+FY-POpx3Ocy+RiH3yBmXSNw@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables |
List | pgsql-hackers |
On Tue, Sep 19, 2017 at 2:35 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Sep 18, 2017 at 8:02 AM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: >> partition pruning might need partexprs look up relevant quals, but >> nullable_partexprs doesn't have any use there. So may be we should add >> nullable_partexpr to RelOptInfo as part of 0002 (partition-wise join >> implementation) instead of 0001. What do you think? > > +1. Done. > >>> - I'm not entirely sure whether maintaining partexprs and >>> nullable_partexprs is the right design. If I understand correctly, >>> whether or not a partexpr is nullable is really a per-RTI property, >>> not a per-expression property. You could consider something like >>> "Relids nullable_rels". >> >> That's true. However in order to decide whether an expression falls on >> nullable side of a join, we will need to call pull_varnos() on it and >> check the output against nullable_rels. Separating the expressions >> themselves avoids that step. > > Good point. Also, I'm not sure about cases like this: > > SELECT * FROM (SELECT b.x, b.y FROM a LEFT JOIN b ON a.x = b.x WHERE > a.y = b.y) w LEFT JOIN c ON w.x = c.x AND w.y = c.y; > > Suppose the relations are all partitioned by (x, y) but that the = > operator is not strict. A partition-wise join is valid between a and > b, but we can't regard w as partitioned any more, because w.x might > contain nulls in partitions where the partitioning scheme wouldn't > allow them. On the other hand, if the subquery were to select a.x, > a.y then clearly it would be fine: there would be no possibility of a > NULL having been substituted for a proper value. > > What if the subquery selected a.x, b.y? Initially, I thought that > would be OK too, because of the fact that the a.y = b.y clause is in > the WHERE clause rather than the join condition. But on further > thought I think that probably doesn't work, because with = being a > non-strict operator there's no guarantee that it would remove any > nulls introduced by the left join. Of course, if the subselect had a > WHERE clause saying that b.x/b.y IS NOT NULL then having the SELECT > list mention those columns would be fine. > I am actually not sure whether we can use partition-wise join for a LEFT JOIN b when the partition key equalities are spread across ON and WHERE clauses. I am not able to find any example against it, but I am not able to prove it as well. The reference I used for partition-wise join [1], mentions JOIN conditions i.e. ON clause conditions. But all the examples used in that paper are that of INNER join. So, I am not sure what exactly the authors meant by JOIN conditions. Right now I am restricting the patch to work with only conditions in the ON clause. Practically most of the operators are strict. OUTER join's WHERE clause has any partition key equality with strict operator, optimizer will turn that OUTER join into an INNER one, turning all clauses into join clauses. That will enable partition-wise join. So, the current restriction doesn't restrict any practical cases. OTOH, I have seen that treating ON and WHERE clauses as same for an OUTER join leads to surprising results. So, I am leaning to treat them separate for partition-wise join as well and only use ON clause conditions for partition-wise join. If we get complaints about partition-wise join not being picked we will fix them after proving that it's not harmful. Lifting that restriction is not so difficult. have_partition_key_equijoin() ignores "pushed down" quals. We have to just change that condition. Your last sentence about a clause b.x IS NOT NULL or b.y IS NOT NULL is interesting. If those conditions are in ON clause, we may still have a result where b.x and b.y as NULL when no row in "a" matches a row in "b". If those conditions are in WHERE clause, I think optimizer will turn the join into an INNER join irrespective of whether the equality operator is strict. > >> If partition-wise join is disabled, partition-wise aggregates, >> strength reduction of MergeAppend won't be possible on a join tree, >> but those will be possible on a base relation. Even if partition-wise >> join enabled, one may want to disable other partition-wise >> optimizations individually. So, they are somewhat independent >> switches. I don't think we should bundle all of those into one. >> Whatever names we choose for those GUCs, I think they should have same >> naming convention e.g. "partition_wise_xyz". I am open to suggestions >> about the names. > > I think the chances of you getting multiple GUCs for different > partition-wise optimizations past Tom are pretty low. We do have enable_hashjoin and enable_hashagg to control use of hashing for aggregate and join. On similar lines we can have three GUCs to enable use of partition-wise strategy, one for each of join, aggregation and sorting. Having granular switches would be useful for debugging and may be to turn partition-wise strategies off when they are not optimal. Do we want a switch to turn ON/OFF partition pruning? Said, that I am fine with single GUC controlling all. We won't set any partitioning information in RelOptInfo if that GUC is turned OFF. [1] https://pdfs.semanticscholar.org/27c2/ba75f8b6a39d4bce85d5579dace609c9abaa.pdf -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: