Re: join removal - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: join removal |
Date | |
Msg-id | 603c8f070908161431u7b07bf30g2c0ddd3eef220ebd@mail.gmail.com Whole thread Raw |
In response to | Re: join removal (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: join removal
|
List | pgsql-hackers |
On Sun, Aug 9, 2009 at 12:19 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> distinct_col_search() is going to return the relevant equality >> operator from the argument list, which is ultimately going to come >> from the RestrictInfo for the join clause. So I need to see whether >> that's compatible with the index, but equality_ops_are_compatible() >> wants two equality operators, and what I have is one equality operator >> and one operator class. > > For that you just check if the operator is a member of the class. > (You might need to verify that it's an equality operator in the class > too; not clear if the context is enough to be sure that it's not '<' > for example.) It seems that the needed checks are very similar to the ones that we already implement when setting restrictinfo->mergeopfamilies. That is filled in by get_mergejoin_opfamilies(), which checks for btree opfamilies where the strategy number is BTEqualStrategyNumber. This might cease to be the correct check in the (not-too-distant?) future if we end up implementing other kinds of unique indices, but right now btrees are all there is. One possibility would be to have relation_is_distinct_for() call get_mergejoin_opfamilies() for each operator; then for each index we can check whether the opfamily of the relevant index column is in the returned list. This seems a bit wasteful, though, since I believe that relation_is_distinct_for() would be called from joinpath.c, which has access to restrictinfo->mergeopfamilies already. I'm wondering whether it would make more sense to modify the proposed API for relation_is_distinct_for() in some way so that we don't lose this information. It seems to me that the overall process here is something like this (recalling that I'm focusing only on removing LEFT joins at this point): 1. Given a joinrel, innerrel, and outerrel, find the list of RestrictInfos for which (a) restrictinfo->mergeopfamilies != NIL, (b) restrictinfo->outer_is_left is well-defined (as per logic in select_mergejoin_clauses), and (c) the outer side is a Var. If this list is NIL, then give up; join removal is not possible. 2. Check whether any attributes from the outer side are used above the join; if so, then give up; join removal is not possible. 3. Extract the column numbers from the Vars found in step 1(C) and the mergeopfamilies found in step 1(A). 4. Look a unique, non-expression index (which must also have index->indpred == NIL or index->predOK) for which every column number appears in the list of column numbers computed in step 3, with one of the corresponding opfamilies also found in step (2). If one is found, then the join is removable. Thoughts? ...Robert
pgsql-hackers by date: