Thread: IN joining

IN joining

From

Dennis Haney

Date:

05 March 2004, 13:29:08

Hi

I have a problem understanding the code to make certain in join are 
performed properly. Specifically I have problems understading when 
IN_UNIQUE_{INNER,OUTER} is a valid jointype.
Its in joinrels.c:make_join_rel.

Consider this example:

SELECT * FROM a,b WHERE a.id = b.id AND (a.id) IN (SELECT c.id FROM c)

the possible execution trees are {{a,b}, {c}}, {{a,c},{b}} and the code 
seems to also permit {{b,c},{a}}. It is the latter I'm having problems with.

When joining {b} and {c} it will fall through and suggest a 
IN_UNIQUE_{INNER,OUTER} jointype.

My logic is this: {c} \in {c,b} so it is a valid plan according to the 
first check.
We have an issue according to the second check and we haven't done the 
work before according to the 3rd and 4th checks.
Since the lefthand of the IN {a} is not in either {b} or {c} we skip the 
IN_JOIN{_REVERSE}.
But since one of the relations is equal to the right side {c} of the IN 
we determine that IN_UNIQUE_{INNER,OUTER} is a valid jointype.

Now, the next join between {a} and {b,c} is the one I fail to understand 
when it can ever happen...

{c} \in {a,b,c} so it is a valid plan according to the first check.
We have an issue according to the second check.
Since we have no trace of the IN's left hand {a} in {b,c} 3rd and 4th 
check says we have not done the work?!?
The final checks fail because {c} != {b,c}, thus we determine it is an 
invalid plan.

My question is: When is it ever a valid jointype to use 
IN_UNIQUE_{INNER,OUTER}? Or am I missing something?


-- 
Dennis

Re: IN joining

From

Tom Lane

Date:

05 March 2004, 17:05:04

Dennis Haney <davh@diku.dk> writes:
> Consider this example:
> SELECT * FROM a,b WHERE a.id = b.id AND (a.id) IN (SELECT c.id FROM c)
> the possible execution trees are {{a,b}, {c}}, {{a,c},{b}} and the code 
> seems to also permit {{b,c},{a}}.

No, it does not --- as you say, that would give wrong answers.  That
case is eliminated by the tests following this comment:
            * JOIN_IN technique will work if outerrel includes LHS and            * innerrel is exactly RHS; conversely
JOIN_REVERSE_INhandles            * RHS/LHS.            *            * JOIN_UNIQUE_OUTER will work if outerrel is
exactlyRHS;            * conversely JOIN_UNIQUE_INNER will work if innerrel is            * exactly RHS.
 

Joining {b,c} to {a} does not meet any of those four allowed cases.
        regards, tom lane

Re: IN joining

From

Dennis Haney

Date:

05 March 2004, 17:32:05

Tom Lane wrote: <blockquote cite="mid11205.1078520699@sss.pgh.pa.us" type="cite"><pre wrap="">Dennis Haney <a
class="moz-txt-link-rfc2396E"href="mailto:davh@diku.dk"><davh@diku.dk></a> writes: </pre><blockquote
type="cite"><prewrap="">Consider this example:
 
SELECT * FROM a,b WHERE a.id = b.id AND (a.id) IN (SELECT c.id FROM c)
the possible execution trees are {{a,b}, {c}}, {{a,c},{b}} and the code 
seems to also permit {{b,c},{a}}.   </pre></blockquote><pre wrap="">
No, it does not --- as you say, that would give wrong answers.  That
case is eliminated by the tests following this comment:
            * JOIN_IN technique will work if outerrel includes LHS and            * innerrel is exactly RHS; conversely
JOIN_REVERSE_INhandles            * RHS/LHS.            *            * JOIN_UNIQUE_OUTER will work if outerrel is
exactlyRHS;            * conversely JOIN_UNIQUE_INNER will work if innerrel is            * exactly RHS.
 

Joining {b,c} to {a} does not meet any of those four allowed cases. </pre></blockquote> Exactly my point... So why ever
bothercreating the {b,c} node which is legal by the above definition?<br /><br /><br /><pre class="moz-signature"
cols="72">--
 
Dennis
use Inline C => q{void p(char*g){
printf("Just Another %s Hacker\n",g);}};p("Perl");
</pre>

Re: IN joining

From

Tom Lane

Date:

05 March 2004, 19:08:56

Dennis Haney <davh@diku.dk> writes:
>> Joining {b,c} to {a} does not meet any of those four allowed cases.
>> 
> Exactly my point... So why ever bother creating the {b,c} node which is 
> legal by the above definition?

We don't, because there is no such join clause.
        regards, tom lane

Re: IN joining

From

Dennis Haney

Date:

08 March 2004, 10:03:45

Tom Lane wrote: <blockquote cite="mid12170.1078528135@sss.pgh.pa.us" type="cite"><pre wrap="">Dennis Haney <a
class="moz-txt-link-rfc2396E"href="mailto:davh@diku.dk"><davh@diku.dk></a> writes: </pre><blockquote
type="cite"><blockquotetype="cite"><pre wrap="">Joining {b,c} to {a} does not meet any of those four allowed cases.
 
     </pre></blockquote><pre wrap="">Exactly my point... So why ever bother creating the {b,c} node which is 
legal by the above definition?   </pre></blockquote><pre wrap="">
We don't, because there is no such join clause.
 </pre></blockquote> No, but we create the equality via the implied equality mechanism...<br /><br /> select * from a,
bwhere a.id = b.id3 and a.id in (select c.id2 from c);<br /><br /> rtable is (after in-optimization):<br /> resno  
refname        relid   inFromCl<br /> -----   ---------       -----   --------<br /> 1       a     17143          
inFromCl<br/> 2       b     17151           inFromCl<br /> 3       IN_subquery     [subquery]<br /> 4       c    
17147          inFromCl<br /><br /> in gdb:<br /> break joinrels.c:563<br /> commands<br /> call
bms_is_subset(ininfo->lefthand,rel1->relids)<br /> call bms_equal(ininfo->righthand, rel2->relids)<br />
callbms_is_subset(ininfo->lefthand, rel2->relids)<br /> call bms_equal(ininfo->righthand, rel1->relids)<br
/>x/t rel1->relids.words<br /> x/t rel2->relids.words<br /> x/t joinrelids.words<br /> p jointype<br /> printf
"%s\n",pretty_format_node_dump(nodeToString(((RestrictInfo*)((RestrictInfo*)restrictlist)->clause)->clause))<br
/>end<br /><br /> then we get this join:<br /><br /> Breakpoint 4, make_join_rel (root=0x8307bc8, rel1=0x8316920,
rel2=0x8316b10,jointype=JOIN_UNIQUE_INNER)<br />     at joinrels.c:563<br /> 563             switch (jointype)<br />
$92= 0 '\0'<br /> $93 = 1 '\001'<br /> $94 = 0 '\0'<br /> $95 = 0 '\0'<br /> 0x83169ac:     
00000000000000000000000000000100<br/> 0x8316b9c:      00000000000000000000000000010000<br /> 0x832670c:     
00000000000000000000000000010100<br/> $96 = JOIN_UNIQUE_INNER<br />    {OPEXPR<br />    :opno 96<br />    :opfuncid
0<br/>    :opresulttype 16<br />    :opretset false<br />    :args (<br />       {VAR<br />       :varno 4<br />      
:varattno1<br />       :vartype 23<br />       :vartypmod -1<br />       :varlevelsup 0<br />       :varnoold 4<br />
     :varoattno 1<br />       }<br /><br />       {VAR<br />       :varno 2<br />       :varattno 1<br />      
:vartype23<br />       :vartypmod -1<br />       :varlevelsup 0<br />       :varnoold 2<br />       :varoattno 1<br />
     }<br />    )<br />    }<br /><br /><br /><pre class="moz-signature" cols="72">-- 
 
Dennis
</pre>

Re: IN joining

From

Tom Lane

Date:

08 March 2004, 11:44:06

Dennis Haney <davh@diku.dk> writes:
>>> Exactly my point... So why ever bother creating the {b,c} node which is 
>>> legal by the above definition?
>> 
>> We don't, because there is no such join clause.
>> 
> No, but we create the equality via the implied equality mechanism...

> select * from a, b where a.id = b.id3 and a.id in (select c.id2 from c);

Oh, I had forgotten that your original example involved an implied
equality.  I don't see that anything is wrong though.  The join path
that will result from considering the implied equality will be like
((UNIQUE-ified subselect) INNER JOIN b) INNER JOIN a

which is perfectly legal and perhaps even a winner.  Once you stick a
UNIQUE on top of the IN's subselect, you can treat the IN as exactly
like a plain equality join.

[ thinks a bit... ]  Actually I guess there is a problem here: we won't
actually generate that plan, because this test is too strict:
           /*            * If we already joined IN's RHS to any part of its LHS in            * either input path, then
thisjoin is not constrained (the            * necessary work was done at a lower level).            */           if
(bms_overlap(ininfo->lefthand,rel1->relids) &&               bms_is_subset(ininfo->righthand, rel1->relids))
  continue;           if (bms_overlap(ininfo->lefthand, rel2->relids) &&               bms_is_subset(ininfo->righthand,
rel2->relids))              continue;
 

I think it should be
           /*            * If we already joined IN's RHS to anything else in            * either input path, then this
joinis not constrained (the            * necessary work was done at a lower level).            */           if
(bms_is_subset(ininfo->righthand,rel1->relids) &&               !bms_equal(ininfo->righthand, rel1->relids))
  continue;           if (bms_is_subset(ininfo->righthand, rel2->relids) &&               !bms_equal(ininfo->righthand,
rel2->relids))              continue;
 

Comments?
        regards, tom lane

Re: IN joining

From

Dennis Haney

Date:

08 March 2004, 12:04:55

Tom Lane wrote:

[SNIP: a repetion of my first post ;) ]

>I think it should be
>
>            /*
>             * If we already joined IN's RHS to anything else in
>             * either input path, then this join is not constrained (the
>             * necessary work was done at a lower level).
>             */
>            if (bms_is_subset(ininfo->righthand, rel1->relids) &&
>                !bms_equal(ininfo->righthand, rel1->relids))
>                continue;
>            if (bms_is_subset(ininfo->righthand, rel2->relids) &&
>                !bms_equal(ininfo->righthand, rel2->relids))
>                continue;
>
>Comments?
>  
>
It's good.
It was pretty much what I was thinking was wrong to begin with.
Whether the generated plans are valid is a different issue ;)

-- 
Dennis