Home > mailing lists

Re: Removing unneeded self joins - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Removing unneeded self joins
Date	November 27, 2020 16:49:57
Msg-id	16aa5d06-d8c6-416a-cb71-21663019944e@iki.fi Whole thread
In response to	Re: Removing unneeded self joins ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
Responses	Re: Removing unneeded self joins
List	pgsql-hackers

Tree view

On 31/10/2020 11:26, Andrey V. Lepikhov wrote:
> +            /*
> +             * Process restrictlist to seperate out the self join quals from
> +             * the other quals. e.g x = x goes to selfjoinquals and a = b to
> +             * otherjoinquals.
> +             */
> +            split_selfjoin_quals(root, restrictlist, &selfjoinquals,
> +                                 &otherjoinquals);
> +
> +            if (list_length(selfjoinquals) == 0)
> +            {
> +                /*
> +                 * Have a chance to remove join if target list contains vars from
> +                 * the only one relation.
> +                 */
> +                if (list_length(otherjoinquals) == 0)
> +                {
> +                    /* Can't determine uniqueness without any quals. */
> +                    continue;
> +
> +                }
> +                else if (!tlist_contains_rel_exprs(root, joinrelids, inner))
> +                {
> +                    if (!innerrel_is_unique(root, joinrelids, outer->relids,
> +                                         inner, JOIN_INNER, otherjoinquals,
> +                                         false))
> +                        continue;
> +                }
> +                else
> +                    /*
> +                     * The target list contains vars from both inner and outer
> +                     * relations.
> +                     */
> +                    continue;
> +            }
> +
> +            /*
> +             * Determine if the inner table can duplicate outer rows.  We must
> +             * bypass the unique rel cache here since we're possibly using a
> +             * subset of join quals. We can use 'force_cache' = true when all
> +             * join quals are selfjoin quals.  Otherwise we could end up
> +             * putting false negatives in the cache.
> +             */
> +            else if (!innerrel_is_unique(root, joinrelids, outer->relids,
> +                                         inner, JOIN_INNER, selfjoinquals,
> +                                         list_length(otherjoinquals) == 0))
> +                continue;

I don't understand the logic here. If 'selfjoinquals' is empty, it means 
that there is no join qual between the two relations, right? How can we 
ever remove the join in that case? And how does the target list affect 
that? Can you give an example query of that?

> --- a/src/test/regress/expected/join.out
> +++ b/src/test/regress/expected/join.out
> @@ -4553,11 +4553,13 @@ explain (costs off)
>  select p.* from
>    (parent p left join child c on (p.k = c.k)) join parent x on p.k = x.k
>    where p.k = 1 and p.k = 2;
> -        QUERY PLAN        
> ---------------------------
> +                   QUERY PLAN                   
> +------------------------------------------------
>   Result
>     One-Time Filter: false
> -(2 rows)
> +   ->  Index Scan using parent_pkey on parent x
> +         Index Cond: (k = 1)
> +(4 rows)
>  
>  -- bug 5255: this is not optimizable by join removal
>  begin;

That doesn't seem like an improvement...

My general impression of this patch is that it's a lot of code to deal 
with a rare special case. At the beginning of this thread there was 
discussion on the overhead that this might add to planning queries that 
don't benefit, but adding a lot of code isn't nice either, even if the 
performance is acceptable. That's not necessarily a show-stopper, but it 
does make me much less excited about this. I'm not sure what to suggest 
to do about that, except a really vague "Can you make is simpler?"

- Heikki

pgsql-hackers by date:

From: Stephen Frost
Date: 27 November 2020, 16:28:20
Subject: Re: A few new options for CHECKPOINT

From: Alvaro Herrera
Date: 27 November 2020, 16:53:07
Subject: Re: remove spurious CREATE INDEX CONCURRENTLY wait

Re: Removing unneeded self joins - Mailing list pgsql-hackers

Previous

Next