Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | 16aa5d06-d8c6-416a-cb71-21663019944e@iki.fi Whole thread Raw |
In response to | Re: Removing unneeded self joins ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>) |
Responses |
Re: Removing unneeded self joins
|
List | pgsql-hackers |
On 31/10/2020 11:26, Andrey V. Lepikhov wrote: > + /* > + * Process restrictlist to seperate out the self join quals from > + * the other quals. e.g x = x goes to selfjoinquals and a = b to > + * otherjoinquals. > + */ > + split_selfjoin_quals(root, restrictlist, &selfjoinquals, > + &otherjoinquals); > + > + if (list_length(selfjoinquals) == 0) > + { > + /* > + * Have a chance to remove join if target list contains vars from > + * the only one relation. > + */ > + if (list_length(otherjoinquals) == 0) > + { > + /* Can't determine uniqueness without any quals. */ > + continue; > + > + } > + else if (!tlist_contains_rel_exprs(root, joinrelids, inner)) > + { > + if (!innerrel_is_unique(root, joinrelids, outer->relids, > + inner, JOIN_INNER, otherjoinquals, > + false)) > + continue; > + } > + else > + /* > + * The target list contains vars from both inner and outer > + * relations. > + */ > + continue; > + } > + > + /* > + * Determine if the inner table can duplicate outer rows. We must > + * bypass the unique rel cache here since we're possibly using a > + * subset of join quals. We can use 'force_cache' = true when all > + * join quals are selfjoin quals. Otherwise we could end up > + * putting false negatives in the cache. > + */ > + else if (!innerrel_is_unique(root, joinrelids, outer->relids, > + inner, JOIN_INNER, selfjoinquals, > + list_length(otherjoinquals) == 0)) > + continue; I don't understand the logic here. If 'selfjoinquals' is empty, it means that there is no join qual between the two relations, right? How can we ever remove the join in that case? And how does the target list affect that? Can you give an example query of that? > --- a/src/test/regress/expected/join.out > +++ b/src/test/regress/expected/join.out > @@ -4553,11 +4553,13 @@ explain (costs off) > select p.* from > (parent p left join child c on (p.k = c.k)) join parent x on p.k = x.k > where p.k = 1 and p.k = 2; > - QUERY PLAN > --------------------------- > + QUERY PLAN > +------------------------------------------------ > Result > One-Time Filter: false > -(2 rows) > + -> Index Scan using parent_pkey on parent x > + Index Cond: (k = 1) > +(4 rows) > > -- bug 5255: this is not optimizable by join removal > begin; That doesn't seem like an improvement... My general impression of this patch is that it's a lot of code to deal with a rare special case. At the beginning of this thread there was discussion on the overhead that this might add to planning queries that don't benefit, but adding a lot of code isn't nice either, even if the performance is acceptable. That's not necessarily a show-stopper, but it does make me much less excited about this. I'm not sure what to suggest to do about that, except a really vague "Can you make is simpler?" - Heikki
pgsql-hackers by date: