Re: Reduce "Var IS [NOT] NULL" quals during constant folding - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Date
Msg-id 82f00d16-b72b-4b09-aef6-3a157fe7d633@gmail.com
Whole thread Raw
In response to Reduce "Var IS [NOT] NULL" quals during constant folding  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Reduce "Var IS [NOT] NULL" quals during constant folding
List pgsql-hackers
On 2/7/2025 03:24, Richard Guo wrote:
> On Tue, Jul 1, 2025 at 10:57 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> I like the general idea of this work. But I wonder, why is a new hash
>> table designed to store only the notnullattnums field? From the
>> discussion, it is not apparent why not to cache all (or most of) the
>> data needed for get_relation_info. In cases where multiple subqueries
>> reference the same table, it could save some cycles and memory.
> 
> I think this idea was already thoroughly discussed earlier in this
> thread when Robert proposed moving get_relation_info() to an earlier
> stage.  One reason against it is that not every RTE_RELATION relation
> will be actively part of the query.  Collecting the whole bundle of
> catalog information for such relations is wasteful and can negatively
> impact performance.
I'm trying to understand the phrase "not every relation ...". Could you 
clarify that? I know that Postgres can eliminate some self-joins and 
outer joins, and might determine that a WHERE clause is always false, 
etc. However, these cases seem to be rare, especially when users refine 
their queries. Additionally, AFAICS, this is not an issue for partition 
pruning.

Generally, I believe these optimisations should have a positive impact. 
So, I think "not actively participate" might mean something different.

I must say that I appreciate Tom's idea and see significant benefits in 
making the parse tree a read-only structure. In complex queries, it can 
be frustrating to make copies of the parse tree, leading to complaints 
from users about insufficient memory allocation. This is why, in our 
enterprise fork, we support a specific option to avoid copying the parse 
tree multiple times.

Therefore, it would be better to find a way to refactor the 
`preprocess_relation_rtes` function to gather table statistics lazily 
into the hash table when they are needed. For example, we could do this 
at the moment of creating the `RelOptInfo` or before a subquery pull-up, 
without modifying the RTE at all.

-- 
regards, Andrei Lepikhov



pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: Explicitly enable meson features in CI
Next
From: Daniel Gustafsson
Date:
Subject: Re: Explicitly enable meson features in CI