Re: PostgreSQL 18 GA press release draft - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: PostgreSQL 18 GA press release draft
Date
Msg-id CAH2-WzkKgCEnXyaaXPB-a1JrdC9uRPvSEiAhEvb6M+vMTSTdaQ@mail.gmail.com
Whole thread Raw
In response to PostgreSQL 18 GA press release draft  ("Jonathan S. Katz" <jkatz@postgresql.org>)
List pgsql-hackers
On Tue, Sep 9, 2025 at 11:13 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> Attached is a draft of the PostgreSQL 18 GA press release.

I have some feedback on this sentence:

"It can also automatically optimize queries using `OR` or `IN (VALUES
...)` in `WHERE` clauses for faster execution".

This is factually correct, but I think that it gives too much
importance to the `IN (VALUES ...)` transformation added by commit
c0962a11. IMV we shouldn't mention anything about transformations that
affect queries that use IN(), since it only applies to `IN (VALUES
...)` -- which is a rather limited special case. Especially because
this IN(VALUES()) case is limited to transforming queries that only
have true constants in the VALUES() clause -- it cannot work with
parameters at all.

I say this in part because I've noticed that existing press reports
about this functionality (which were based on the beta1 announcement)
say that it affects IN() queries in general, which isn't true. Again,
I know that you haven't made that same mistake here -- but a lot of
people will read `IN (VALUES ...)` as "any and all IN() lists".
They'll tend to interpret "VALUES" as "some values that appear in an
IN()", and not "a VALUES() clause that appears in an IN()".

The work from commits d4378c00 and ae456916 is truly important, and
definitely merits prominent mention in the press release. That'll
transform a query written as "SELECT * FROM tenk1 WHERE tenthous = 1
OR tenthous = 3 OR tenthous = 42 OR tenthous = 0" into a
representation that was previous only used when the query was written
"SELECT * FROM tenk1 WHERE tenthous IN (1,3,42,0)" (namely, it
transforms the original such that we can use the ScalarArrayOpExpr
representation).

This transformation is particularly useful in cases where it'll allow
us to get an index-only scan plan instead of a BitmapOr plan (with one
bitmap index scan child node for each of the 3 "tenthous" values from
the query), which is the only plan we could ever get on earlier
releases. The transformation process for these OR cases *can* work
with dynamic parameters (unlike the VALUES() stuff), and so can even
be used on the inner side of a join (see also commit 627d6341, which
dealt with making it possible to use OR transformation with joins).

Putting it all together, I suggest the following alternative:

"It can also automatically transform queries with `OR` constructs in
their `WHERE` clause into a logically equivalent IN() representation
that can be pushed down to index scan nodes, leading to significantly
faster execution".

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Natalya Aksman
Date:
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Next
From: Robert Haas
Date:
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)