Thread: pgsql: Improve nbtree skip scan primitive scan scheduling.

pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Peter Geoghegan
Date:
Improve nbtree skip scan primitive scan scheduling.

Don't allow nbtree scans with skip arrays to end any primitive scan on
its first leaf page without giving some consideration to how many times
the scan's arrays advanced while changing at least one skip array
(though continue not caring about the number of array advancements that
only affected SAOP arrays, even during skip scans with SAOP arrays).
Now when a scan performs more than 3 such array advancements in the
course of reading a single leaf page, it is taken as a signal that the
next page is unlikely to be skippable.  We'll therefore continue the
ongoing primitive index scan, at least until we can perform a recheck
against the next page's finaltup.

Testing has shown that this new heuristic occasionally makes all the
difference with skip scans that were expected to rely on the "passed
first page" heuristic added by commit 9a2e2a28.  Without it, there is a
remaining risk that certain kinds of skip scans will never quite manage
to clear the initial hurdle of performing a primitive scan that lasts
beyond its first leaf page (or that such a skip scan will only clear
that initial hurdle when it has already wasted noticeably-many cycles
due to inefficient primitive scan scheduling).

Follow-up to commits 92fe23d9 and 9a2e2a28.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=RVdG3zWytFWBsyW7fWH7zveFvTHed5JKEsuTT0RCO_A@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/21a152b37f36c9563d1b0b058fe1436baf578b4c

Modified Files
--------------
src/backend/access/nbtree/nbtsearch.c | 16 +++++++
src/backend/access/nbtree/nbtutils.c  | 90 +++++++++++++++++++++++------------
src/include/access/nbtree.h           |  3 +-
3 files changed, 78 insertions(+), 31 deletions(-)


Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Mark Dilger
Date:
Peter, Matthias, thanks kindly for the good work on skipscans!

While admiring the recent performance improvements, I found a test case which fails after commit 21a152b37f36c9563d1b0b058fe1436baf578b4c.  Please find a reproducible test case, attached.

Thanks!



On Fri, Apr 4, 2025 at 10:58 AM Peter Geoghegan <pg@bowt.ie> wrote:
Improve nbtree skip scan primitive scan scheduling.

Don't allow nbtree scans with skip arrays to end any primitive scan on
its first leaf page without giving some consideration to how many times
the scan's arrays advanced while changing at least one skip array
(though continue not caring about the number of array advancements that
only affected SAOP arrays, even during skip scans with SAOP arrays).
Now when a scan performs more than 3 such array advancements in the
course of reading a single leaf page, it is taken as a signal that the
next page is unlikely to be skippable.  We'll therefore continue the
ongoing primitive index scan, at least until we can perform a recheck
against the next page's finaltup.

Testing has shown that this new heuristic occasionally makes all the
difference with skip scans that were expected to rely on the "passed
first page" heuristic added by commit 9a2e2a28.  Without it, there is a
remaining risk that certain kinds of skip scans will never quite manage
to clear the initial hurdle of performing a primitive scan that lasts
beyond its first leaf page (or that such a skip scan will only clear
that initial hurdle when it has already wasted noticeably-many cycles
due to inefficient primitive scan scheduling).

Follow-up to commits 92fe23d9 and 9a2e2a28.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=RVdG3zWytFWBsyW7fWH7zveFvTHed5JKEsuTT0RCO_A@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/21a152b37f36c9563d1b0b058fe1436baf578b4c

Modified Files
--------------
src/backend/access/nbtree/nbtsearch.c | 16 +++++++
src/backend/access/nbtree/nbtutils.c  | 90 +++++++++++++++++++++++------------
src/include/access/nbtree.h           |  3 +-
3 files changed, 78 insertions(+), 31 deletions(-)



--

Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Peter Geoghegan
Date:
On Sat, Apr 26, 2025 at 10:39 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> Peter, Matthias, thanks kindly for the good work on skipscans!

Thanks!

> I found a test case which fails after commit 21a152b37f36c9563d1b0b058fe1436baf578b4c.  Please find a reproducible
testcase, attached. 

The bug isn't actually in commit
21a152b37f36c9563d1b0b058fe1436baf578b4c -- it's just an accident that
the mechanism added by that commit happens to make your test case
fail. The underlying issue was introduced in commit 8a510275, "Further
optimize nbtree search scan key comparisons".

This looks to have been a silly oversight in our handling of NULL
tuple datums within _bt_check_compare. Attached provisional fix makes
your test case pass.

--
Peter Geoghegan

Attachment

Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Mark Dilger
Date:


On Sat, Apr 26, 2025 at 8:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Sat, Apr 26, 2025 at 10:39 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> Peter, Matthias, thanks kindly for the good work on skipscans!

Thanks!

> I found a test case which fails after commit 21a152b37f36c9563d1b0b058fe1436baf578b4c.  Please find a reproducible test case, attached.

The bug isn't actually in commit
21a152b37f36c9563d1b0b058fe1436baf578b4c -- it's just an accident that
the mechanism added by that commit happens to make your test case
fail. The underlying issue was introduced in commit 8a510275, "Further
optimize nbtree search scan key comparisons".

This looks to have been a silly oversight in our handling of NULL
tuple datums within _bt_check_compare. Attached provisional fix makes
your test case pass.

--
Peter Geoghegan

I can confirm that your patch fixes the problem, having spent some four hours trying to find other test cases which still fail, finding none.

Thank you for the quick reply, and again for the work on btree.

--

Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Peter Geoghegan
Date:
On Sun, Apr 27, 2025 at 1:06 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> I can confirm that your patch fixes the problem, having spent some four hours trying to find other test cases which
stillfail, finding none. 

Great.

I pushed essentially the same patch to HEAD just now.

Thanks for the report!

--
Peter Geoghegan



Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Mark Dilger
Date:


On Mon, Apr 28, 2025 at 9:12 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Sun, Apr 27, 2025 at 1:06 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> I can confirm that your patch fixes the problem, having spent some four hours trying to find other test cases which still fail, finding none.

Great.

I pushed essentially the same patch to HEAD just now.

A similar assertion can still be reached on HEAD using the attached (rather contrived) regression test as of f60420cff66a9089a9b431f9c07f4a29aae4990a:

TRAP: failed Assert("sktrig_required"), File: "nbtutils.c", Line: 1861, PID: 93648
0   postgres                            0x0000000100d1b0e0 ExceptionalCondition + 108
1   postgres                            0x00000001008b7658 _bt_advance_array_keys + 3828
2   postgres                            0x00000001008b5dc0 _bt_check_compare + 372
3   postgres                            0x00000001008b58ac _bt_checkkeys + 200
4   postgres                            0x00000001008b1400 _bt_readpage + 764
5   postgres                            0x00000001008b0454 _bt_readnextpage + 1260
6   postgres                            0x00000001008b0ab0 _bt_next + 76
7   postgres                            0x00000001008ac644 btgettuple + 136
8   postgres                            0x000000010089dcdc index_getnext_tid + 68
9   postgres                            0x0000000100a297a0 IndexOnlyNext + 228
10  postgres                            0x0000000100a0db2c ExecScan + 228
11  postgres                            0x0000000100a39088 ExecSort + 536
12  postgres                            0x0000000100a03c68 standard_ExecutorRun + 312
13  postgres                            0x0000000100be5fb8 PortalRunSelect + 236
14  postgres                            0x0000000100be5bd4 PortalRun + 492
15  postgres                            0x0000000100be4b18 exec_simple_query + 1292
16  postgres                            0x0000000100be1d1c PostgresMain + 1388
17  postgres                            0x0000000100bdd9a8 BackendInitialize + 0
18  postgres                            0x0000000100b36dd8 postmaster_child_launch + 372
19  postgres                            0x0000000100b3b03c ServerLoop + 4948
20  postgres                            0x0000000100b39360 InitProcessGlobals + 0
21  postgres                            0x0000000100a58c00 help + 0
22  dyld                                0x000000018eb17154 start + 2476




Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: pgsql: Improve nbtree skip scan primitive scan scheduling.

From
Peter Geoghegan
Date:
On Wed, Apr 30, 2025 at 4:11 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> A similar assertion can still be reached on HEAD using the attached (rather contrived) regression test as of
f60420cff66a9089a9b431f9c07f4a29aae4990a:

I independently found the same bug just a few hours ago, using an
enhanced version of my python fuzzing script (posted to the main skip
scan thread) that now performs backwards scans. I'm not at all
surprised to see that your repro also uses ORDER BY ... DESC.

Would you post to that other thread going forward, rather than posting
to this -committers thread?

I'm going to talk about this on the main skip scan -hackers thread,
when I have a proper fix. I already have a fix for the issue that you
reported, but there seems to be an independent remaining issue with
backwards scans that I haven't quite got to the bottom of just yet.

Thanks
--
Peter Geoghegan