Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) |
Date | |
Msg-id | CAH2-Wzm5M+Djb8dJnGdiP8dwN7fZDNR-PSwNekGvKxa9-a_9cQ@mail.gmail.com Whole thread Raw |
In response to | Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
|
List | pgsql-hackers |
On Tue, Aug 6, 2024 at 5:42 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > Attached is version 16 now. I ran this with my old land registry benchmark, used for validating the space utilization impact of nbtree deduplication (among other things). This isn't obviously the best benchmark for this sort of thing, but I seem to recall that you'd used it yourself at some point. To validate work in this area, likely including this patch. So I decided to start there. To be clear, this test involves bulk loading of an unlogged table (the land registry table). The following composite index is created on the table before we insert any rows, so most of the cycles here are in index maintenance including _bt_search descents: CREATE INDEX composite ON land2 USING btree (county COLLATE "C", city COLLATE "C", locality COLLATE "C"); I wasn't able to see much of an improvement with this patch applied. It went from ~00:51.598 to ~00:51.053. That's a little disappointing, given that this is supposed to be a sympathetic case for the patch. Can you suggest something else? (Granted, I understand that this patch has some complicated relationship with other patches of yours, which I don't understand currently.) I'm a bit worried about side-effects for this assertion: @@ -485,7 +489,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel, Assert(insertstate->bounds_valid); Assert(insertstate->low >= P_FIRSTDATAKEY(opaque)); Assert(insertstate->low <= insertstate->stricthigh); - Assert(_bt_compare(rel, itup_key, page, offset) < 0); + Assert(_bt_compare(rel, itup_key, page, offset, &sprefix) < 0); break; } More generally, it's not entirely clear how the code in _bt_check_unique is supposed to work with the patch. Why should it be safe to do what you're doing with the prefix there? It's not like we're doing a binary search here -- it's more like a linear search. > - Move from 1-indexed AttributeNumber to 0-indexed ints for prefixes, > and use "prefix" as naming scheme, rather than cmpcol. A lack of > prefix, previously indicated with a cmpcol value of 1, is now a prefix > value of 0. Found a likely-related bug in the changes you made to amcheck, which I was able to fix myself like so: diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index c7dc6725a..15be61777 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -3187,7 +3187,7 @@ bt_rootdescend(BtreeCheckState *state, IndexTuple itup) insertstate.buf = lbuf; /* Get matching tuple on leaf page */ - offnum = _bt_binsrch_insert(state->rel, &insertstate, 1); + offnum = _bt_binsrch_insert(state->rel, &insertstate, 0); /* Compare first >= matching item on leaf page, if any */ -- Peter Geoghegan
pgsql-hackers by date: