Re: Index Searches higher than expected for skip scan - Mailing list pgsql-performance
| From | Peter Geoghegan |
|---|---|
| Subject | Re: Index Searches higher than expected for skip scan |
| Date | |
| Msg-id | CAH2-Wz=BrVUghRp61nwgL2DZwMQ8EJfFphzC43QPhjm07U-61A@mail.gmail.com Whole thread Raw |
| In response to | Index Searches higher than expected for skip scan (Michael Christofides <michael@pgmustard.com>) |
| Responses |
Re: Index Searches higher than expected for skip scan
|
| List | pgsql-performance |
On Thu, Nov 6, 2025 at 2:01 PM Michael Christofides <michael@pgmustard.com> wrote: > I'm trying to understand the new Index Searches field in Postgres 18 explain analyze output. I've put together a supersimple test case (below) expecting a skip scan with 2 Index Searches, one for each value in the leading (boolean) columnof the index. That's a good question. You're right that in this specific case, with a boolean column, skip scan performs 4 index searches, when in principle it only really needs to do 3. > In reality, instead of 2 Index Searches, I got 4 (query plan below). During work on skip scan (and on the 17 work), I went to quite a lot of effort to build custom instrumentation that would show me exactly what an index scan was doing. Including details of the scan's array keys, and how they advance as the scan advances through the index. Attached is its output when I run your test query. The issue here is that skip scan thinks that there are 4 distinct skip array values that it must use: 1. SK_BT_MINVAL 2. false 3. true 4. SK_ISNULL SK_BT_MINVAL is a sentinel value that represents the lowest possible value that can be stored by the data type. That's the same thing as "false", which is a little bit unfortunate with a datatype like bool, where the difference might actually matter -- here we waste an access to the leftmost leaf page to "advance" the skip array from SK_BT_MINVAL to false, instead of making false the lowest skip array value directly, from the very start. This is highly unlikely to matter with a data type like integer, though: in practice it's very unlikely that the value INT_MIN is actually stored in the index, so having 2 distinct representations for the same thing (SK_BT_MINVAL and INT_MIN) is equally unlikely to result in the scan reading any more leaf pages than strictly necessary due to this implementation deficiency. We'll have to go to the leftmost leaf page to determine what the real lowest value in the index is either way -- doesn't matter if you start from SK_BT_MINVAL or from INT_MIN (unless there really are hundreds of tuples that have the value INT_MIN, when the difference between those 2 things starts to matter, as with your bool test case). I did actually think about making this work. In fact, there were revisions of the skip scan patch where it actually did work. Ultimately I judged that it wasn't worth the trouble of introducing this special case. Better to have types with skip support (like boolean and integer) behave exactly the same as all other types, by also using these SK_BT_MINVAL/SK_BT_MAXVAL sentinels (we don't use SK_BT_MAXVAL at all here because the highest skip array value is SK_ISNULL, but we would if the index was declared NULLS FIRST). That just leaves SK_ISNULL. We cannot assume that the index doesn't have any NULLs (unless the query uses IS NOT NULL directly). We might end up having to read the rightmost leaf page anyway, in which case there won't be an extra search for SK_ISNULL, but we'll likely need an extra search for this too (just like the leftmost leaf page, with SK_BT_MINVAL). This isn't an implementation deficiency -- it's strictly necessary for correctness. -- Peter Geoghegan
Attachment
pgsql-performance by date: