Re: Adding skip scan (including MDAM style range skip scan) to nbtree - Mailing list pgsql-hackers

From Natalya Aksman
Subject Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date
Msg-id CAJumhcirfMojbk20+W0YimbNDkwdECvJprQGQ-XqK--ph09nQw@mail.gmail.com
In response to Re: Adding skip scan (including MDAM style range skip scan) to nbtree  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Adding skip scan (including MDAM style range skip scan) to nbtree
List pgsql-hackers
TimescaleDB has implemented a multikey skip scan feature for queries like "select distinct key1, key2 ... from t_indexed_on_key1_key2". It pins key1 to a key value it has found (i.e. key1=val1) in order to skip over the distinct values of key2. Once the values for key1=val1 are exhausted, the next distinct tuple is searched for with key1>val1.

In short, this implementation can change the scan key structure itself from "key1=val1" to "key1>val1" and back, and not just the key comparison value (i.e. val1).
This means that so->skipScan may need to change from true to false on the next call to _bt_preprocess_keys.
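
A minimal sketch of the alternation described above, using a sorted in-memory array in place of a real index. This is not TimescaleDB or nbtree code: Row, Key1Qual, the seek helpers, and skip_scan_distinct are hypothetical names introduced only for illustration, and binary search stands in for an index descent. It counts how often the qual on key1 changes shape between "=" and ">".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry of an index on (key1, key2), kept in sorted order. */
typedef struct { int key1; int key2; } Row;

/* Shape of the qual currently applied to key1. */
typedef enum { KEY1_GT, KEY1_EQ } Key1Qual;

/* First position with key1 > val1, or n if there is none. */
static size_t seek_key1_gt(const Row *rows, size_t n, int val1)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].key1 <= val1)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* First position with key1 = val1 AND key2 > val2, or n if there is none. */
static size_t seek_key1_eq_key2_gt(const Row *rows, size_t n, int val1, int val2)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].key1 < val1 ||
            (rows[mid].key1 == val1 && rows[mid].key2 <= val2))
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && rows[lo].key1 == val1) ? lo : n;
}

/*
 * Writes the distinct (key1, key2) pairs to out and returns how many were
 * written. *shape_changes counts switches between "key1 = val1" and
 * "key1 > val1", i.e. changes to the scan key structure, not just to val1.
 */
size_t skip_scan_distinct(const Row *rows, size_t n, Row *out, size_t out_cap,
                          size_t *shape_changes)
{
    size_t count = 0, pos = 0;
    Key1Qual qual = KEY1_GT;    /* the first descent acts like "key1 > -infinity" */

    assert(rows != NULL && out != NULL && shape_changes != NULL);
    *shape_changes = 0;

    while (pos < n && count < out_cap) {
        Row cur = rows[pos];
        out[count++] = cur;

        /* Pin key1: look for "key1 = cur.key1 AND key2 > cur.key2". */
        if (qual != KEY1_EQ) {
            qual = KEY1_EQ;
            (*shape_changes)++;
        }
        pos = seek_key1_eq_key2_gt(rows, n, cur.key1, cur.key2);

        if (pos == n) {
            /* key1 = cur.key1 is exhausted: unpin and use "key1 > cur.key1". */
            qual = KEY1_GT;
            (*shape_changes)++;
            pos = seek_key1_gt(rows, n, cur.key1);
        }
    }
    return count;
}
```

In this toy model every distinct key1 value costs two changes of qual shape, which is why a rescan that only expects a new comparison value is not enough for this access pattern.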

But after btrescan sets "so->numberOfKeys = 0", so->skipScan is not reset to "false" in _bt_preprocess_keys because of this code: https://github.com/postgres/postgres/blob/9016fa7e3bcde8ae4c3d63c707143af147486a10/src/backend/access/nbtree/nbtpreprocesskeys.c#L1847
After "so->numberOfKeys = 0" is set, we return at line 1847, before we reach line 1874 where we do "so->skipScan = (numSkipArrayKeys > 0);": https://github.com/postgres/postgres/blob/9016fa7e3bcde8ae4c3d63c707143af147486a10/src/backend/access/nbtree/nbtpreprocesskeys.c#L1874

I.e. if btrescan sets "so->numberOfKeys = 0", _bt_preprocess_keys returns before resetting so->skipScan to false.
This is not an issue when the scan key structure is not changed in amrescan, and I see that this is the intended usage.
But if the intended amrescan usage changes in the future, the issue may come up.

It's not a priority at the moment as we can reset so->skipScan in our extension.
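
For illustration, here is a toy model of the control flow described above. It is not the nbtree source: ToyScanState and the toy_* functions are hypothetical, and the early-exit condition (no skip arrays needed) is an assumption made for the sketch. All the message states is that the function returns at line 1847 before reaching the assignment at line 1874. The last function models the extension-side reset mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two scan-state fields discussed above. */
typedef struct {
    int  numberOfKeys;  /* 0 means "preprocess again on the next scan start" */
    bool skipScan;      /* should be true only if the current keys use skip arrays */
} ToyScanState;

/* Models the rescan step: only numberOfKeys is cleared. */
void toy_rescan(ToyScanState *so)
{
    assert(so != NULL);
    so->numberOfKeys = 0;
}

/* Models preprocessing with an early return ahead of the flag assignment. */
void toy_preprocess(ToyScanState *so, int numInputKeys, int numSkipArrayKeys)
{
    assert(so != NULL);
    so->numberOfKeys = numInputKeys;
    if (numSkipArrayKeys == 0)
        return;     /* assumed exit condition: skipScan keeps its old value */
    so->skipScan = (numSkipArrayKeys > 0);
}

/* Extension-side workaround: clear the flag together with numberOfKeys. */
void toy_rescan_with_reset(ToyScanState *so)
{
    toy_rescan(so);
    so->skipScan = false;
}
```

With this model, preprocessing once with a skip array and then rescanning with a key shape that needs none leaves skipScan stale at true, while the same sequence through toy_rescan_with_reset ends with skipScan false.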

Thank you,
Natalya Aksman.



On Wed, Sep 10, 2025 at 12:46 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Sep 10, 2025 at 9:53 AM Natalya Aksman <natalya@tigerdata.com> wrote:
> Our TimescaleDB extension has scenarios that change ">" quals to "=" and back on rescan, and it breaks when so->skipScan needs to be reset from true to false.

But the amrescan docs say:

"In practice the restart feature is used when a new outer tuple is
selected by a nested-loop join and so a new key comparison value is
needed, but the scan key structure remains the same" [1].

I don't understand why it is that our not resetting the so->skipScan
flag within btrescan has any particular significance to Timescaledb,
relative to all of the other fields that are supposed to be set by
_bt_preprocess_keys. What is the actual failure you see? Is it an
assertion failure within _bt_readpage/_bt_checkkeys?

Note that btrescan *does* set "so->numberOfKeys = 0", which will make
the next call to _bt_preprocess_keys (from _bt_first) perform
preprocessing from scratch. This should set so->skipScan from scratch
on each rescan (along with every other field set by preprocessing). It
seems like that should work for you (in spite of the fact that you're
doing something that seems at odds with the index AM API).

[1] https://www.postgresql.org/docs/current/index-functions.html
--
Peter Geoghegan
