Re: btree: downlink right separator/HIKEY optimization - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: btree: downlink right separator/HIKEY optimization |
Date | |
Msg-id | ed1ad315-1a23-4f85-aa4b-c708e501df19@iki.fi Whole thread Raw |
In response to | btree: downlink right separator/HIKEY optimization (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: btree: downlink right separator/HIKEY optimization
|
List | pgsql-hackers |
On 01/11/2023 00:08, Matthias van de Meent wrote: > calling _bt_compare in _bt_moveright in many common cases. This patch, > when stacked on top of the prefix truncation patch, improves INSERT > performance by an additional 2-9%pt, with an extreme case of 45% in > the worscase index tests at [0]. > > The optimization is that we now recognze that our page split algorithm > all but guarantees that the HIKEY matches this page's downlink's right > separator key bytewise, excluding the data stored in the > IndexTupleData struct. Good observation. > By caching the right separator index tuple in _bt_search, we can > compare the downlink's right separator and the HIKEY, and when they > are equal (memcmp() == 0) we don't have to call _bt_compare - the > HIKEY is known to be larger than the scan key, because our key is > smaller than the right separator, and thus transitively also smaller > than the HIKEY because it contains the same data. As _bt_compare can > call expensive user-provided functions, this can be a large > performance boon, especially when there are only a small number of > column getting compared on each page (e.g. index tuples of many 100s > of bytes, or dynamic prefix truncation is enabled). What would be the worst case scenario for this? One situation where the memcmp() would not match is when there is a concurrent page split. I think it's OK to pessimize that case. Are there any other situations? When the memcmp() matches, I think this is almost certainly not slower than calling the datatype's comparison function. > if (offnum < PageGetMaxOffsetNumber(page)) > { > ItemId rightsepitem = PageGetItemId(page, offnum + 1); > IndexTuple pagerightsep = (IndexTuple) PageGetItem(page, rightsepitem); > memcpy(rsepbuf.data, pagerightsep, ItemIdGetLength(rightsepitem)); > rightsep = &rsepbuf.tuple; > } > else if (!P_RIGHTMOST(opaque)) > { > /* > * The rightmost data tuple on inner page has P_HIKEY as its right > * separator. > */ > ItemId rightsepitem = PageGetItemId(page, P_HIKEY); > IndexTuple pagerightsep = (IndexTuple) PageGetItem(page, rightsepitem); > memcpy(rsepbuf.data, pagerightsep, ItemIdGetLength(rightsepitem)); > rightsep = &rsepbuf.tuple; > } This could use a one-line comment above this, something like "Remember the right separator of the downlink we follow, to speed up the next _bt_moveright call". Should there be an "else rightsep = NULL;" here? Is it possible that we follow the non-rightmost downlink on a higher level and rightmost downlink on next level? Concurrent page deletion? Please update the comment above _bt_moveright to describe the new argument. Perhaps the text from README should go there, this feels like a detail specific to _bt_search and _bt_moveright. -- Heikki Linnakangas Neon (https://neon.tech)
pgsql-hackers by date: