Re: pgsql: Optimize btree insertions for common case of increasing values - Mailing list pgsql-committers
From | Peter Geoghegan |
---|---|
Subject | Re: pgsql: Optimize btree insertions for common case of increasing values |
Date | |
Msg-id | CAH2-WzkpJd5TP6uRCqa473mzZCNsVk5KjE3xrs-4=FfFiAMHog@mail.gmail.com Whole thread Raw |
In response to | Re: pgsql: Optimize btree insertions for common case of increasing values (Pavan Deolasee <pavan.deolasee@gmail.com>) |
Responses |
Re: pgsql: Optimize btree insertions for common case of increasing values
|
List | pgsql-committers |
On Thu, Apr 5, 2018 at 6:27 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote: > I came up with something like this: > > + /* > + * If we're here then a pagesplit is needed. We should never > reach here > + * if we're using the fastpath since we should have checked > for all the > + * required conditions, including the fact that this page > has enough > + * freespace. Note that this routine can in-theory deal with > the > + * situation where a NULL stack pointer is passed (that's > what would > + * happen if the fastpath is taken), like it does during > crash > + * recovery. But that path is much slower, defeating the > very purpose > + * of the optimisation. The following assertion should > protect us from > + * any future code changes that invalidate those > assumptions. > + * > + * Note the fact that whenever we fail to take the fastpath, > we clear > + * the cached block. So checking for a valid cached block at > this point > + * is enough to decide whether we're in a fastpath or not. > + */ > + Assert(!(P_ISLEAF(lpageop) && > + > BlockNumberIsValid(RelationGetTargetBlock(rel)))); > + This looks good to me. > But then I started thinking can _bt_insertonpg() be called from a code path > which doesn't start at _bt_doinsert()? I traced one such path > _bt_getstackbuf() -> _bt_finish_split() -> _bt_insert_parent(), but that can > only happen in case of a non-leaf page. The assertion which checks for a > leaf page, should be fine, right? Right. That's the whole reason for the P_ISLEAF() part of the assertion. Actually, since a page in an internal page can only happen as a direct consequence of a page split in one of its children, you don't even need the P_ISLEAF() part. I'd leave it in, though, since it makes it clearer which caller you're thinking about. >> Finally, I'd like to see a small paragraph in the nbtree README, about >> the high level theory behind this optimization and page recycling. The >> assumption is that there can only be one non-ignorable leaf rightmost >> page, and so even a RecentGlobalXmin style interlock isn't required. >> We cannot fail to detect that our hint was invalidated, because there >> can only be one such page in the B-Tree at any time. It's possible >> that the page will be deleted and recycled without a backend's cached >> page also being detected as invalidated, but only when we happen to >> recycle a page that once again becomes the rightmost leaf page. >> > > Can I borrow that almost verbatim, after adding details about the > optimisation itself? Looks like I'm too tired to think sensibly. I know the feeling. I think you can take that wording almost verbatim. Obviously it should refer to the optimization by name, and blend into the surrounding text in the README. I suggest putting a small section before "On-the-Fly Deletion Of Index Tuples", but after the main discussion of deletion + recycling. It's essentially an exception to the general rule, so that placement makes sense to me. Thanks -- Peter Geoghegan
pgsql-committers by date: