Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location - Mailing list pgsql-hackers
From | Amit Kapila
---|---
Subject | Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location
Date |
Msg-id | CAA4eK1JKm=bRZf4WcF9asejEGMi5kQaTHbqHALorVAX3M9w64Q@mail.gmail.com
In response to | [WIP] [B-Tree] Keep indexes sorted by heap physical location (Claudio Freire <klaussfreire@gmail.com>)
Responses | Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location
List | pgsql-hackers
On Thu, Aug 18, 2016 at 8:24 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
>
> A couple of points make me uneasy about this patch, yet I can think of
> no better alternative, so I seek feedback:
>
> - introducing a different format for inner index tuples makes for an
> invasive patch and quite difficult-to-reason code (it's easy to forget
> whether a page is leaf or inner and that now causes assertion failures
> or segfaults)
> - the ascent-descent to check for uniqueness when there are large
> dead tuple runs could have locking subtleties that escape me. Perhaps
> there's better ways to solve this.

I have tried to study this part of your patch, and it seems to be a
somewhat non-trivial and risky part of the proposal.

+ } else {
+ /*
+ * We found the actual first item on another block, so we have to perform
+ * a two-step search - first half, until our write-locked buffer, then another
+ * starting from our write-locked buffer.
+ */
+ LockBuffer(buf, BUFFER_LOCK_UNLOCK);
+ LockBuffer(buf, BT_WRITE);
+
+ buf = _bt_moveright(rel, buf, natts, itup_scankey, &(itup->t_tid), false,
+ true, stack, BT_WRITE, NULL);
+
+ first_offset = _bt_binsrch(rel, buf, natts, itup_scankey, NULL, false, NULL);
+
+ xwait = _bt_check_unique(rel, itup, heapRel, nbuf, buf, first_offset, itup_scankey,
+ checkUnique, &is_unique, &speculativeToken);
+
+ _bt_relbuf(rel, nbuf);
+ }

The idea behind the existing uniqueness check is that we hold a write
lock on the buffer/page on which we are trying to operate (we take only
read locks on the consecutive pages during this check). Here, in your
changed proposal, you have two buffers (the one to which the search led
you and the previous buffer in the chain), and before checking
uniqueness you seem to hold a write lock on one of the buffers and a
read lock on the other. This changes the way the uniqueness check has
worked until now; can you explain how it works (in particular, can we
safely assume that all searches for the uniqueness check will lead to
the same buffer first)?

With this new mechanism, will we have two types of search interface,
one that works on keys (as now) and another on key-ctid, or will there
be a single interface that works both ways? I think there are pros and
cons either way.

Also, in the thread below you talk about using the last bit in t_info.
I want to bring to your notice that there is a patch of mine [1] in
which I have used that bit to preserve on-disk compatibility of hash
indexes. I am not saying that we should not plan it for some other use,
just that we should keep in mind that there are other proposals to use
it. We can decide the best way to proceed once we are aware of all the
proposals that want to use it.

[1] - https://www.postgresql.org/message-id/CAA4eK1LkQ_Udism-Z2Dq6cUvjH3dB5FNFNnEzZBPsRjw0haFqA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
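
For readers following the locking discussion above, the following is a
toy, self-contained model of the pattern being described: the insertion
target page stays write-locked for the whole uniqueness check while the
pages to its right are visited one at a time under read locks. It is not
PostgreSQL code; page_t, the lock usage via pthread rwlocks, and
walk_right_checking_unique() are invented for illustration only, and the
real code keeps the write lock on the target page until the insertion
itself is done rather than releasing it at the end of the check.

/*
 * Toy model of the nbtree uniqueness-check locking pattern discussed
 * above.  NOT PostgreSQL code: all names here are invented for this
 * sketch.  Compile with: cc sketch.c -lpthread
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NPAGES 4

typedef struct page_t
{
    pthread_rwlock_t lock;
    int              next;          /* index of right sibling, -1 if none */
    bool             has_duplicate; /* pretend a matching key lives here  */
} page_t;

static page_t pages[NPAGES];

static bool
walk_right_checking_unique(int target)
{
    bool found = false;
    int  cur;

    /* Write-lock the insertion target page; it stays locked throughout. */
    pthread_rwlock_wrlock(&pages[target].lock);

    if (pages[target].has_duplicate)
        found = true;

    /* Visit right siblings one at a time under read locks only. */
    for (cur = pages[target].next; cur != -1 && !found; cur = pages[cur].next)
    {
        pthread_rwlock_rdlock(&pages[cur].lock);
        if (pages[cur].has_duplicate)
            found = true;
        pthread_rwlock_unlock(&pages[cur].lock);
    }

    /* Only now is the write lock on the target page given up. */
    pthread_rwlock_unlock(&pages[target].lock);
    return found;
}

int
main(void)
{
    for (int i = 0; i < NPAGES; i++)
    {
        pthread_rwlock_init(&pages[i].lock, NULL);
        pages[i].next = (i + 1 < NPAGES) ? i + 1 : -1;
        pages[i].has_duplicate = false;
    }
    pages[2].has_duplicate = true;   /* plant a "duplicate" two pages right */

    printf("duplicate found: %s\n",
           walk_right_checking_unique(0) ? "yes" : "no");
    return 0;
}

The question raised in the mail is what happens to this single-page
write-lock invariant once the check may begin on a buffer other than the
one the insertion will target.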
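
On the key vs. key-ctid question, the sketch below (again not PostgreSQL
code; tid_t, index_entry_t and both comparators are simplified
assumptions) illustrates what treating the heap TID as a trailing,
tie-breaking column means for a search interface: entries with equal
keys stop comparing as equal once the heap location participates in the
comparison, which is what makes a key-ctid search well defined.

/*
 * Toy comparison sketch, invented for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

typedef struct
{
    uint32_t block;   /* heap block number */
    uint16_t offset;  /* line pointer slot */
} tid_t;

typedef struct
{
    int   key;        /* stand-in for the indexed key column(s) */
    tid_t tid;        /* heap location of the tuple              */
} index_entry_t;

/* Key-only comparison: ties between equal keys are left unresolved. */
static int
cmp_key(const index_entry_t *a, const index_entry_t *b)
{
    return (a->key > b->key) - (a->key < b->key);
}

/* Key-ctid comparison: the heap TID breaks ties among equal keys. */
static int
cmp_key_ctid(const index_entry_t *a, const index_entry_t *b)
{
    int c = cmp_key(a, b);

    if (c != 0)
        return c;
    if (a->tid.block != b->tid.block)
        return (a->tid.block > b->tid.block) ? 1 : -1;
    return (a->tid.offset > b->tid.offset) - (a->tid.offset < b->tid.offset);
}

int
main(void)
{
    index_entry_t x = {42, {10, 3}};
    index_entry_t y = {42, {10, 7}};

    printf("key-only: %d\n", cmp_key(&x, &y));       /* 0: a tie          */
    printf("key-ctid: %d\n", cmp_key_ctid(&x, &y));  /* -1: x sorts first */
    return 0;
}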