Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers
From | Ashutosh Sharma |
---|---|
Subject | Re: [HACKERS] Page Scan Mode in Hash Index |
Date | |
Msg-id | CAE9k0P=JzsP07sz+-1Ujnzfxn96JhEW+6dq4-HXsKa906CE9qw@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Page Scan Mode in Hash Index (Ashutosh Sharma <ashu.coek88@gmail.com>) |
Responses |
Re: [HACKERS] Page Scan Mode in Hash Index
|
List | pgsql-hackers |
While doing the code coverage testing of v7 patch shared with - [1], I found that there are few lines of code in _hash_next() that are redundant and needs to be removed. I basically came to know this while testing the scenario where a hash index scan starts when a split is in progress. I have removed those lines and attached is the newer version of patch. [1] - https://www.postgresql.org/message-id/CAE9k0Pmn92Le0iOTU5b6087eRXzUnkq1PKcihxtokHJ9boXCBg%40mail.gmail.com On Mon, May 8, 2017 at 6:55 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > Hi, > > On Tue, Apr 4, 2017 at 3:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >>> >>> Please note that these patches needs to be applied on top of [1]. >>> >> >> Few more review comments: >> >> 1. >> - page = BufferGetPage(so->hashso_curbuf); >> + blkno = so->currPos.currPage; >> + if (so->hashso_bucket_buf == so->currPos.buf) >> + { >> + buf = so->currPos.buf; >> + LockBuffer(buf, BUFFER_LOCK_SHARE); >> + page = BufferGetPage(buf); >> + } >> >> Here, you are assuming that only bucket page can be pinned, but I >> think that assumption is not right. When _hash_kill_items() is called >> before moving to next page, there could be a pin on the overflow page. >> You need some logic to check if the buffer is pinned, then just lock >> it. I think once you do that, it might not be convinient to release >> the pin at the end of this function. > > Yes, there are few cases where we might have pin on overflow pages too > and we need to handle such cases in _hash_kill_items. I have taken > care of this in the attached v7 patch. Thanks. > >> >> Add some comments on top of _hash_kill_items to explain the new >> changes or say some thing like "See _bt_killitems for details" > > Added some more comments on the new changes. > >> >> 2. >> + /* >> + * We save the LSN of the page as we read it, so that we know whether it >> + * safe to apply LP_DEAD hints to the page later. This allows us to drop >> + * the pin for MVCC scans, which allows vacuum to avoid blocking. >> + */ >> + so->currPos.lsn = PageGetLSN(page); >> + >> >> The second part of above comment doesn't make sense because vacuum's >> will still be blocked because we hold the pin on primary bucket page. > > That's right. It doesn't make sense because we won't allow vacuum to > start. I have removed it. > >> >> 3. >> + { >> + /* >> + * No more matching tuples were found. return FALSE >> + * indicating the same. Also, remember the prev and >> + * next block number so that if fetching tuples using >> + * cursor we remember the page from where to start the >> + * scan. >> + */ >> + so->currPos.prevPage = (opaque)->hasho_prevblkno; >> + so->currPos.nextPage = (opaque)->hasho_nextblkno; >> >> You can't read opaque without having lock and by this time it has >> already been released. > > I have corrected it. Please refer to the attached v7 patch. > > Also, I think if you want to save position for >> cursor movement, then you need to save the position of last bucket >> when scan completes the overflow chain, however as you have written it >> will be always invalid block number. I think there is similar problem >> with prevblock number. > > Did you mean last bucket or last page. If it is last page, then I am > already storing it. > >> >> 4. >> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum, >> + OffsetNumber maxoff, ScanDirection dir) >> +{ >> + HashScanOpaque so = (HashScanOpaque) scan->opaque; >> + IndexTuple itup; >> + int itemIndex; >> + >> + if (ScanDirectionIsForward(dir)) >> + { >> + /* load items[] in ascending order */ >> + itemIndex = 0; >> + >> + /* new page, relocate starting position by binary search */ >> + offnum = _hash_binsearch(page, so->hashso_sk_hash); >> >> What is the need to find offset number when this function already has >> that as an input parameter? > > It's not required. I have removed it. > >> >> 5. >> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum, >> + OffsetNumber maxoff, ScanDirection dir) >> >> I think maxoff is not required to be passed a parameter to this >> function as you need it only for forward scan. You can compute it >> locally. > > It is required for both forward and backward scan. However, I am not > passing it to _hash_load_qualified_items() now. > >> >> 6. >> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum, >> + OffsetNumber maxoff, ScanDirection dir) >> { >> .. >> + if (ScanDirectionIsForward(dir)) >> + { >> .. >> + while (offnum <= maxoff) >> { >> .. >> + if (so->hashso_sk_hash == _hash_get_indextuple_hashkey(itup) && >> + _hash_checkqual(scan, itup)) >> + { >> + /* tuple is qualified, so remember it */ >> + _hash_saveitem(so, itemIndex, offnum, itup); >> + itemIndex++; >> + } >> + >> + offnum = OffsetNumberNext(offnum); >> .. >> >> Why are you traversing the whole page even when there is no match? >> There is a similar problem in backward scan. I think this is blunder. > > Fixed. Please check the attached > '0001-Rewrite-hash-index-scans-to-work-a-page-at-a-timev7.patch' > >> >> 7. >> + if (so->currPos.buf == so->hashso_bucket_buf || >> + so->currPos.buf == so->hashso_split_bucket_buf) >> + { >> + so->currPos.prevPage = InvalidBlockNumber; >> + LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK); >> + } >> + else >> + { >> + so->currPos.prevPage = (opaque)->hasho_prevblkno; >> + _hash_relbuf(rel, so->currPos.buf); >> + } >> + >> + so->currPos.nextPage = (opaque)->hasho_nextblkno; >> >> What makes you think it is safe read opaque after releasing the lock? > > Nothing makes me think to read opaque after releasing lock. It's a > mistake. I have corrected it. Please check attached v7 patch. > > -- > With Regards, > Ashutosh Sharma > EnterpriseDB:http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: