Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers
From | Jesper Pedersen
Subject | Re: [HACKERS] Page Scan Mode in Hash Index
Msg-id | 0aaa9067-1e3c-868e-46fe-611a797ffd88@redhat.com
In response to | [HACKERS] Page Scan Mode in Hash Index (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses | Re: [HACKERS] Page Scan Mode in Hash Index
List | pgsql-hackers
Hi,

On 02/14/2017 12:27 AM, Ashutosh Sharma wrote:
> Currently, Hash Index scan works tuple-at-a-time, i.e. for every
> qualifying tuple in a page, it acquires and releases the lock, which
> eventually increases the lock/unlock traffic. For example, if an
> index page contains 100 qualified tuples, the current hash index scan
> has to acquire and release the lock 100 times to read those qualified
> tuples, which is not good from a performance perspective, and it also
> impacts concurrency with VACUUM.
>
> Considering the above points, I would like to propose a patch that
> allows hash index scans to work in page-at-a-time mode. In
> page-at-a-time mode, once a lock is acquired on a target bucket page,
> the entire page is scanned and all the qualified tuples are saved
> into the backend's local memory. This reduces the lock/unlock calls
> for retrieving tuples from a page. Moreover, it also eliminates the
> problem of re-finding the position of the last returned index tuple,
> and more importantly it allows VACUUM to release the lock on the
> current page before moving to the next page, which eventually
> improves its concurrency with scans.
>
> The attached patch modifies the hash index scan code for
> page-at-a-time mode. For better readability, I have split it into 3
> parts,
>

Due to the commits on master, these patches apply with hunks.

The README should be updated to mention the use of page scans.

hash.h needs pgindent.

> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
> patch rewrites the hash index scan module to work in page-at-a-time
> mode. It basically introduces two new functions -- _hash_readpage()
> and _hash_saveitem(). The former is used to load all the qualifying
> tuples from a target bucket or overflow page into an items array. The
> latter one is used by _hash_readpage() to save all the qualifying
> tuples found in a page into the items array. Apart from that, this
> patch basically cleans up _hash_first(), _hash_next() and
> hashgettuple().
>

For _hash_next() I don't see this - can you explain? (I have also put
a small sketch of how I read the _hash_readpage()/_hash_saveitem()
flow at the end of this mail.)

+ *
+ *    On failure exit (no more tuples), we release pin and set
+ *    so->currPos.buf to InvalidBuffer.

+ * Returns true if any matching items are found else returns false.

s/Returns/Return/g

> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
> this patch basically removes the redundant function _hash_step() and
> some of the unused members of the HashScanOpaqueData structure.
>

Looks good.

> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
> this patch basically improves the locking strategy for VACUUM in hash
> indexes. As the new hash index scan works page-at-a-time, vacuum can
> release the lock on the previous page before acquiring a lock on the
> next page, hence improving hash index concurrency.
>

+ * As the new hash index scan work in page at a time mode,

Remove 'new'. (A sketch of the lock hand-over, as I understand it, is
at the end of this mail as well.)

> I have also done the benchmarking of this patch and would like to
> share the results,
>
> Firstly, I did the benchmarking with non-unique values and I could
> see a performance improvement of 4-7%. For the detailed results,
> please see the attached file 'results-non-unique values-70ff';
> ddl.sql and test.sql are the test scripts used in this experiment.
> The details of the non-default GUC params and the pgbench command are
> mentioned in the result sheet. I also did the benchmarking with
> unique values at 300 and 1000 scale factors, and those results are
> provided in 'results-unique-values-default-ff'.
>

I'm seeing similar results, especially with write-heavy scenarios.
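Coming back to _hash_readpage()/_hash_saveitem(), here is the minimal
sketch I had in mind. It assumes the patch gives HashScanOpaqueData a
currPos member along the lines of btree's BTScanPosData, and the loop
helper name below is mine, not the patch's - so treat it as
illustrative only:

#include "postgres.h"

#include "access/hash.h"
#include "access/relscan.h"

/*
 * Illustrative: save one qualifying index tuple into the backend-local
 * items array, so the scan can return it later without holding the
 * page lock.
 */
static void
_hash_saveitem(HashScanOpaque so, int itemIndex,
               OffsetNumber offnum, IndexTuple itup)
{
    HashScanPosItem *currItem = &so->currPos.items[itemIndex];

    currItem->heapTid = itup->t_tid;    /* heap TID to hand back */
    currItem->indexOffset = offnum;     /* offset, for kill_prior_tuple */
}

/*
 * Illustrative (hypothetical helper name): the core of what I expect
 * _hash_readpage() does - walk the whole page once while the lock is
 * held, saving every qualifying tuple.
 */
static int
_hash_load_qualifying_items(IndexScanDesc scan, Page page)
{
    HashScanOpaque so = (HashScanOpaque) scan->opaque;
    OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
    OffsetNumber offnum;
    int         itemIndex = 0;

    for (offnum = FirstOffsetNumber;
         offnum <= maxoff;
         offnum = OffsetNumberNext(offnum))
    {
        IndexTuple  itup = (IndexTuple) PageGetItem(page,
                                          PageGetItemId(page, offnum));

        if (_hash_checkqual(scan, itup))
            _hash_saveitem(so, itemIndex++, offnum, itup);
    }

    return itemIndex;   /* number of matches saved; 0 means none */
}

If that matches the patch, hashgettuple() afterwards only has to step
through so->currPos without any buffer lock held, which is where the
lock/unlock traffic goes away.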
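And for 0003, the lock hand-over as I understand it becomes roughly
the following. Again an illustrative sketch, not the patch's code: the
function name is made up, the dead-tuple removal is elided, and the
primary-bucket-page handling is left out:

#include "postgres.h"

#include "access/hash.h"
#include "storage/bufmgr.h"

/*
 * Hypothetical excerpt of the bucket-cleanup loop: since page-at-a-time
 * scans no longer need the previous page to stay locked in order to
 * re-find their position, vacuum can drop the lock before stepping to
 * the next overflow page, instead of holding both at once.
 */
static void
vacuum_walk_bucket_chain(Relation rel, Buffer buf,
                         BufferAccessStrategy bstrategy)
{
    for (;;)
    {
        Page        page = BufferGetPage(buf);
        HashPageOpaque opaque = (HashPageOpaque) PageGetSpecialPointer(page);
        BlockNumber nextblkno = opaque->hasho_nextblkno;

        /* ... remove dead tuples from "page" here ... */

        /* release lock and pin *before* touching the next page */
        _hash_relbuf(rel, buf);

        if (!BlockNumberIsValid(nextblkno))
            break;

        /* only now acquire the lock on the next overflow page */
        buf = _hash_getbuf_with_strategy(rel, nextblkno, HASH_WRITE,
                                         LH_OVERFLOW_PAGE, bstrategy);
    }
}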
Best regards,
Jesper