Re: [HACKERS] Cache Hash Index meta page. - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Cache Hash Index meta page. |
Date | |
Msg-id | CAA4eK1+K0q2_Di8CgDyBmcAzKiB38XhQfN6Sq-KArVoXQL7Lfg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Cache Hash Index meta page. (Mithun Cy <mithun.cy@enterprisedb.com>) |
Responses |
Re: [HACKERS] Cache Hash Index meta page.
|
List | pgsql-hackers |
On Fri, Jan 13, 2017 at 9:58 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote: > On Fri, Jan 6, 2017 at 11:43 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: Below are review comments on latest version of patch. 1. /* - * Read the metapage to fetch original bucket and tuple counts. Also, we - * keep a copy of the last-seen metapage so that we can use its - * hashm_spares[] values to compute bucket page addresses. This is a bit - * hokey but perfectly safe, since the interesting entries in the spares - * array cannot change under us; and it beats rereading the metapage for - * each bucket. + * update and get the metapage cache data. */ - metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ, LH_META_PAGE); - metap = HashPageGetMeta(BufferGetPage(metabuf)); - orig_maxbucket = metap->hashm_maxbucket; - orig_ntuples = metap->hashm_ntuples; - memcpy(&local_metapage, metap, sizeof(local_metapage)); - /* release the lock, but keep pin */ - LockBuffer(metabuf, BUFFER_LOCK_UNLOCK); + cachedmetap = _hash_getcachedmetap(rel, true); + orig_maxbucket = cachedmetap->hashm_maxbucket; + orig_ntuples = cachedmetap->hashm_ntuples; (a) I think you can retain the previous comment or modify it slightly. Just removing the whole comment and replacing it with a single line seems like a step backward. (b) Another somewhat bigger problem is that with this new change it won't retain the pin on meta page till the end which means we might need to perform an I/O again during operation to fetch the meta page. AFAICS, you have just changed it so that you can call new API _hash_getcachedmetap, if that's true, then I think you have to find some other way of doing it. BTW, why can't you design your new API such that it always take pinned metapage? You can always release the pin in the caller if required. I understand that you don't always need a metapage in that API, but I think the current usage of that API is also not that good. 2. + if (bucket_opaque->hasho_prevblkno != InvalidBlockNumber || + bucket_opaque->hasho_prevblkno > cachedmetap->hashm_maxbucket) + cachedmetap = _hash_getcachedmetap(rel, true); I don't understand the meaning of above if check. It seems like you will update the metapage when previous block number is not a valid block number which will be true at the first split. How will you ensure that there is a re-split and cached metapage is not relevant. I think if there is && in the above condition, then we can ensure it. 3. + Given a hashkey get the target bucket page with read lock, using cached + metapage. The getbucketbuf_from_hashkey method below explains the same. + All the sentences in algorithm start with small letters, then why do you need an exception for this sentence. I think you don't need to add an empty line. Also, I think the usage of getbucketbuf_from_hashkey seems out of place. How about writing it as: The usage of cached metapage is explained later. 4. + If target bucket is split before metapage data was cached then we are + done. + Else first release the bucket page and then update the metapage cache + with latest metapage data. I think it is better to retain original text of readme and add about meta page update. 5. + Loop: .. .. + Loop again to reach the new target bucket. No need to write "Loop again ..", that is implicit. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: