Re: Need more reviewers! - Mailing list pgsql-hackers
From | Kenneth Marshall
---|---
Subject | Re: Need more reviewers!
Date |
Msg-id | 20080904184817.GG16619@it.is.rice.edu
In response to | Re: Need more reviewers! (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: Need more reviewers!
List | pgsql-hackers
On Thu, Sep 04, 2008 at 02:01:18PM -0400, Tom Lane wrote:
> "Jonah H. Harris" <jonah.harris@gmail.com> writes:
> > I'll push forward on reviewing and testing Xiao's hash index
> > improvements for inclusion into core. Though, someone will still need
> > to review my stuff.
>
> I think what the hash index patch really needs is some performance
> testing. I'm willing to take responsibility for the code being okay
> or not, but I haven't got any production-grade hardware to do realistic
> performance tests on. I'd like to see a few more scenarios tested than
> the one provided so far: in particular, wide vs narrow index keys and
> good vs bad key distributions.

Tom,

Do you mean good vs. bad key distributions with respect to hash value
collisions?

> It strikes me that there's an intermediate posture that the patch
> doesn't provide: namely, store *both* hash code and original index key
> in each index entry. This would make the index larger rather than
> smaller, but you could still do the intra-page sort by hash code,
> which I think is likely a big win. In exchange for the bigger index
> you'd avoid fruitless trips to the heap for chance hashcode matches.
> So it seems far from clear to me that the hash-code-only solution is
> necessarily the best.

Currently, since a trip to the heap is required to validate any tuple
even when the exact value is contained in the index, it does not seem
like a win to store the value in both places. One of the big
improvements is the space efficiency of the index obtained by storing
only the hash value. I think that increasing the number of hash bits
stored would provide more bang for the buck than storing the exact
value. One easy way to provide this functionality without increasing
the storage requirements is to take advantage of the known bucket
number to increase the number of bits: at 256 buckets, remove the first
8 bits of the hash and add 8 new bits, providing a 40-bit hash in the
same storage. At 64k buckets you can store a 48-bit hash in the same
space, and at 16m buckets a 56-bit hash in the original 32-bit space.
So as the number of buckets increases, the number of hash bits
increases, and with it the specificity of the resulting hash, thereby
reducing the need to visit the heap tuple store.

Another idea takes advantage of a smaller "bucket" size (I think of
them as sub-buckets) to add an additional 8 bits of hash value at 32m
or 64m buckets by looking up the hash in one of 2^n sub-buckets, where
n = 64 or 128 or even 256. This has the advantage of eliminating the
binary lookup and insertion within the bucket page. Again, this will
need to be tested.

> If anyone is willing to do comparative performance testing, I'll
> volunteer to make up two variant patches that do it both ways and
> are otherwise equivalent.

I am in the same boat, in that I do not have database systems with
enough hardware to perform realistic testing.

> (I wouldn't be too surprised if testing shows that each way could be
> a win depending on the situation. In that case we could think about
> how to provide a switch to let users pick the method. But for the
> moment let's just test two separate patches.)

I think, though it is as yet untested, that storing both would be of
value when we do not have to visit the heap for visibility
information; otherwise, using the space to store more hash bits would
be more effective.

Cheers,
Ken

> regards, tom lane
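The arithmetic behind the bucket-number trick above can be sketched as follows. This is a hypothetical illustration (the helper names are mine, and PostgreSQL's actual hash-to-bucket mapping during bucket splits is more involved): because the low-order bits of the full hash determine the bucket, they need not be stored in the index entry, so a fixed 32-bit slot can represent 32 + log2(num_buckets) effective hash bits.

```python
def effective_hash_bits(num_buckets, stored_bits=32):
    # Bits implied by the bucket number (log2 of a power-of-two count)
    # plus the bits actually stored in the index entry.
    implied = num_buckets.bit_length() - 1
    return stored_bits + implied

def split_hash(full_hash, num_buckets):
    # Low-order bits select the bucket; the remainder goes in the entry.
    bucket = full_hash % num_buckets
    stored = full_hash // num_buckets
    return bucket, stored

def reconstruct(bucket, stored, num_buckets):
    # Recombine the stored bits with the implied bucket bits.
    return stored * num_buckets + bucket

# The figures from the message: 256 buckets -> 40 bits,
# 64k buckets -> 48 bits, 16m buckets -> 56 bits.
assert effective_hash_bits(256) == 40
assert effective_hash_bits(65536) == 48
assert effective_hash_bits(16 * 1024 * 1024) == 56
```

With 256 buckets, a 40-bit hash such as `0xDEADBEEF12` splits into an 8-bit bucket number and a 32-bit stored value, and the two recombine losslessly.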