Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Msg-id: CA+TgmoYp5FUeWuy7dZ-mx6LCxFYA=q9tc_UJysT4+ES34sGNYw@mail.gmail.com
In response to: Re: B-Tree support function number 3 (strxfrm() optimization) (Peter Geoghegan <pg@heroku.com>)
Responses: Re: B-Tree support function number 3 (strxfrm() optimization)
List: pgsql-hackers
On Mon, Apr 7, 2014 at 2:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Apr 7, 2014 at 10:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> To throw out one more point that I think is problematic, Peter's
>> original email on this thread gives a bunch of examples of strxfrm()
>> normalization that all differ in the first few bytes - but so do
>> the underlying strings. I *think* (but don't have time to check right
>> now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
>> junk and then 8 bytes per character in the input string - so comparing
>> the first 8 bytes of the strxfrm()'d representation would amount to
>> comparing part of the first byte. If for any reason the first byte is
>> the same (or similar enough) on many of the input strings, then this
>> will probably work out to be slower rather than faster. Even if other
>> platforms are more space-efficient (and I think at least some of them
>> are), I think it's unlikely that this optimization will ever pay off
>> for strings that don't differ in the first 8 bytes.
>
> Why would any platform have header bytes in the resulting binary
> strings? That doesn't make any sense. Are you sure you aren't thinking
> of the homogeneous trailing bytes that you can also see in my example?

No, I'm not sure of that at all. I haven't looked at this topic in a
while, but I'm happy to budget some time to do so - for the June
CommitFest.

> The only case that this patch could possibly regress is where there
> are strings that differ beyond about the first 8 bytes, but are not
> identical (we chance a memcmp() == 0 before doing a full strcoll()
> when tie-breaking on the semi-reliable initial comparison). We still
> always avoid fmgr overhead (and shim overhead, which I've measured),
> as in your original patch - you put that at adding 7% at the time,
> which is likely to make up for otherwise-regressed cases. There is
> nothing at all contrived about my test case.
It's not a question of whether your test case is contrived. Your test case can be (and likely is) extremely realistic and still not account for other cases where the patch regresses performance. If I understand correctly, and I may not because I wasn't planning to spend time on this patch until the next CommitFest, the patch basically uses the bytes available in datum1 to cache what you refer to as a normalized or poor man's sort key which, it is hoped, will break most ties. However, on any workload where it fails to break ties, you're going to incur the additional overheads of (1) comparing the poor man's sort key, (2) memcmping the strings (based on what you just said), and then (3) digging the correct datum back out of the tuple. I note that somebody thought #3 was important enough to be worth creating datum1 for in the first place, so I don't think it can or should be assumed that undoing that optimization in certain cases will turn out to be cheap enough not to matter.

In short, I don't see any evidence that you've made an attempt to construct a worst-case scenario for this patch, and that's a basic part of performance testing. I had to endure having Andres beat the snot out of me over cases where the MVCC snapshot patch regressed performance, and as painful as that was, it led to a better patch. If I'd committed the first thing that did well on my own tests, which *did* include attempts (much less successful than Andres's) to find regressions, we'd be significantly worse off today than we are.

> You have to have an awfully large number of significantly similar but
> not identical strings in order to possibly lose out.
> Even if you have
> such a case, and the fmgr-trampoline-elision doesn't make up for it
> (doesn't make up for having to do a separate heapattr lookup on the
> minimal tuple, an optimization not too relevant for pass-by-reference
> types), which is quite a stretch, it seems likely that you have other
> cases that do benefit, which in aggregate makes up for it. The
> benefits I've shown, on the first test case I picked, are absolutely
> enormous.

Testing the cases where your patch wins and hand-waving that the losses won't be that bad in other cases - without actually testing - is not the right methodology. And I think it would be quite a large error to assume that tables never contain large numbers of similar but not identical strings. I bet there are many people who have just that.

That's also quite an inaccurate depiction of the cases where this will regress things; a bunch of strings that share the first few characters might not be similar in any normal human sense of that term, while still slipping through the proposed sieve. In your OP, a six-byte string blew up to 20 bytes after being strxfrm()'d, which means that you're only comparing the first 2-3 bytes. On a platform where Datums are only 4 bytes, you're only comparing the first 1-2 bytes. Arguing that nobody anywhere has a table where not even the first character or two are identical across most or all of the strings in the column is ridiculous.

> Now, let's assume that I'm wrong about all this, and that in fact
> there is a plausible case where all of those tricks don't work out,
> and someone has a complaint about a regression. What are we going to
> do about that? Not accelerate text sorting by at least a factor of 3
> for the benefit of some very narrow use-case? That's the only measure
> I can see that you could take to not regress that case.
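To make the cost model being argued about concrete, here is a hedged sketch (illustrative names, not the patch's actual identifiers) of the comparison sequence described upthread: compare the abbreviated keys cached in datum1 first, then chance a memcmp() equality check on a tie, and only then fall back to the authoritative strcoll():

```c
/* Illustrative sketch of the tie-break sequence under discussion;
 * "Datum" here is just a stand-in for PostgreSQL's pass-by-value slot. */
#include <string.h>

typedef unsigned long Datum;

int
abbreviated_cmp(Datum abbrev_a, Datum abbrev_b,
                const char *full_a, const char *full_b)
{
    /* Step 1: cheap comparison of the cached poor man's sort keys. */
    if (abbrev_a < abbrev_b)
        return -1;
    if (abbrev_a > abbrev_b)
        return 1;

    /* Step 2: abbreviated keys tied; chance memcmp() == 0 to detect
     * true equality without paying for a full collation comparison. */
    size_t la = strlen(full_a);
    size_t lb = strlen(full_b);
    if (la == lb && memcmp(full_a, full_b, la) == 0)
        return 0;

    /* Step 3: genuinely distinct strings agreeing on the prefix:
     * the authoritative, expensive tie-break. */
    return strcoll(full_a, full_b);
}
```

The regression scenario is precisely when control reaches step 3: steps 1 and 2 (plus re-fetching the full string from the tuple) are pure overhead on top of the comparison the unpatched code would have performed anyway.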
The appropriate time to have a discussion about whether the patch wins by a large enough margin in enough use cases to justify possible regressions in cases where it loses is after we have made a real effort to characterize the winning and losing cases, mitigate the losing cases as far as possible, and benchmarked the results. I think it's far from a given even that your patch will win in more cases than it loses, or that the regressions when it loses will be small. That, of course, has yet to be proven, but so does the contrary.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company