Re: Progress on fast path sorting, btree index creation time - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Progress on fast path sorting, btree index creation time
Date:
Msg-id: CAEYLb_UpgkSDs+tNqp_TOmZR04TxpEpqGGupt5E3333D6m53Ag@mail.gmail.com
In response to: Re: Progress on fast path sorting, btree index creation time (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Progress on fast path sorting, btree index creation time; Re: Progress on fast path sorting, btree index creation time
List: pgsql-hackers
On 6 January 2012 21:47, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jan 6, 2012 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Admittedly, I don't have any numbers quantifying just how useful that
>> might be, but on the other hand you've not presented any evidence
>> justifying removing the behavior either. If we believe your position
>> that indexes don't generally have lots of duplicates, then the code in
>> question will seldom be reached and therefore there would be no
>> performance benefit in removing it.

I have decided on a tactical retreat in relation to this patch. This has been dragging on since September, and I cannot risk the patch not being accepted for 9.2 because I tried to handle both heap and btree tuple sorting at once - the additional, relatively modest improvements that it would bring to btree index creation time in common cases do not warrant digging my heels in to cover that case in one larger commit. For that reason, I attach for your consideration a revision of the patch without any support for btree index tuples whatsoever (though I have adjusted btree tuplesort comments, and one tiny piece of code, in a non-controversial way). I'll revisit btree sorting towards the end of the commitfest, circumstances permitting.

As to the question of binary bloat, I have devised a pgbench-tools workload that I think will go some way towards reassuring those who are concerned about its distributed costs. The attached spreadsheet has all the relevant details. A custom script has been specified (the details of which are evident from the benchmark results themselves). If we experience some kind of noticeable marginalisation of usefully cached instructions, that's obviously where it will show up.
Note that I have taken the preparatory step of updating some tables with random data, rather than the sequential data that they contain by default, which, unusually for such large tables, would allow us to avail of our pre-sorted input check optimisation, spoiling things. I ran vacuumdb immediately afterwards, immediately before the pgbench run in each case. I've kept the test at 8 clients/8 threads, on a machine with 4 physical cores plus hyperthreading for 8 logical cores ("Intel Core(TM) i7 CPU 870 @ 2.93GHz", with 4 x 32 KB instruction caches, among several other CPU caches), on a newly-initialised, throw-away database: a single run at scale 50 for 15 minutes in all cases. This is the same server that I have used throughout.

My analysis is that although there may be a very slight regression in non-affected queries (it's a tough call to make - how queries happen to coincide undoubtedly muddies the waters, as does the fact that we cut lots of time from periods in which backends hold a lot of locally allocated sort memory; it might actually be a win for those other queries sometimes but not others, and it appears that the margin either way is very small), that is more than compensated for by the benefit of the specialisations. The CPU cache is doing its job as expected, and we clearly see a net benefit from its preference for caching more of the instructions that are specialised - those sort operations are far more expensive (if they weren't, it wouldn't cache them so heavily). I also include results for running the same query again and again with a single client, to put that effect in context (though note that previous benchmarks avoided paying a high client overhead by using explain analyze, and indeed originally compared against pre-SortSupport Postgres, so these numbers aren't as good either). It's unusual for a database workload to be so heavily CPU bound, so I'd suggest that the latter benchmark is more representative than the former. Either way, we win by some margin.
If the queries didn't have such a high client overhead - as, for example, with a sort node that feeds a merge join - we'd do better still. If a sorting specialisation is never used, its overhead, for practical purposes, is zero. I believe that sorting specialisations are just too useful not to be a net win in almost all reasonable cases.

Why haven't I used all specialisations at once, rather than only two (one for a single int4 key, the other for multiple keys)? Well, I might have used all of them, but there were no floats available in the pgbench tables, and besides, the chances of all of them being simultaneously in play during any period short enough for marginalisation of CPU cache contents to be of particular concern are, in general, not all that great.

A major systems programming language, C++, produces multiple specialisations as its standard library's sort function is used, each of which will have separate entries in the procedure linkage table (though various implementation-defined techniques are frequently used to reduce the number of copies across translation units at link time). If the programmer specifies either a different datatype or a different comparator (through the use of an alternative to the default std::less<T> functor/predicate), a whole new specialisation is generated by the compiler. This does raise concerns in relation to binary bloat, but they're reasonably well understood, and this is the default way of doing things in general-purpose application development. Now, I know that Postgres isn't written in C++, and you might not find "C++ does it" to be a particularly compelling argument, but it does go to show that these ideas are not by any stretch of the imagination radical.
I remind everyone that the improvement seen in raw qsort_arg runtime is fairly large, as it would have to be in order to bring such large (though proportionally smaller) improvements to each affected query as a whole - there are of course many other sources of overhead involved in parsing, planning and executing affected queries.

Here is a complete detailing of the specialisations that this latest revision produces:

* int4, single key (supports date too)
* int8, multiple keys (supports timestamps too, with and without TZ, where we HAVE_INT64_TIMESTAMP)
* float4, single key
* float8, multiple keys
* Type-generic single key specialisation. Expected to deliver some of the same additional benefits for types that do not merit their own specialisations but would still benefit, like name, int2, etc., but used all the time where applicable, even for types like text, for which it is expected that there will be no appreciable benefit.

Thoughts?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services