some longer, larger pgbench tests with various performance-related patches - Mailing list pgsql-hackers
From: Robert Haas
Subject: some longer, larger pgbench tests with various performance-related patches
Date:
Msg-id: CA+Tgmobvif_ErSj7hWZ5xzLhDX_fGZbiqKt1EvPdLaHrj+p3Xw@mail.gmail.com
Responses: Re: some longer, larger pgbench tests with various performance-related patches (4 replies)
List: pgsql-hackers
Early yesterday morning, I was able to use Nate Boley's test machine to do a single 30-minute pgbench run at scale factor 300 using a variety of trees built with various patches, and with the -l option added to track latency on a per-transaction basis. All tests were done using 32 clients and permanent tables. The configuration was otherwise identical to that described here:

http://archives.postgresql.org/message-id/CA+TgmoboYJurJEOB22Wp9RECMSEYGNyHDVFv5yisvERqFw=6dw@mail.gmail.com

By doing this, I hoped to get a better understanding of (1) the effects of a scale factor too large to fit in shared_buffers, (2) what happens on a longer test run, and (3) how response time varies throughout the test.

First, here are the raw tps numbers:

background-clean-slru-v2: tps = 2027.282539 (including connections establishing)
buffreelistlock-reduction-v1: tps = 2625.155348 (including connections establishing)
buffreelistlock-reduction-v1-freelist-ok-v2: tps = 2468.638149 (including connections establishing)
freelist-ok-v2: tps = 2467.065010 (including connections establishing)
group-commit-2012-01-21: tps = 2205.128609 (including connections establishing)
master: tps = 2200.848350 (including connections establishing)
removebufmgrfreelist-v1: tps = 2679.453056 (including connections establishing)
xloginsert-scale-6: tps = 3675.312202 (including connections establishing)

Obviously these numbers are fairly noisy, especially since this is just one run, so the increases and decreases might not be all that meaningful. Time permitting, I'll try to run some more tests to get a better handle on that. Graphs are here:

http://wiki.postgresql.org/wiki/Robert_Haas_9.2CF4_Benchmark_Results

There are two graphs for each branch. The first is a scatter plot of per-transaction latency against elapsed test time. I found that graph hard to understand, though; I couldn't really tell what I was looking at. So I made a second set of graphs, which plot the number of transactions completed in each second of the test against time. Those are on the same page, below the latency graphs, and I find them much more informative.

A couple of things stand out at me from these graphs. First, some of these transactions had really long latency. Second, there are a remarkable number of seconds throughout the test during which no transactions at all manage to complete, sometimes several seconds in a row; I'm not sure why. Third, all of the tests start off processing transactions very quickly and then get slammed down very hard, probably because the very high rate of transaction processing early on causes a checkpoint to occur around 200 s. I didn't actually log when the checkpoints were occurring, but that seems like a good guess. It's also interesting to wonder whether the drop-off is caused by the checkpoint I/O itself or by the ensuing full-page writes. Fourth, xloginsert-scale-6 helps quite a bit; in fact, it's the only patch that actually changes the whole shape of the tps graph. I'm speculating here, but that may be because it blunts the impact of full-page writes by allowing backends to copy their full-page images into the write-ahead log in parallel.

One thing I also noticed while running the tests is that the system was really not using much CPU time; it was mostly idle. That could be because waiting for I/O leads to waiting for locks, or it could be fundamental lock contention. I don't know which.
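In case it's useful to anyone poking at the same data, the per-second graphs boil down to bucketing the -l output by the second in which each transaction completed. Here is a minimal sketch of that aggregation, assuming the usual six-field per-transaction log format (client_id transaction_no latency_us file_no time_epoch time_us) and log files named pgbench_log*; those details are assumptions for illustration, not necessarily exactly what was run:

#!/usr/bin/env python
# Sketch: count completed transactions per second from pgbench -l logs.
# Assumes the six-field line format:
#   client_id transaction_no latency_us file_no time_epoch time_us
# and log file names matching pgbench_log* (both assumptions).
import glob
import sys
from collections import Counter

counts = Counter()
paths = sys.argv[1:] or glob.glob("pgbench_log*")
for path in paths:
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 6:
                continue
            # fields[4] is the completion timestamp in whole seconds
            counts[int(fields[4])] += 1

if counts:
    start = min(counts)
    for sec in range(start, max(counts) + 1):
        # emit explicit zeroes so stalls (seconds in which nothing
        # completed) remain visible in the plot
        print("%d %d" % (sec - start, counts.get(sec, 0)))

The output is just two columns (elapsed second, transactions completed), which can be fed to whatever plotting tool you prefer.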
A couple of obvious further tests suggest themselves: (1) rerun some of the tests with full_page_writes=off, and (2) repeat this test with the remaining performance-related patches. It would be especially interesting, I think, to see what effect the checkpoint-related patches have on these graphs. But I plan to drop buffreelistlock-reduction-v1 and freelist-ok-v2 from future test runs based on Simon's comments elsewhere; I'm including the results here just because these tests were already running when he made those comments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company