Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench) - Mailing list pgsql-hackers

From           | Peter Geoghegan
Subject        | Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)
Msg-id         | CAH2-Wzm89Byt95e3f6pVdzRB0zja3VVGUDpbxzm=eWyMkutWWQ@mail.gmail.com
In response to | Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench) (Peter Geoghegan <pg@bowt.ie>)
Responses      | Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)
List           | pgsql-hackers
On Mon, Jul 31, 2017 at 10:54 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> Let's wait to see what difference it makes if Alik's zipfian
> distribution pgbench test case uses unlogged tables. That may give us a
> good sense of the problem for cases with contention/concurrency.

Yura Sokolov of Postgres Pro performed this benchmark at my request. He took the 9.5 commit immediately preceding 2ed5b87f9 as a baseline. In all cases, logged tables were used.

Note that this is not the most effective benchmark for showing the regression, because he didn't replace the PK with a non-unique index, though that is planned as a follow-up; I wanted to stick with Alik's Zipfian test case for the time being. (Using a unique index is not the most effective way to show the regression because unique indexes have most of their LP_DEAD setting done in _bt_check_unique(), so only SELECTs will do less LP_DEAD cleanup there; SELECTs are 50% of all queries.)

His results with 10 minute pgbench runs:

Logged

clients | 8217fb14 | 2ed5b87f | master | hashsnap | hashsnap_lwlock
--------+----------+----------+--------+----------+----------------
     10 |   201569 |   204154 | 201095 |   201793 |          206111
     20 |   328180 |   333358 | 334858 |   336721 |          370769
     40 |   196352 |   194692 | 232413 |   231968 |          393947
     70 |   121083 |   116881 | 148066 |   148443 |          224162
    110 |    77989 |    73414 |  99305 |   111404 |          161900
    160 |    48958 |    45830 |  65830 |    82340 |          115678
    230 |    27699 |    25510 |  38186 |    57617 |           80575
    310 |    16369 |    15137 |  21571 |    39745 |           56819
    400 |    10327 |     9486 |  13157 |    27318 |           40859
    500 |     6920 |     6496 |   8638 |    18677 |           29429
    650 |     4315 |     3971 |   5196 |    11762 |           17883
    800 |     2850 |     2772 |   3523 |     7733 |           10977

Note that you also see numbers from various patches from Yura, and from the master branch, mixed in here, but 8217fb14 (before) and 2ed5b87f (after) are the interesting numbers as far as this regression goes. There is an appreciable reduction in TPS here, though this workload is not as bad by that measure as first thought. There is a roughly 5% regression here past 40 clients or so.

The regression in the *consistency* of transaction *throughput* is far more interesting, though. I've been doing analysis of this by drilling down to individual test cases with vimdiff, as follows:

$ vimdiff test_8217fb14_logged_1_pgbench_40.out test_2ed5b87f_logged_1_pgbench_40.out

(I attach these two files as a single example. I can provide the full results to those who ask for them privately; it's too much data to attach to an e-mail to the list.)

You can see in this example that for most 5 second slices of the 10 minute benchmark, commit 2ed5b87f actually increases TPS somewhat, which is good. But there are also occasional *big* drops in TPS, sometimes by as much as 50% over a single 5 second period (when ANALYZE runs, adding random I/O while an exclusive buffer lock is held [1]?). When this slowdown happens, latency can be over 3 times higher, too. We see much more consistent performance without the B-Tree buffer pin VACUUM optimization, where there is no discernible pattern of performance dropping. The headline regression of 4% or 5% is not the main problem here, it seems.

In summary, commit 2ed5b87f makes the workload have increased throughput most of the time, but occasionally sharply reduces throughput, which averages out to TPS being 4% or 5% lower overall. I think we'll find that there are bigger problems TPS-wise with non-unique indexes when that other test is performed by Yura; let's wait for those results to come back.
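To give a rough idea of how runs like these are produced (so others can repeat the exercise), here is a sketch of the kind of pgbench setup involved. The script bodies, key range, skew value, file names, and flags below are illustrative assumptions on my part, not Yura's exact settings; random_zipfian() is the function added by Alik's WIP patch:

-- zipf_select.sql: point SELECT on a Zipfian-distributed key
\set aid random_zipfian(1, 10000000, 1.5)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

-- zipf_update.sql: point UPDATE on the same skewed key
\set aid random_zipfian(1, 10000000, 1.5)
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = :aid;

$ pgbench -n -M prepared -c 40 -j 40 -T 600 -P 5 \
    -f zipf_select.sql@50 -f zipf_update.sql@50 \
    > test_2ed5b87f_logged_1_pgbench_40.out 2>&1

The -P 5 flag is what produces the per-5-second "progress:" lines (TPS and latency) that make the throughput dips easy to spot in vimdiff, and the @50/@50 script weights give the 50/50 SELECT/UPDATE mix (@5/@95 would give the 5%/95% mix discussed below).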
Finally, I find it interesting that when Yura did the same benchmark, but with 5% SELECTs + 95% UPDATEs rather than 50% SELECTs + 50% UPDATEs as above, the overall impact was surprisingly similar. His results:

clients | 8217fb14 | 2ed5b87f | master | hashsnap | hashsnap_lwlock
--------+----------+----------+--------+----------+----------------
     20 |   187697 |   187335 | 217558 |   215059 |          266894
     50 |    81272 |    78784 |  97948 |    97659 |          157710
     85 |    49446 |    47683 |  64597 |    70814 |          107748
    130 |    32358 |    30393 |  42216 |    50531 |           75001
    190 |    19403 |    17569 |  25704 |    35506 |           51292
    270 |    10803 |     9878 |  14166 |    23851 |           35257
    370 |     6108 |     5645 |   7684 |    15390 |           23659
    500 |     3649 |     3297 |   4225 |     9172 |           14643
    650 |     2239 |     2049 |   2635 |     5887 |            8588
    800 |     1475 |     1424 |   1804 |     4035 |            5611

If nothing else, this shows how generally reliant these kinds of workloads can be on LP_DEAD setting. And there is one difference: the regression is seen here at *all* client counts, even with only 20 clients. This is presumably because with only 5% SELECTs it's more important that those few remaining SELECTs be able to perform LP_DEAD setting.

[1] https://wiki.postgresql.org/wiki/Key_normalization#Avoiding_unnecessary_unique_index_enforcement

--
Peter Geoghegan