Re: Speed up Clog Access by increasing CLOG buffers - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Speed up Clog Access by increasing CLOG buffers |
Date | |
Msg-id | 26b69fb2-fa4d-530c-7783-1cb9d952c4e5@2ndquadrant.com |
In response to | Re: Speed up Clog Access by increasing CLOG buffers (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Speed up Clog Access by increasing CLOG buffers |
List | pgsql-hackers |
On 09/21/2016 08:04 AM, Amit Kapila wrote:
> On Wed, Sep 21, 2016 at 3:48 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
...
>
>> I'll repeat the test on the 4-socket machine with a newer kernel,
>> but that's probably the last benchmark I'll do for this patch for
>> now.
>>

Attached are results from benchmarks running on kernel 4.5 (instead of
the old 3.2.80). I've only done synchronous_commit=on, and I've added a
few client counts (mostly at the lower end). The data are pushed to the
git repository, see:

    git push --set-upstream origin master

The summary looks like this (showing both the 3.2.80 and 4.5.5 results):

1) Dilip's workload

3.2.80                  16      32      64     128     192
-------------------------------------------------------------------
master               26138   37790   38492   13653    8337
granular-locking     25661   38586   40692   14535    8311
no-content-lock      25653   39059   41169   14370    8373
group-update         26472   39170   42126   18923    8366

4.5.5                    1       8      16      32      64     128     192
-------------------------------------------------------------------
granular-locking      4050   23048   27969   32076   34874   36555   37710
no-content-lock       4025   23166   28430   33032   35214   37576   39191
group-update          4002   23037   28008   32492   35161   36836   38850
master                3968   22883   27437   32217   34823   36668   38073

2) pgbench

3.2.80                  16      32      64     128     192
-------------------------------------------------------------------
master               22904   36077   41295   35574    8297
granular-locking     23323   36254   42446   43909    8959
no-content-lock      23304   36670   42606   48440    8813
group-update         23127   36696   41859   46693    8345

4.5.5                    1       8      16      32      64     128     192
-------------------------------------------------------------------
granular-locking      3116   19235   27388   29150   31905   34105   36359
no-content-lock       3206   19071   27492   29178   32009   34140   36321
group-update          3195   19104   26888   29236   32140   33953   35901
master                3136   18650   26249   28731   31515   33328   35243

The 4.5 kernel clearly changed the results significantly:

(a) Compared to the 3.2.80 kernel, some numbers improved and some got
worse. For example, on 3.2.80 pgbench did ~23k tps with 16 clients,
while on 4.5.5 it does ~27k tps. With 64 clients the performance
dropped from ~41k tps to ~34k tps (on master).

(b) The drop above 64 clients is gone - on 3.2.80 the tps dropped very
quickly to only ~8k with 192 clients, while on 4.5 it actually keeps
increasing and we get ~35k tps with 192 clients.

(c) Although it's not visible in the summary above, 4.5.5 almost
perfectly eliminated the fluctuations between runs. For example, where
3.2.80 produced these results (10 runs with the same parameters):

12118 11610 27939 11771 18065 12152 14375 10983 13614 11077

we get this on 4.5.5:

37354 37650 37371 37190 37233 38498 37166 36862 37928 38509

Notice how much more even the 4.5.5 results are compared to 3.2.80 (a
quick way to quantify this is sketched below).

(d) There's no sign of any benefit from any of the patches (they only
helped at >= 128 clients, but that's where the tps actually dropped on
3.2.80 - apparently 4.5.5 fixes that, and the benefit is gone).

It's a bit annoying that after upgrading from 3.2.80 to 4.5.5, the
performance with 32 and 64 clients dropped quite noticeably (by more
than 10%). I believe that might be a kernel regression, but perhaps
it's the price for improved scalability at higher client counts.

That of course begs the question of what kernel version is running on
the machine used by Dilip (i.e. cthulhu). Then again, it's a Power
machine, so I'm not sure how much the kernel matters on it.
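To put a rough number on "more even": the following minimal Python
sketch (illustrative only, not part of the benchmark scripts - it just
re-uses the two sets of per-run tps quoted in (c)) computes the mean
and the coefficient of variation for each kernel:

# Quantify run-to-run variability for the two sets of 10 runs quoted
# in point (c). The numbers are copied from this mail; the script is
# only an illustrative sketch, not part of the benchmark tooling.
from statistics import mean, stdev

runs = {
    "3.2.80": [12118, 11610, 27939, 11771, 18065,
               12152, 14375, 10983, 13614, 11077],
    "4.5.5":  [37354, 37650, 37371, 37190, 37233,
               38498, 37166, 36862, 37928, 38509],
}

for kernel, tps in runs.items():
    cv = stdev(tps) / mean(tps) * 100   # coefficient of variation, in %
    print("%-7s  mean = %6.0f tps   cv = %4.1f%%" % (kernel, mean(tps), cv))

By that measure the 3.2.80 runs swing by roughly a third of the mean,
while the 4.5.5 runs stay within a couple of percent of it.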
I'll ask someone else with access to this particular machine to repeat
the tests, as I have a nagging suspicion that I've missed something
important when compiling / running the benchmarks. I'll also retry the
benchmarks on 3.2.80 to see if I get the same numbers.

>
> Okay, but I think it is better to see the results between 64~128
> client count and may be greater than 128 client counts, because it is
> clear that patch won't improve performance below that.
>

There are results for 64, 128 and 192 clients. Why should we care about
numbers in between? How likely (and useful) would it be to get an
improvement with 96 clients, but no improvement at 64 or 128 clients?

>>
>> I agree with Robert that the cases the patch is supposed to
>> improve are a bit contrived because of the very high client
>> counts.
>>
>
> No issues, I have already explained why I think it is important to
> reduce the remaining CLOGControlLock contention in yesterday's and
> this mail. If none of you is convinced, then I think we have no
> choice but to drop this patch.
>

I agree it's useful to reduce lock contention in general, but
considering the last set of benchmarks shows no benefit with a recent
kernel, I think we really need a better understanding of what's going
on, and of which workloads / systems the patch is supposed to improve.

I don't dare to suggest rejecting the patch, but I don't see how we
could commit any of the patches at this point. So perhaps "returned
with feedback" and resubmitting in the next CF (along with an analysis
of the improved workloads) would be appropriate.

regards

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services