Thread: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Hello folks,

I've been playing around a lot with sysbench today, and observed that the 2.6.32 kernel supplied by Ubuntu has a performance regression with PG (which does not affect MySQL), compared to the 2.6.28 builds I have. What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).

Machines are two-socket quad-core Opteron 2356s.

oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle cpu, which is somewhere in the top symbol :)

Domas
On 12/09/10 23:31, Domas Mituzas wrote:
> I've been playing around today a lot with sysbench, and observed that 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which does not affect MySQL), compared to 2.6.28 builds I have.
> What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).
>
> Machines are two socket quad-opterons 2356s.
>
> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle cpu, which is somewhere in the top symbol :)

Can you run oprofile on the older kernel, so that we can compare and see where the time is spent? Looks like over 7% of the time is spent in s_lock, which suggests some change in behavior in context switching or something like that, but let's see what the old profile looks like.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
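One way to make the comparison Heikki asks for concrete is to line up per-symbol percentages from the two profiles and sort by how much each symbol moved. The sketch below is only an illustration, not oprofile tooling: it assumes each profile has already been reduced to whitespace-separated lines of `samples percent symbol` (a simplification; the real report columns depend on the options used), the function names are hypothetical, and the figures in the sample data are made up apart from the symbol name s_lock mentioned above.

```python
def parse_profile(text):
    """Parse lines of 'samples percent symbol' into {symbol: percent}.

    Lines whose second field is not a number (such as a header) are skipped.
    """
    shares = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            percent = float(parts[1])
        except ValueError:
            continue  # header or other non-data line
        symbol = parts[-1]
        shares[symbol] = shares.get(symbol, 0.0) + percent
    return shares


def compare_profiles(old_text, new_text, top=5):
    """Return (symbol, new% - old%) pairs, largest absolute change first."""
    old = parse_profile(old_text)
    new = parse_profile(new_text)
    deltas = [
        (symbol, new.get(symbol, 0.0) - old.get(symbol, 0.0))
        for symbol in set(old) | set(new)
    ]
    deltas.sort(key=lambda item: (-abs(item[1]), item[0]))
    return deltas[:top]


# Made-up sample data, for illustration only.
OLD_PROFILE = """samples % symbol
1200 2.0 s_lock
3000 5.0 other_symbol
"""
NEW_PROFILE = """samples % symbol
4200 7.0 s_lock
2900 4.8 other_symbol
"""

if __name__ == "__main__":
    for symbol, delta in compare_profiles(OLD_PROFILE, NEW_PROFILE):
        print(f"{symbol:20s} {delta:+.1f} percentage points")
```

Sorting by absolute change puts the symbols that gained or lost the most time at the top, which is usually where a scheduling or locking change would show up first.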
Domas Mituzas wrote:
> I've been playing around today a lot with sysbench, and observed that 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which does not affect MySQL), compared to 2.6.28 builds I have.
> What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).
>
> Machines are two socket quad-opterons 2356s.
>
> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle cpu, which is somewhere in the top symbol :)
>

Are you using the same filesystem setup on both systems? And regardless, what is that filesystem? We know that between 2.6.28 and 2.6.32 the kernel improved how it handles fsync requests in a good way from a reliability perspective (to fix bugs that could cause data loss before), particularly on ext4, so it's possible the regression you're seeing is just the expense of handling things properly.

If you already have sysbench on there, I'd suggest comparing the two systems by seeing how fast each can execute fsync requests:

sysbench --test=fileio --file-fsync-freq=1 --file-num=1 --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"

That should help distinguish whether this regression might be coming from the already known changes in that area, or whether it's instead from something that's impacting CPU efficiency.

Also, it's easy to see a performance change of this size just from the database files being on a different part of the disk if you didn't control for that. Disks are almost twice as fast at their beginning as at their end nowadays.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
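For a box without sysbench, the same idea (how many synchronous writes per second the filesystem sustains) can be approximated with a few lines of Python. This is a rough sketch under stated assumptions, not a reimplementation of the sysbench fileio test: the function name, the 4096-byte block size, and the request count are all choices made for illustration, and the number it reports includes Python overhead, so it is only meaningful when comparing two machines running the same script.

```python
import os
import random
import tempfile
import time


def fsync_rate(directory, requests=200, file_size=16384, block_size=4096):
    """Return random-write-plus-fsync requests per second in `directory`.

    A single file of `file_size` bytes is created, then `requests` blocks
    are written at random block-aligned offsets, each followed by fsync().
    """
    block = os.urandom(block_size)
    slots = max(1, file_size // block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Pre-allocate the file so the timed loop only overwrites existing blocks.
        os.write(fd, b"\0" * file_size)
        os.fsync(fd)
        start = time.perf_counter()
        for _ in range(requests):
            offset = random.randrange(slots) * block_size
            os.pwrite(fd, block, offset)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return requests / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point this at a directory on the filesystem that holds the database.
    print(f"{fsync_rate(tempfile.gettempdir()):.1f} requests/sec")
```

The directory matters: the temporary directory is often on a different filesystem (or in memory), so for a fair comparison the path should be on the same volume as the PostgreSQL data directory on both machines.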
On 09/13/2010 06:05 PM, Greg Smith wrote: > Domas Mituzas wrote: >> I've been playing around today a lot with sysbench, and observed that >> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG >> (which does not affect MySQL), compared to 2.6.28 builds I have. >> What I observed can be seen in a paste at >> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is >> 2.6.32 - 2.6.32-24-server). >> Machines are two socket quad-opterons 2356s. >> oprofile output can be seen at >> http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle >> cpu, which is somewhere in the top symbol :) > > Are you using the same filesystem setup on both setups? And regardless, > what is that filesystem? We know that between 2.6.28 and 2.6.32 the > kernel improved how it handles fsync requests in a good way from a > reliability perspective (to fix bugs that could cause data loss before), > particularly on ext4, so it's possible the regression you're seeing is > just the expense of handling things properly. > > If you already have sysbench on there, I'd suggest comparing the two > systems by seeing how fast each can execute fsync requests: > > sysbench --test=fileio --file-fsync-freq=1 --file-num=1 > --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec" > > To help distinguish whether this regression might be coming from the > already known changes in that area, or if it's instead from something > that's impacting CPU efficiency. > > Also, it's easy to see a performance change of this size just from the > database files being on a different part of the disk if you didn't > control for that. Disks are almost twice as fast at their beginning than > their end nowadays. 
Well, the main point here is that Domas is doing a pure read-only test on a rather small workload, so it should fit entirely in memory... From some very quick testing here as well, it rather seems that for some reason the CPU scheduler is not actually giving us all the available CPU on 2.6.32, or we are having some sort of locking issue that is more exposed on this kernel.

Stefan
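The "not getting all the available CPU" suspicion can be checked on both kernels by sampling /proc/stat before and after a benchmark interval and computing the idle share. The sketch below is a minimal illustration assuming the standard Linux layout of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal, then guest fields); the function names are hypothetical and the sample figures in the usage note are invented.

```python
def cpu_times(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat.

    Only the first eight counters are summed, because guest time is already
    included in user time.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            if len(values) < 4:
                raise ValueError("cpu line has too few fields")
            return values[3], sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def idle_fraction(before, after):
    """Return the fraction of CPU time spent idle between two samples."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("samples are identical or out of order")
    return (idle1 - idle0) / elapsed


def read_stat(path="/proc/stat"):
    """Read the current contents of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Usage would be to call `read_stat()` once while the read-only test is running, again some seconds later, and pass both strings to `idle_fraction()`. With invented samples of `cpu 100 0 50 850 0 0 0 0` and `cpu 800 0 150 1050 0 0 0 0`, the idle counter advances by 200 out of 1000 jiffies, so the function returns 0.2. A clearly higher idle share on 2.6.32 than on 2.6.28 under the same client load would point at scheduling or lock waits rather than at the workload itself.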
On 13 September 2010 17:27, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote: > On 09/13/2010 06:05 PM, Greg Smith wrote: >> >> Domas Mituzas wrote: >>> >>> I've been playing around today a lot with sysbench, and observed that >>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG >>> (which does not affect MySQL), compared to 2.6.28 builds I have. >>> What I observed can be seen in a paste at >>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is >>> 2.6.32 - 2.6.32-24-server). >>> Machines are two socket quad-opterons 2356s. >>> oprofile output can be seen at >>> http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle >>> cpu, which is somewhere in the top symbol :) >> >> Are you using the same filesystem setup on both setups? And regardless, >> what is that filesystem? We know that between 2.6.28 and 2.6.32 the >> kernel improved how it handles fsync requests in a good way from a >> reliability perspective (to fix bugs that could cause data loss before), >> particularly on ext4, so it's possible the regression you're seeing is >> just the expense of handling things properly. >> >> If you already have sysbench on there, I'd suggest comparing the two >> systems by seeing how fast each can execute fsync requests: >> >> sysbench --test=fileio --file-fsync-freq=1 --file-num=1 >> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec" >> >> To help distinguish whether this regression might be coming from the >> already known changes in that area, or if it's instead from something >> that's impacting CPU efficiency. >> >> Also, it's easy to see a performance change of this size just from the >> database files being on a different part of the disk if you didn't >> control for that. Disks are almost twice as fast at their beginning than >> their end nowadays. > > well the main point here is that domas is doing a pure read-only test on a > rather small workload so it should entirely fit in memory... 
> From some very quick testing here as well it rather seems that for some
> reason the CPU scheduler is not actually scheduling us all the available CPU
> on 2.6.32 or we are having some sort of locking issue that is more exposed
> on this kernel.

I thought sysbench was designed for MySQL benchmarks. How new is the PostgreSQL driver? Is it stable yet?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
Thom Brown wrote: > I thought sysbench was designed for MySQL benchmarks. How new is the > PostgreSQL driver? Is it stable yet? > It's been out there for years; the FreeBSD 7.0 development used it extensively on MySQL and PostgreSQL to track kernel performance on both databases back in 2007: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf I don't think "stable" applies here just based on code age though, given how infrequent updates to the sysbench code are and how little QA is put into them. They pushed out two updates in 2009, 0.4.11 and 0.4.12, but all they did for me was break basic compilation on multiple platforms. I still use 0.4.10 as the last version that seems to work without makefile surgery on both RedHat and Ubuntu. The last time I tried it, the read-only OLTP implementation worked fine, but the one that wrote instead was prone to deadlocks in PostgreSQL. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On 09/13/2010 06:43 PM, Greg Smith wrote:
> Thom Brown wrote:
>> I thought sysbench was designed for MySQL benchmarks. How new is the
>> PostgreSQL driver? Is it stable yet?
>
> It's been out there for years; the FreeBSD 7.0 development used it
> extensively on MySQL and PostgreSQL to track kernel performance on both
> databases back in 2007:
> http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf
>
> I don't think "stable" applies here just based on code age though, given
> how infrequent updates to the sysbench code are and how little QA is put
> into them. They pushed out two updates in 2009, 0.4.11 and 0.4.12, but
> all they did for me was break basic compilation on multiple platforms. I
> still use 0.4.10 as the last version that seems to work without makefile
> surgery on both RedHat and Ubuntu.
>
> The last time I tried it, the read-only OLTP implementation worked fine,
> but the one that wrote instead was prone to deadlocks in PostgreSQL.

Yeah, the read-only part works quite well (the other ones not so much), and it was much faster than pgbench in older pg releases - I have not looked yet at whether the new threaded implementation in 9.0 fixes that issue.

Stefan
Hello,

> Can you run oprofile on the older kernel, so that we can compare and see where the time is spent?
> Looks like over 7% of the time is spent in s_lock, which suggests some change in behavior in context switching or something like that, but let's see what the old profile looks like.

I grabbed the 2.6.28.2 as a loaner from prod boxes I had around, so it may take a while to do that again. I will see if I can get some Nehalem loaners (or run these tests in another environment) to do a comparison on more modern hardware.

Domas
On Mon, Sep 13, 2010 at 12:05 PM, Greg Smith <greg@2ndquadrant.com> wrote: > Domas Mituzas wrote: >> >> I've been playing around today a lot with sysbench, and observed that >> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which >> does not affect MySQL), compared to 2.6.28 builds I have. >> What I observed can be seen in a paste at >> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - >> 2.6.32-24-server). >> Machines are two socket quad-opterons 2356s. >> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - >> system has >20% of idle cpu, which is somewhere in the top symbol :) >> > > Are you using the same filesystem setup on both setups? And regardless, > what is that filesystem? We know that between 2.6.28 and 2.6.32 the kernel > improved how it handles fsync requests in a good way from a reliability > perspective (to fix bugs that could cause data loss before), particularly on > ext4, so it's possible the regression you're seeing is just the expense of > handling things properly. > > If you already have sysbench on there, I'd suggest comparing the two systems > by seeing how fast each can execute fsync requests: > > sysbench --test=fileio --file-fsync-freq=1 --file-num=1 > --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec" > > To help distinguish whether this regression might be coming from the already > known changes in that area, or if it's instead from something that's > impacting CPU efficiency. > > Also, it's easy to see a performance change of this size just from the > database files being on a different part of the disk if you didn't control > for that. Disks are almost twice as fast at their beginning than their end > nowadays. Greg, have you run into any other evidence suggesting a problem with 2.6.32? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On 28/09/10 04:28, Robert Haas wrote:
> On Mon, Sep 13, 2010 at 12:05 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> Domas Mituzas wrote:
>>> I've been playing around today a lot with sysbench, and observed that
>>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which
>>> does not affect MySQL), compared to 2.6.28 builds I have.
>>> What I observed can be seen in a paste at
>>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 -
>>> 2.6.32-24-server).
>>> Machines are two socket quad-opterons 2356s.
>>> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w -
>>> system has >20% of idle cpu, which is somewhere in the top symbol :)
>>
>> Are you using the same filesystem setup on both setups? And regardless,
>> what is that filesystem? We know that between 2.6.28 and 2.6.32 the kernel
>> improved how it handles fsync requests in a good way from a reliability
>> perspective (to fix bugs that could cause data loss before), particularly on
>> ext4, so it's possible the regression you're seeing is just the expense of
>> handling things properly.
>>
>> If you already have sysbench on there, I'd suggest comparing the two systems
>> by seeing how fast each can execute fsync requests:
>>
>> sysbench --test=fileio --file-fsync-freq=1 --file-num=1
>> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"
>>
>> To help distinguish whether this regression might be coming from the already
>> known changes in that area, or if it's instead from something that's
>> impacting CPU efficiency.
>>
>> Also, it's easy to see a performance change of this size just from the
>> database files being on a different part of the disk if you didn't control
>> for that. Disks are almost twice as fast at their beginning than their end
>> nowadays.
>
> Greg, have you run into any other evidence suggesting a problem with 2.6.32?

Not Greg (sorry), but this might be worth a look:

http://www.spinics.net/lists/linux-ext4/msg20299.html

regards

Mark
On Mon, Sep 27, 2010 at 11:37 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote: > Greg, have you run into any other evidence suggesting a problem with 2.6.32? > > Not Greg (sorry), but this might be worth a look: > > http://www.spinics.net/lists/linux-ext4/msg20299.html Oh, interesting. But why wouldn't that also affect MySQL? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On 28/09/10 16:59, Robert Haas wrote:
> On Mon, Sep 27, 2010 at 11:37 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
>> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>>
>> Not Greg (sorry), but this might be worth a look:
>>
>> http://www.spinics.net/lists/linux-ext4/msg20299.html
>
> Oh, interesting. But why wouldn't that also affect MySQL?

Yeah, wondered that myself - perhaps if sysbench is using MyISAM tables then there is probably no fsync activity at all for a read-only workload. Be interesting to see if MySQL suffers a hit for sysbench configured to use InnoDB storage...
Robert Haas wrote:
> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>

I haven't actually checked myself yet. Right now the only distribution shipping 2.6.32 usefully is Ubuntu 10.04, which I can't recommend anyone use on a server because their release schedules are way too aggressive to ever deliver stable versions anymore. So until either RHEL6 or Debian Squeeze ships, very late this year or early next, the performance of 2.6.32 is irrelevant to me. And by then I'm hoping that the early adopters have squashed more of the obvious bugs here. 2.6.32 is 11 months old at this point, which makes it still a bleeding edge kernel in my book.

--
Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us