Thread: removing volatile qualifiers from lwlock.c
As discussed on the thread "Spinlocks and compiler/memory barriers", now that we've made the spinlock primitives function as compiler barriers (we think), it should be possible to remove volatile qualifiers from many places in the source code. The attached patch does this in lwlock.c. If the changes in commit 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are correct and complete, applying this shouldn't break anything, while possibly giving the compiler room to optimize things better than it does today. However, demonstrating the necessity of that commit for these changes seems to be non-trivial. I tried applying this patch and reverting commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035, b4c28d1b92c81941e4fc124884e51a7c110316bf, and 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a whopping 192 hardware threads (thanks, IBM!). I then ran the regression tests repeatedly, and I ran several long pgbench runs with as many as 350 concurrent clients. No failures. So I'm posting this patch in the hope that others can help. The relevant tests are: 1. If you apply this patch to master and run tests of whatever kind strikes your fancy, does anything break under high concurrency? If it does, then the above commits weren't enough to make this safe on your platform. 2. If you apply this patch to master, revert the commits mentioned above, and again run tests, does anything now break? If it does (but the first tests were OK), then that shows that those commits did something useful on your platform. The change to make spinlocks act as compiler barriers provides the foundation for, hopefully, a cleaner code base and easier future work on scalabilty and performance projects, so help is much appreciated. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Hi, On 2014-09-10 14:53:07 -0400, Robert Haas wrote: > As discussed on the thread "Spinlocks and compiler/memory barriers", > now that we've made the spinlock primitives function as compiler > barriers (we think), it should be possible to remove volatile > qualifiers from many places in the source code. The attached patch > does this in lwlock.c. If the changes in commit > 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are > correct and complete, applying this shouldn't break anything, while > possibly giving the compiler room to optimize things better than it > does today. > > However, demonstrating the necessity of that commit for these changes > seems to be non-trivial. I tried applying this patch and reverting > commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035, > b4c28d1b92c81941e4fc124884e51a7c110316bf, and > 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a > whopping 192 hardware threads (thanks, IBM!). I then ran the > regression tests repeatedly, and I ran several long pgbench runs with > as many as 350 concurrent clients. No failures. There's actually one more commit to revert. What I used was: git revert 5b26278822c69dd76ef89fd50ecc7cdba9c3f035 \ b4c28d1b92c81941e4fc124884e51a7c110316bf \ 68e66923ff629c324e219090860dc9e0e0a6f5d6 \ 0709b7ee72e4bc71ad07b7120acd117265ab51d0 > So I'm posting this patch in the hope that others can help. The > relevant tests are: > > 1. If you apply this patch to master and run tests of whatever kind > strikes your fancy, does anything break under high concurrency? If it > does, then the above commits weren't enough to make this safe on your > platform. > > 2. If you apply this patch to master, revert the commits mentioned > above, and again run tests, does anything now break? If it does (but > the first tests were OK), then that shows that those commits did > something useful on your platform. I just tried this on my normal x86 workstation. I applied your lwlock patch and ontop I removed most volatiles (there's a couple still required) from xlog.c. Works for 100 seconds. Then I reverted the above commits. Breaks within seconds: master: LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60 standby: LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108 and similar. So at least for x86 the compiler barriers are obviously required and seemingly working. I've attached the very quickly written xlog.c de-volatizing patch. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres@2ndquadrant.com> wrote: > I just tried this on my normal x86 workstation. I applied your lwlock > patch and ontop I removed most volatiles (there's a couple still > required) from xlog.c. Works for 100 seconds. Then I reverted the above > commits. Breaks within seconds: > master: > LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60 > standby: > LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108 > and similar. > > So at least for x86 the compiler barriers are obviously required and > seemingly working. Oh, that's great. In that case I think I should go ahead and apply that patch in the hopes of turning up any systems where the barriers aren't working properly yet. Although it would be nice to know whether it breaks with *only* the lwlock.c patch. > I've attached the very quickly written xlog.c de-volatizing patch. I don't have time to go through this in detail, but I don't object to you applying it if you're confident you've done it carefully enough. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-09-19 13:58:17 -0400, Robert Haas wrote: > On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > I just tried this on my normal x86 workstation. I applied your lwlock > > patch and ontop I removed most volatiles (there's a couple still > > required) from xlog.c. Works for 100 seconds. Then I reverted the above > > commits. Breaks within seconds: > > master: > > LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60 > > standby: > > LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108 > > and similar. > > > > So at least for x86 the compiler barriers are obviously required and > > seemingly working. > > Oh, that's great. In that case I think I should go ahead and apply > that patch in the hopes of turning up any systems where the barriers > aren't working properly yet. Agreed. > Although it would be nice to know whether it breaks with *only* the lwlock.c patch. It didn't, at least not visibly within the 1000s I let pgbench run. > > I've attached the very quickly written xlog.c de-volatizing patch. > > I don't have time to go through this in detail, but I don't object to > you applying it if you're confident you've done it carefully enough. It's definitely not yet carefully enough checked. I wanted to get it break fast and only made one pass through the file. But I think it should be easy enough to get it into shape for that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services