Thread: removing volatile qualifiers from lwlock.c

removing volatile qualifiers from lwlock.c

From
Robert Haas
Date:
As discussed on the thread "Spinlocks and compiler/memory barriers",
now that we've made the spinlock primitives function as compiler
barriers (we think), it should be possible to remove volatile
qualifiers from many places in the source code.  The attached patch
does this in lwlock.c.  If the changes in commit
0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
correct and complete, applying this shouldn't break anything, while
possibly giving the compiler room to optimize things better than it
does today.

However, demonstrating the necessity of that commit for these changes
seems to be non-trivial.  I tried applying this patch and reverting
commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
b4c28d1b92c81941e4fc124884e51a7c110316bf, and
0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
whopping 192 hardware threads (thanks, IBM!).  I then ran the
regression tests repeatedly, and I ran several long pgbench runs with
as many as 350 concurrent clients.  No failures.

So I'm posting this patch in the hope that others can help.  The
relevant tests are:

1. If you apply this patch to master and run tests of whatever kind
strikes your fancy, does anything break under high concurrency?  If it
does, then the above commits weren't enough to make this safe on your
platform.

2. If you apply this patch to master, revert the commits mentioned
above, and again run tests, does anything now break?  If it does (but
the first tests were OK), then that shows that those commits did
something useful on your platform.

The change to make spinlocks act as compiler barriers provides the
foundation for, hopefully, a cleaner code base and easier future work
on scalabilty and performance projects, so help is much appreciated.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: removing volatile qualifiers from lwlock.c

From
Andres Freund
Date:
Hi,

On 2014-09-10 14:53:07 -0400, Robert Haas wrote:
> As discussed on the thread "Spinlocks and compiler/memory barriers",
> now that we've made the spinlock primitives function as compiler
> barriers (we think), it should be possible to remove volatile
> qualifiers from many places in the source code.  The attached patch
> does this in lwlock.c.  If the changes in commit
> 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
> correct and complete, applying this shouldn't break anything, while
> possibly giving the compiler room to optimize things better than it
> does today.
>
> However, demonstrating the necessity of that commit for these changes
> seems to be non-trivial.  I tried applying this patch and reverting
> commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
> b4c28d1b92c81941e4fc124884e51a7c110316bf, and
> 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
> whopping 192 hardware threads (thanks, IBM!).  I then ran the
> regression tests repeatedly, and I ran several long pgbench runs with
> as many as 350 concurrent clients.  No failures.

There's actually one more commit to revert. What I used was:
git revert 5b26278822c69dd76ef89fd50ecc7cdba9c3f035 \
    b4c28d1b92c81941e4fc124884e51a7c110316bf \
    68e66923ff629c324e219090860dc9e0e0a6f5d6 \
    0709b7ee72e4bc71ad07b7120acd117265ab51d0

> So I'm posting this patch in the hope that others can help.  The
> relevant tests are:
>
> 1. If you apply this patch to master and run tests of whatever kind
> strikes your fancy, does anything break under high concurrency?  If it
> does, then the above commits weren't enough to make this safe on your
> platform.
>
> 2. If you apply this patch to master, revert the commits mentioned
> above, and again run tests, does anything now break?  If it does (but
> the first tests were OK), then that shows that those commits did
> something useful on your platform.

I just tried this on my normal x86 workstation. I applied your lwlock
patch and ontop I removed most volatiles (there's a couple still
required) from xlog.c. Works for 100 seconds. Then I reverted the above
commits. Breaks within seconds:
master:
LOG:  request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
standby:
LOG:  record with incorrect prev-link 4/684C3108 at 4/684C3108
and similar.

So at least for x86 the compiler barriers are obviously required and
seemingly working.

I've attached the very quickly written xlog.c de-volatizing patch.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: removing volatile qualifiers from lwlock.c

From
Robert Haas
Date:
On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I just tried this on my normal x86 workstation. I applied your lwlock
> patch and ontop I removed most volatiles (there's a couple still
> required) from xlog.c. Works for 100 seconds. Then I reverted the above
> commits. Breaks within seconds:
> master:
> LOG:  request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
> standby:
> LOG:  record with incorrect prev-link 4/684C3108 at 4/684C3108
> and similar.
>
> So at least for x86 the compiler barriers are obviously required and
> seemingly working.

Oh, that's great.  In that case I think I should go ahead and apply
that patch in the hopes of turning up any systems where the barriers
aren't working properly yet.  Although it would be nice to know
whether it breaks with *only* the lwlock.c patch.

> I've attached the very quickly written xlog.c de-volatizing patch.

I don't have time to go through this in detail, but I don't object to
you applying it if you're confident you've done it carefully enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: removing volatile qualifiers from lwlock.c

From
Andres Freund
Date:
On 2014-09-19 13:58:17 -0400, Robert Haas wrote:
> On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I just tried this on my normal x86 workstation. I applied your lwlock
> > patch and ontop I removed most volatiles (there's a couple still
> > required) from xlog.c. Works for 100 seconds. Then I reverted the above
> > commits. Breaks within seconds:
> > master:
> > LOG:  request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
> > standby:
> > LOG:  record with incorrect prev-link 4/684C3108 at 4/684C3108
> > and similar.
> >
> > So at least for x86 the compiler barriers are obviously required and
> > seemingly working.
> 
> Oh, that's great.  In that case I think I should go ahead and apply
> that patch in the hopes of turning up any systems where the barriers
> aren't working properly yet.

Agreed.

> Although it would be nice to know whether it breaks with *only* the lwlock.c patch.

It didn't, at least not visibly within the 1000s I let pgbench run.

> > I've attached the very quickly written xlog.c de-volatizing patch.
> 
> I don't have time to go through this in detail, but I don't object to
> you applying it if you're confident you've done it carefully enough.

It's definitely not yet carefully enough checked. I wanted to get it
break fast and only made one pass through the file. But I think it
should be easy enough to get it into shape for that.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services