Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process
Msg-id 14003.1402093305@sss.pgh.pa.us
In response to Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process
Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process
List pgsql-bugs
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-06-06 18:03:53 -0400, Tom Lane wrote:
>> The point here seems to be that lazy_vacuum_page does the visibility map
>> ops inside its own critical section.  Why?  Setting a visibility bit
>> doesn't seem like it's critical.  Why can't we just move the
>> END_CRIT_SECTION() to before the PageIsAllVisible test?

> Yea, that's what I am proposing upthread. If we move the visibility
> tests out of the critical section this will get rid of the original
> report as well.

I went trolling for other critical sections ...

lazy_scan_heap has the same disease, but it looks like it can be fixed the same way.
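
A rough sketch of why the ordering matters, as a toy model and not the actual vacuumlazy.c code. All names below (start_crit_section, report_failure, set_visibility_bit, and the two vacuum_page_* variants) are hypothetical stand-ins. The one assumption taken from the backend is that a failure reported while a critical section is open gets escalated to PANIC, while the same failure outside one is an ordinary ERROR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real definitions. */
int crit_section_count = 0;   /* nesting depth of open critical sections */
int panics = 0;               /* failures that were escalated to PANIC */
int errors = 0;               /* failures reported as ordinary ERROR */

void start_crit_section(void)
{
    crit_section_count++;
}

void end_crit_section(void)
{
    assert(crit_section_count > 0);
    crit_section_count--;
}

/* Any failure raised while a critical section is open becomes a PANIC. */
void report_failure(const char *msg)
{
    if (crit_section_count > 0)
    {
        panics++;
        fprintf(stderr, "PANIC: %s\n", msg);
    }
    else
    {
        errors++;
        fprintf(stderr, "ERROR: %s\n", msg);
    }
}

/* Stand-in for the visibility map work; 'fail' simulates it going wrong. */
void set_visibility_bit(bool fail)
{
    if (fail)
        report_failure("visibility map update failed");
}

/* Shape being complained about: visibility work inside the critical section. */
void vacuum_page_vm_inside_crit(bool vm_fails)
{
    start_crit_section();
    /* ... modify the heap page and write WAL here ... */
    set_visibility_bit(vm_fails);
    end_crit_section();
}

/* Proposed shape: close the critical section first, then do the visibility work. */
void vacuum_page_vm_outside_crit(bool vm_fails)
{
    start_crit_section();
    /* ... modify the heap page and write WAL here ... */
    end_crit_section();
    set_visibility_bit(vm_fails);
}
```

Calling vacuum_page_vm_inside_crit(true) bumps panics, whereas vacuum_page_vm_outside_crit(true) only bumps errors; the page modification itself stays protected in both variants.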

Also, there are a bunch of fsync_fname() calls inside critical sections in
replication/slot.c.  Seems at best pretty damn risky; what's more, the
critical sections cover only the fsyncs and not anything else, which is
flat out broken.  If it was okay to fail just before calling the fsync,
why is it critical to not fail inside it?  Somebody was not thinking
clearly there.

            regards, tom lane
