Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby - Mailing list pgsql-bugs
From | Jeff Janes
---|---
Subject | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
Date |
Msg-id | CAMkU=1x4aTiQjECT0TcVDnM=Sgi_vG_3XfYB6rajR82Ym6PjJg@mail.gmail.com
In response to | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
List | pgsql-bugs
On Fri, Jun 6, 2014 at 9:47 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Alvaro Herrera wrote:
> > Serge Negodyuck wrote:
> >
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> > > transaction 2080547
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> > > "pg_multixact/members/14078": No such file or directory.
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement "UPDATE
> > > ....."
> >
> > So as it turns out, this was caused because the arithmetic to handle the
> > wraparound case neglected to handle multixacts with more members than the
> > number that fit in the last page(s) of the last segment, leading to a
> > number of pages in the 14078 segment (or whatever the last segment is for
> > a given BLCKSZ) failing to be initialized. This patch is a rework of that
> > arithmetic, although it seems a little bit too obscure.
>
> After some simplification I think it should be clearer. Thanks Andres
> for commenting offlist.
>
> There is a different way to compute the "difference" proposed by Andres,
> without using if/else; the idea is to cast the values to int64 and then
> clamp. It would be something like
>
>     uint64 diff64;
>
>     diff64 = Min((uint64) offset + nmembers,
>                  (uint64) offset + MULTIXACT_MEMBERS_PER_PAGE);
>     difference = (uint32) Min(diff64, MaxMultiXactOffset);
>
> (There are other ways to formulate this, of course, but this seems to me
> to be the most readable one.) I am undecided between this and the one I
> propose in the patch, so I've stuck with the patch.
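To see why the 64-bit widening matters, here is a minimal, standalone C
rendering of the cast-and-clamp idea quoted above. This is an editorial
sketch, not the actual patch: the Min macro and both constants are
stand-ins for PostgreSQL's definitions (MULTIXACT_MEMBERS_PER_PAGE really
depends on BLCKSZ), and the sample values are assumptions chosen to sit
near the wraparound point.

    #include <stdio.h>
    #include <stdint.h>

    /* Illustrative stand-ins for the PostgreSQL definitions; the
     * members-per-page count depends on BLCKSZ, so the value here
     * is an assumption, not the real one. */
    #define Min(x, y)                   ((x) < (y) ? (x) : (y))
    #define MULTIXACT_MEMBERS_PER_PAGE  2000
    #define MaxMultiXactOffset          UINT32_MAX

    int
    main(void)
    {
        /* an offset near the 32-bit limit, as at member wraparound */
        uint32_t    offset = UINT32_MAX - 100;
        uint32_t    nmembers = 5000;    /* spills past the last page */
        uint64_t    diff64;
        uint32_t    difference;

        /* In 32 bits, offset + nmembers would silently wrap to a small
         * number; widening to 64 bits first and clamping afterwards
         * keeps the arithmetic well defined. */
        diff64 = Min((uint64_t) offset + nmembers,
                     (uint64_t) offset + MULTIXACT_MEMBERS_PER_PAGE);
        difference = (uint32_t) Min(diff64, (uint64_t) MaxMultiXactOffset);

        printf("difference = %u\n", difference);
        return 0;
    }

Run as-is, this prints 4294967295 (the clamped maximum), whereas a plain
32-bit addition of offset + nmembers would have wrapped around to a small
garbage value.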
This patch seems to solve a problem I've also been having with non-existent
"pg_multixact/members/13D35" files in my testing of torn page writes and fk
locks against recent 9.4.

However, sometimes I've found the problem even before multixact wraparound
occurred, and my reading of this thread suggests that the current issue
would not show up in that case, so I don't know how much comfort to take in
the knowledge that the problem seems to have gone away--as they say, things
that go away by themselves without explanation also tend to come back by
themselves.

Anyone wishing to investigate further can find the testing harness and the
tarball of a bad data directory here:

https://drive.google.com/folderview?id=0Bzqrh1SO9FcEb1FNcm52aEMtWTA&usp=sharing

Starting up the bad data directory should nominally succeed, but the first
database-wide vacuum should then fail when it hits the bad tuple.

Recent changes to my testing harness were:

- Making some of the transactions roll back at random, to test transaction
  aborts not associated with server crashes.
- Making the p_id in the child table change only half the time, not all the
  time, so that a mixture of HOT updates and non-HOT updates gets tested.
- Adding a patch to allow fast-forwarding of multixacts, plus code in my
  harness to trigger that patch. It is possible that this patch itself is
  causing the problem, but I suspect it is just accelerating its discovery.
  (By the way, I did this wrong for my original intent: I had it create
  quite large multixacts, when I should have had it create more of them but
  smaller, so that wraparound would occur more often. But I don't want to
  change it now until the problem is resolved.)
- Adding a delay.pl which delivers SIGSTOP to random postgres processes and
  then waits a while before SIGCONTing them, to try to uncover unlikely
  races (a rough sketch of the idea appears after this message).
  Surprisingly, this does not slow things down very much; I thought
  processes would frequently get interrupted while holding important locks,
  but that actually seems to be rare.
- Turning on archive_mode and setting wal_level to logical.

The problem is that I can revert the last two of these (the delay.pl and
the archiving) and still see the problem, while the other three were
extensively tested elsewhere without showing a problem, so I can't figure
out what the trigger is. When the problem does show up, it takes anywhere
from 1 to 13 hours to do so.

Anyway, if no one has opinions to the contrary, I think I will just assume
this is fixed now and move on to other tests.

Cheers,

Jeff
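The delay.pl script itself lives in the linked tarball, not in this
message, so the following C rendering of the SIGSTOP/SIGCONT trick is a
guess at its shape rather than the real thing: the victim pid is taken on
the command line (the actual script is said to pick random postgres
processes itself), and the stall length is an assumption.

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        pid_t   victim;

        if (argc != 2)
        {
            fprintf(stderr, "usage: %s <postgres-pid>\n", argv[0]);
            return 1;
        }
        victim = (pid_t) atoi(argv[1]);
        srand((unsigned) getpid());

        /* Stop the process cold, as delay.pl is described as doing. */
        if (kill(victim, SIGSTOP) != 0)
        {
            perror("SIGSTOP");
            return 1;
        }

        /* Hold it stopped for a few seconds (the interval here is an
         * assumption), so that any lock it holds stays held while
         * other backends contend for it. */
        sleep((unsigned) (1 + rand() % 5));

        /* Let it resume and see whether the stall exposed a race. */
        if (kill(victim, SIGCONT) != 0)
        {
            perror("SIGCONT");
            return 1;
        }
        return 0;
    }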