Re: Index corruption - Mailing list pgsql-hackers
From | Marc Munro |
---|---|
Subject | Re: Index corruption |
Date | |
Msg-id | 1151632823.3913.97.camel@bloodnok.com |
In response to | Re: Index corruption (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Index corruption |
List | pgsql-hackers |
On Thu, 2006-06-29 at 21:47 -0400, Tom Lane wrote:
> One easy thing that would be worth trying is to build with
> --enable-cassert and see if any Asserts get provoked during the
> failure case. I don't have a lot of hope for that, but it's
> something that would require only machine time not people time.

I'll try this tomorrow.

> A couple other things to try, given that you can provoke the failure
> fairly easily:
>
> 1. In studying the code, it bothers me a bit that P_NEW is the same as
> InvalidBlockNumber. The intended uses of P_NEW appear to be adequately
> interlocked, but it's fairly easy to see how something like this could
> happen if there are any places where InvalidBlockNumber is
> unintentionally passed to ReadBuffer --- that would look like a P_NEW
> call and it *wouldn't* be interlocked. So it would be worth changing
> P_NEW to "(-2)" (this should just take a change in bufmgr.h and
> recompile) and adding an "Assert(blockNum != InvalidBlockNumber)"
> at the head of ReadBufferInternal(). Then rebuild with asserts enabled
> and see if the failure case provokes that assert.

I'll try this too.

> 2. I'm also eyeing this bit of code in hio.c:
>
>     /*
>      * If the FSM knows nothing of the rel, try the last page before
>      * we give up and extend. This avoids one-tuple-per-page syndrome
>      * during bootstrapping or in a recently-started system.
>      */
>     if (targetBlock == InvalidBlockNumber)
>     {
>         BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
>
>         if (nblocks > 0)
>             targetBlock = nblocks - 1;
>     }
>
> If someone else has just extended the relation, it's possible that this
> will allow a process to get to the page before the intended extender has
> finished initializing it. AFAICT that's not harmful because the page
> will look like it has no free space ... but it seems a bit fragile.
> If you dike out the above-mentioned code, can you still provoke the
> failure?

By dike out, you mean remove? Please confirm and I'll try it.
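
For anyone following along, here is a small standalone sketch of the hazard described in point 1. It is not PostgreSQL source: the typedef and InvalidBlockNumber value mirror the real definitions as I understand them, but P_NEW_ALIASED, P_NEW_DISTINCT, ReadAction and the two classify functions are hypothetical names made up for illustration. It only shows why a stray InvalidBlockNumber is indistinguishable from a request for a new page when the two share one value, and how a separate sentinel makes the case detectable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for illustration only. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Current scheme: "give me a new page" shares the invalid marker's value. */
#define P_NEW_ALIASED  InvalidBlockNumber
/* Suggested scheme: (-2) as a BlockNumber, i.e. 0xFFFFFFFE (hypothetical name). */
#define P_NEW_DISTINCT ((BlockNumber) (-2))

/* Hypothetical outcome of a ReadBuffer-like call, for this sketch only. */
typedef enum { READ_EXISTING, EXTEND_RELATION, REJECTED } ReadAction;

/* With aliasing, an accidental InvalidBlockNumber looks like an extend request. */
static ReadAction classify_aliased(BlockNumber blockNum)
{
    if (blockNum == P_NEW_ALIASED)
        return EXTEND_RELATION;
    return READ_EXISTING;
}

/* With a distinct sentinel, the accidental case can be caught up front. */
static ReadAction classify_distinct(BlockNumber blockNum)
{
    if (blockNum == InvalidBlockNumber)
        return REJECTED;        /* where the suggested Assert would fire */
    if (blockNum == P_NEW_DISTINCT)
        return EXTEND_RELATION;
    return READ_EXISTING;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* A caller that forgot to set its target block passes the invalid marker. */
    BlockNumber stray = InvalidBlockNumber;

    printf("aliased:  %d\n", (int) classify_aliased(stray));
    printf("distinct: %d\n", (int) classify_distinct(stray));
    return 0;
}
#endif
```

Compiling with -DDEMO_MAIN gives a runnable demo. In the aliased variant the stray value silently takes the extend path, which is the un-interlocked case being worried about; in the distinct variant it is rejected before anything else happens.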

> A different line of attack is to see if you can make a self-contained
> test case so other people can try to reproduce it. More eyeballs on the
> problem are always better.

Can't really see this being possible. This is clearly a very unusual
problem and without similar hardware I doubt that anyone else will
trigger it. We ran this system happily for nearly a year on the
previous kernel without experiencing this problem (tcp lockups are a
different matter). Also the load is provided by a bunch of servers and
robots simulating rising and falling load.

> Lastly, it might be interesting to look at the WAL logs for the period
> leading up to a failure. This would give us an idea of what was
> happening concurrently with the processes that seem directly involved.

Next time we reproduce it, I'll take a copy of the WAL files too.

__
Marc