Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock) - Mailing list pgsql-hackers
From | Jeff Janes |
---|---|
Subject | Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock) |
Date | |
Msg-id | CAMkU=1xoA6Fdyoj_4fMLqpicZR1V9GP7cLnXJdHU+iGgqb6WUw@mail.gmail.com |
In response to | Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock) (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-hackers |
On Tue, Feb 21, 2012 at 5:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, Feb 21, 2012 at 8:19 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Sat, Feb 18, 2012 at 12:36 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> Attached is a new version, fixing that, and off-by-one bug you pointed out
>>> in the slot wraparound handling. I also moved code around a bit, I think
>>> this new division of labor between the XLogInsert subroutines is more
>>> readable.
>
> When I ran the long-running performance test, I encountered the following
> panic error.
>
> PANIC: could not find WAL buffer for 0/FF000000

I too see this panic when the system survives long enough to get to that log switch. But I'm also still seeing (with version 9) the assert failure at "xlog.c", Line: 2154 during the end-of-recovery checkpoint.

Here is a setup for repeating my tests. I used this test simply because I had it sitting around after having written it for other purposes; indeed, I'm not all that sure I should publish it. Hopefully other people will write other tests which exercise other corner cases, rather than exercising the same ones I am.

The patch creates a GUC which causes the md writer routine to panic and bring down the database, triggering recovery, after a given number of writes (a rough sketch of such a hook appears below). In this context, probably any other method of forcing a crash and recovery would be just as good as this specific method of crashing.

The choice of 400 as the cutoff for crashing is based on:

1) If the number is too low, you re-crash within recovery, so you never get a chance to inspect the database. In my hands, recovery doesn't need to do more than 400 writes. (I don't know how to make the database use a different GUC setting during recovery than it did before the crash.)

2) If the number is too high, it takes too long for a crash to happen, and I'm not all that patient.

Some of the changes to postgresql.conf.sample are purely my preferences and have nothing in particular to do with this setup. But archive_timeout = 30 is necessary in order to get checkpoints, and thus mdwrites, to happen often enough to trigger crashes often enough to satisfy my impatience.

The Perl script exercises the integrity of the database by launching multiple processes (4 by default) to run updates and memorize what updates they have run. After a crash, the Perl processes all communicate their data up to the parent, which consolidates that information and then queries the post-recovery database to make sure it agrees (the second sketch below illustrates that check). Transactions that are in flight at the time of a crash are indeterminate: maybe the crash happened before the commit, and maybe it happened after the commit but before we received notification of the commit. So whichever way those turn out, it is not proof of corruption.

With the xloginsert-scale-9.patch, the above features are not needed, because the problem is not that the database is incorrect after recovery, but that the database doesn't recover in the first place. So just running pgbench would be good enough to detect that. But in earlier versions this feature did detect incorrect recovery.

This logs an awful lot of stuff, most of which merely indicates normal operation. The problem is that corruption is rare, so if you wait until you see corruption before turning on logging, then you have to wait a long time to get another instance of corruption so you can dissect the log information. So I just log everything all of the time.
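For concreteness, a minimal sketch of that crash hook might look like the following. This is not the attached patch; the GUC name crash_after_writes, its registration, and its exact placement in md.c are assumptions made for illustration.

```c
/*
 * Illustrative fragment for src/backend/storage/smgr/md.c (not the attached
 * patch).  The GUC name "crash_after_writes" is an assumption.  The idea:
 * count calls to mdwrite() and PANIC once the counter passes the configured
 * threshold, forcing a crash and WAL recovery.
 */

int         crash_after_writes = 0;     /* 0 = never crash; set via the GUC */
static int  mdwrite_calls = 0;

void
mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
        char *buffer, bool skipFsync)
{
    if (crash_after_writes > 0 && ++mdwrite_calls > crash_after_writes)
        elog(PANIC, "crash_after_writes reached after %d writes",
             mdwrite_calls);

    /* ... the existing mdwrite() body continues unchanged ... */
}

/*
 * In guc.c the knob would be registered as an integer GUC, roughly:
 *
 *   {"crash_after_writes", PGC_SIGHUP, DEVELOPER_OPTIONS,
 *    "PANIC after this many mdwrite() calls (0 disables it).", NULL},
 *   &crash_after_writes, 0, 0, INT_MAX, NULL, NULL, NULL
 */
```

With something like that in place, setting the GUC to 400 in postgresql.conf gives the crash-every-so-often behavior described above.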
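The consistency check itself lives in the attached Perl script; purely to illustrate the logic, here is the same idea sketched in C with libpq. The table name foo, its count column, and the Expected bookkeeping struct are invented for the example and are not taken from the script.

```c
/*
 * Illustrative sketch only (not the attached Perl harness): after recovery,
 * each remembered row is compared against the database.  A row whose last
 * update was in flight at the crash is allowed either the old or the new
 * value; any other mismatch is reported as corruption.
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

typedef struct
{
    int  id;        /* row this child process was updating */
    int  count;     /* last update it knows was committed */
    int  in_flight; /* was another update in flight at the crash? */
} Expected;

static void
check_row(PGconn *conn, const Expected *e)
{
    char        sql[128];
    PGresult   *res;
    int         found;

    snprintf(sql, sizeof(sql), "SELECT count FROM foo WHERE id = %d", e->id);
    res = PQexec(conn, sql);

    if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) != 1)
    {
        fprintf(stderr, "row %d missing after recovery\n", e->id);
        PQclear(res);
        return;
    }

    found = atoi(PQgetvalue(res, 0, 0));

    /*
     * An in-flight update may or may not have committed, so either value is
     * acceptable for that row; for every other row a mismatch is corruption.
     */
    if (found != e->count && !(e->in_flight && found == e->count + 1))
        fprintf(stderr, "row %d: expected %d, found %d\n",
                e->id, e->count, found);

    PQclear(res);
}
```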
A warning from 'line 63' for a transaction which is not marked as in-flight indicates database corruption. A warning from 'line 66' indicates even worse corruption. A failure of the entire outer script to execute for the expected number of iterations (i.e., failure of the warning issued on 'line 18' to show up 100 times) indicates the database failed to restart.

Also attached is a bash script that exercises the whole thing. Note that it has various directories hard coded that really ought not to be, and that it has no compunctions about calling rm -r /tmp/data. I run it as "./do.sh >& log" and then inspect the log file for unusual lines.

To run this, you first have to apply your own xlog patch, apply my crash-inducing patch, and build and install the resulting pgsql. And edit the shell script to point to it, etc. The whole thing is a bit of an idiosyncratic mess.

Cheers,

Jeff