Re: Point in Time Recovery - Mailing list pgsql-hackers
From | Mark Kirkwood |
---|---|
Subject | Re: Point in Time Recovery |
Date | |
Msg-id | 40F7176E.4000001@coretech.co.nz |
In response to | Re: Point in Time Recovery (markw@osdl.org) |
Responses | Re: Point in Time Recovery |
List | pgsql-hackers |
Simon Riggs wrote:

>So far:
>
>I've tried to re-create the problem as exactly as I can, but it works
>for me.
>
>This is clearly an important case to chase down.
>
>I assume that this is the very first time you tried recovery? Second and
>subsequent recoveries using the same set have a potential loophole,
>which we have been discussing.
>
>Right now, I'm thinking that the "exactly 2 logs worth" of data has
>brought you very close to the end of the log file (FFFFE0) ending with 1
>and the shutdown checkpoint that is then subsequently written is
>failing.
>
>Can you repeat this your end?
>

It is repeatable at my end. It is actually fairly easy to recreate the
example I am using: download

http://sourceforge.net/projects/benchw

and generate the dataset for Pg, but trim the large "fact0.dat" dump
file using head -100000. Thus step 7 consists of creating the 4 tables
and COPYing in the data for them.

>The nearest I can get to the exact record pointers you show are to start
>recovery at A4807C and to end at with FFFF88.
>
>Overall, PITR changes the recovery process very little, if at all. The
>main areas of effect are to do with sequencing of actions and matching
>up the right logs with the right backup. I'm not looking for bugs in the
>code but in subtle side-effects and "edge" cases. Everything you can
>tell me will help me greatly in chasing that down.
>

I agree - I will try this sort of example again, but will change the
number of rows I am COPYing (currently 100000) and see if that helps.

>Best Regards, Simon Riggs
>

By way of contrast, using the *same* procedure (1-11), but generating 2
logs worth of INSERTS/UPDATES using 10 concurrent processes *works fine*,
e.g.:

LOG: database system was interrupted at 2004-07-16 11:17:52 NZST
LOG: recovery command file found...
LOG: restore_program = cp %s/%s %s
LOG: recovery_target_inclusive = true
LOG: recovery_debug_log = true
LOG: starting archive recovery
LOG: restored log file "0000000000000000" from archive
LOG: checkpoint record is at 0/A4803C
LOG: redo record is at 0/A4803C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 496; next OID: 25419
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/A4807C
postmaster starting
[postgres@shroudeater 7.5]$ LOG: restored log file "0000000000000001" from archive
cp: cannot stat `/data1/pgdata/7.5-archive/0000000000000002': No such file or directory
LOG: could not restore "0000000000000002" from archive
LOG: could not open file "/data1/pgdata/7.5/pg_xlog/0000000000000002" (log file 0, segment 2): No such file or directory
LOG: redo done at 0/1FFFFD4
LOG: archive recovery complete
LOG: database system is ready
LOG: archiver started
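
For anyone comparing record pointers between runs, the locations in the log above can be mapped to a segment number and an offset within that segment. The following is only a rough sketch, not code from the PostgreSQL tree: it assumes the default 16 MB log segment size (not stated in this message), and the helper names parse_location and locate are made up for illustration.

```python
# Assumption: 16 MB log segments (the PostgreSQL default); the message
# itself does not state the segment size used for this test.
SEGMENT_SIZE = 16 * 1024 * 1024


def parse_location(text):
    """Split a 'logid/offset' pointer (both parts hexadecimal) into integers."""
    log_id, offset = text.split("/")
    return int(log_id, 16), int(offset, 16)


def locate(text, segment_size=SEGMENT_SIZE):
    """Return (log id, segment number, offset in segment, bytes left in segment)."""
    log_id, offset = parse_location(text)
    segment, within = divmod(offset, segment_size)
    return log_id, segment, within, segment_size - within


# Record pointers copied from the recovery log quoted in this message.
for label, pointer in [
    ("checkpoint record", "0/A4803C"),
    ("redo starts", "0/A4807C"),
    ("redo done", "0/1FFFFD4"),
]:
    log_id, segment, within, remaining = locate(pointer)
    print(
        f"{label}: log file {log_id}, segment {segment}, "
        f"offset 0x{within:X}, {remaining} bytes to end of segment"
    )
```

Under that 16 MB assumption, "redo done at 0/1FFFFD4" falls in segment 1 with 44 bytes left before the segment boundary, so this working run also finishes very close to the end of the log file ending with 1.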