Re: incremental backups - Mailing list pgsql-general
From | Rick Gigger |
---|---|
Subject | Re: incremental backups |
Date | |
Msg-id | 19532E3F-5BB4-479B-8D58-8A2947402E14@alpinenetworking.com Whole thread Raw |
In response to | Re: incremental backups (Rick Gigger <rick@alpinenetworking.com>) |
Responses |
Re: incremental backups
|
List | pgsql-general |
And here is the real million dollar question. Let's say for some reason I don't have the last WAL file I need for my backup to be valid. Will it die and tell me it's bad or will it just start up with a screwed up data directory? On Jan 30, 2006, at 4:29 PM, Rick Gigger wrote: > Yes! Thanks you! That is exactly what I was looking for. > > So I take it that this means that it is save to copy the current in > use WAL file even as it is being written to? > And it also means that if I copy it with my physical file system > backup then I should have the last file that I need to restore from > that physical backup? > > So if I write my own backup_latest_WAL_file.sh script (I think I > found one on the list from Simon Riggs) then I can do what I need > to do before those todo items get done? Or will I need to wait > till postgres gives me the ability to safely copy the file? > > > > On Jan 30, 2006, at 11:13 AM, Bruce Momjian wrote: > >> >> Unfortunately, I think I understand your question. :-) >> >> These TODO items are what you need: >> >> * Point-In-Time Recovery (PITR) >> >> o Allow point-in-time recovery to archive partially filled >> write-ahead logs [pitr] >> >> Currently only full WAL files are archived. This means >> that the >> most recent transactions aren't available for recovery >> in case >> of a disk failure. This could be triggered by a user >> command or >> a timer. >> >> o Automatically force archiving of partially-filled WAL >> files when >> pg_stop_backup() is called or the server is stopped >> >> Doing this will allow administrators to know more >> easily when >> the archive contains all the files needed for point-in- >> time >> recovery. >> >> I will try to push to have them done for 8.2. >> >> --------------------------------------------------------------------- >> ------ >> >> Rick Gigger wrote: >>> I guess my email wasn't all that clear. I will try to rephrase. I >>> am moving from using the old style pg_dump for backups to using >>> incrementals and want to make sure I understand the process before I >>> go about writing a bunch of scritps. >>> >>> To me setting up incremental backup consists of the following >>> components: >>> >>> 1) Setting up the WAL archiving. This one is trivial. >>> 2) Doing physical dumps of the $PGDATA directory. This one is once >>> again trivial. >>> 3) Knowing which physical dumps are Good and Not Good. For a given >>> physical dump D there is are WAL archive files Dstart and Dend for >>> which you much have Dstart and Dend and all files in between. If >>> you >>> have all those files then the physical dump is Good. If you don't >>> have them then the dump is worthless to you. >>> 4) Knowing which dumps and which archive files can be deleted. This >>> depends on a number of factors. >>> a) How far back do you want to be able to do PITR >>> b) How much space do you have / want to use for PITR >>> c) Which physical dumps are Good and which are Not Good. (see #3) >>> >>> Now I think I have a pretty good plan here except for #3 (and so #4 >>> then also suffers). >>> >>> Just as an example lets say I'm not concerned so much with PITR as I >>> am recovering from a db crash. I've got all the backups files saved >>> to my backup db server so I can failover to it if my primary db >>> server dies. I just want to make sure I've got one physical dump >>> that is good. (This is not my actual situation but it >>> illustrated my >>> point better.) >>> >>> Now when I do a physical dump it is not a Good dump. That is I >>> don't >>> have the end archive file necessary to recover from that physical >>> dump. That is to say that when I call pg_backup_start() then copy >>> $PGDATA then call pg_backup_stop() postgres might be on say WAL >>> archive file #5. Once the physical dump is completed WAL archive >>> file #5 hasn't been archived yet. I only have up to #4. So if I >>> delete my old physical dumps and all I've got is this most recent >>> one >>> and my database crashes before #5 gets archived then I am hosed. I >>> have no good physical backups to start from. >>> >>> My main question is about the best way to figure out when a physical >>> dump is Good. >>> >>> One strategy is to always keep around lots of physical dumps. If >>> you >>> keep around 100 dumps you can be pretty sure that in the space of >>> time that those physical dumps take place that at least one WAL file >>> was archived. In fact if you keep 2 physical dumps you can be >>> fairly >>> certain of this. If not then you really need to space our your >>> dumps >>> more. >>> >>> Is this making sense at this point? >>> >>> The problem is that the WAL archiving is triggered by postgres and >>> the rate at which the db is updated. The physical dumps are >>> triggered by cron and on a purely time based schedule. So in theory >>> if you had the physical dumps happening once a day but for some odd >>> reason no one updated the database for 4 days then all of a sudden >>> you'd have 2 physical backups and neither of them are good. If >>> you're db crashes during that time you are hosed. >>> >>> Maybe I am arguing a point that is just stupid because this will >>> never happen in real life. But in that it is my backups system that >>> I will be using to recover from complete and total disaster I just >>> want to have all my bases covered. >>> >>> So my ideas on how to determine if a physical dump is Good are as >>> follows. >>> >>> 1) When you do the physical backup (after dumping the $PGDATA dir >>> but >>> before calling pg_stop_backup() ) determine the current WAL archive >>> file. Mark somewhere in the backed up physical dump the last file >>> needed for the dump to be considered good. Then your deletion >>> scripts can look at the WAL archive files you have and the last one >>> required for the dump to be Good and determine if the dump is >>> Good or >>> not. >>> >>> 2) After doing the physical dump but before calling pg_stop_backup() >>> just copy the current WAL file to the physical dump. If that file >>> later gets archived then the restore commands overwrites your >>> partially completed one so it doesn't hurt but you know that when >>> you >>> call pg_stop_backup() that that physical dump is good. (Is it ok to >>> copy the current WAL file while it is still in use?) >>> >>> Is anyone taking one of these or any other precautions to make sure >>> they've got a good physical dump or does everyone just keep a whole >>> bunch of dumps around, and then actually restore the dump to see if >>> it is good and if not go back to a previous dump? >>> >>> I hope that makes more sense. >>> >>> Thanks, >>> >>> Rick >>> >>> On Jan 27, 2006, at 3:33 AM, Richard Huxton wrote: >>> >>>> Rick Gigger wrote: >>>>> Um, no you didn't read my email at all. I am aware of all of that >>>>> and it is clearly outlined in the docs. My email was about a >>>>> specific detail in the process. Please read it if you want to >>>>> know what my actual question was. >>>> >>>> I'm not sure your email is quite right as regards the process. You >>>> need: >>>> 1. the filesystem backup >>>> 2. the WAL file indicated in the history-file >>>> 3. all the WAL files later than that >>>> to get up to "now". >>>> >>>> If you don't want to replay up to "now" then you will not need some >>>> of the more recent WAL files. You can't afford to throw them away >>>> though since you've got a rolling backup system running and the >>>> whole point is so you can recover to any point you like. >>>> >>>> You can however throw away any WAL files older than that indicated >>>> in the history file for your current filesystem-backup. You can >>>> then only restore from that point in time forward. >>>> >>>> There is no "last one" in the WAL set unless you know the time you >>>> want to restore to. Indeed, the "last one" might not be "full" yet >>>> and therefore archived if you want to restore to 10 seconds ago. >>>> >>>> Or am I mis-understanding your email too? >>>> >>>> -- >>>> Richard Huxton >>>> Archonet Ltd >>>> >>> >>> >>> ---------------------------(end of >>> broadcast)--------------------------- >>> TIP 4: Have you searched our list archives? >>> >>> http://archives.postgresql.org >>> >> >> -- >> Bruce Momjian | http://candle.pha.pa.us >> pgman@candle.pha.pa.us | (610) 359-1001 >> + If your life is a hard drive, | 13 Roberts Road >> + Christ can be your backup. | Newtown Square, >> Pennsylvania 19073 >> > > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >
pgsql-general by date: