Re: Updated backup APIs for non-exclusive backups - Mailing list pgsql-hackers
From | Laurenz Albe |
---|---|
Subject | Re: Updated backup APIs for non-exclusive backups |
Date | |
Msg-id | f4757edf28bdcec95d54ec0eb7a8b2afe62191c6.camel@cybertec.at |
In response to | Re: Updated backup APIs for non-exclusive backups (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Updated backup APIs for non-exclusive backups
Re: Updated backup APIs for non-exclusive backups |
List | pgsql-hackers |
Stephen Frost wrote:
> > > Seeing it often doesn't make it a good solution.  Running just
> > > pre-backup and post-backup scripts and copying the filesystem isn't
> > > enough to perform an online PostgreSQL backup- the WAL needs to be
> > > collected as well, and you need to make sure that you have all of the
> > > WAL before the backup can be considered complete.
> >
> > Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
> > So this is not a problem.
>
> That doesn’t actually make sure you have all of the WAL reliably saved
> across the backup, it just cares what archive command returns, which is
> sadly often a bad thing to depend on.  I certainly wouldn’t rely on only
> that for any system I cared about.

If you write a bad archive_command, you have a problem.
But that is quite unrelated to the problem at hand, if I am not mistaken.

> > > On restore, you're
> > > going to need to create a recovery.conf (at least in released versions)
> > > which provides a restore command (needed even in HEAD today) to get the
> > > old WAL, so having to also create the backup_label file shouldn't be
> > > that difficult.
> >
> > You write "recovery.conf" upon recovery, when you have the restored
> > backup, so you have it on a file system.  No problem adding a file then.
> >
> > This is entirely different from adding a "backup_label" file to
> > a backup that has been taken by a backup software in some arbitrary
> > format in some arbitrary location (think snapshot).
>
> There isn’t any need to write the backup label before you restore the database,
> just as you write recovery.conf then.

Granted.
But it is pretty convenient, and writing it to the data directory right away
is a good thing on top, because it reduces the danger of inadvertently
starting the backup without recovery.

> > > Lastly, if you really want, you can extract out the data from
> > > pg_stop_backup in whatever your post-backup script is.
> >
> > Come on, now.
> > You usually use backup techniques like that because you can't get
> > your large database backed up in the available time window otherwise.
>
> I’m not following what you’re trying to get at here, why can’t you extract
> the data for the backup label from pg_stop_backup..?  Certainly other tools
> do, even ones that do extremely fast parallel backups.. the two are
> completely independent.
>
> Did you think I meant pg_basebackup..?  I certainly didn’t.

Oh yes, I misunderstood.  Sorry.

Yes, you can come up with a post-backup script that somehow
communicates with your pre-backup script to get the information,
but it sure is inconvenient.  Simplicity is good in backup solutions,
because complicated things tend to break more easily.

> > I thought our goal is to provide convenient backup methods...
>
> Correctness would be first and having a broken system because of a crash during a backup isn’t correct.

"Not starting up without manual intervention" is not actually broken...

> > But what's wrong with retaining the exclusive backup method and just
> > sticking a big "Warning: this may cause a restart to fail after a crash"
> > on it?  That sure wouldn't be unsafe.
>
> I haven’t seen anyone pushing for it to be removed immediately, but users should
> not use it and newcomers would be much better served by using the non exclusive api.
> There is a reason it was deprecated and it’s because it simply isn’t a good API.
> Coming along a couple years later and saying that it’s a good API while ignoring
> the issues that it has doesn’t change that.

I don't think I'm ignoring the issues, I just think there is a valid
use case for the exclusive backup API, with all its caveats.

Of course I'm not arguing on behalf of organizations running lots of databases
for whom manual intervention after a crash is unacceptable.

I'm arguing on behalf of users that run a few databases, want a simple backup
solution and are ready to deal with the shortcomings.
But I will gladly accept defeat in this matter, I just needed to vent my opinion.

Yours,
Laurenz Albe
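
For readers following the thread: the step Stephen describes, taking the label data returned by pg_stop_backup and storing it with the backup, could look roughly like the sketch below. It is an illustration only and not code from the thread; the helper name `write_backup_files` and the UTF-8 encoding are assumptions. It relies on the documented behaviour that the non-exclusive call `pg_stop_backup(false, true)` returns one row with the columns `lsn`, `labelfile` and `spcmapfile`.

```python
import os


def write_backup_files(backup_dir, labelfile, spcmapfile):
    """Store the text returned by a non-exclusive pg_stop_backup() in a backup.

    Hypothetical helper, not part of PostgreSQL: 'labelfile' and 'spcmapfile'
    stand for the second and third columns of the pg_stop_backup() result row.
    Returns the list of paths that were written.
    """
    written = []

    # The label is always written, as "backup_label" in the backup's root.
    targets = [("backup_label", labelfile)]
    # The tablespace map is only written when the field is not empty.
    if spcmapfile:
        targets.append(("tablespace_map", spcmapfile))

    for name, content in targets:
        path = os.path.join(backup_dir, name)
        # Binary mode avoids newline translation, so the content is stored
        # unmodified. UTF-8 is an assumption about the client encoding.
        with open(path, "wb") as f:
            f.write(content.encode("utf-8"))
        written.append(path)
    return written
```

In a real script the two strings would come from `SELECT * FROM pg_stop_backup(false, true)`, issued on the same connection that called `pg_start_backup`. Keeping that connection open between the pre-backup and post-backup steps is the inconvenience discussed above.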