Thread: [GENERAL] Associating a basebackup and it's .backup WAL file
Hi Given a basebackup base.tar.gz and an archive of WAL files, is there any way to find out which .backup WAL file is associated with the basebackup from command line? My use case is for a retention policy bash script which: -deletes all basebackups older than X days -runs pg_archivecleanup for the oldest basebackup I just don't know how to find out which WAL to feed to pg_archivecleanup at this point.
On Thu, Jul 13, 2017 at 10:30 AM, cen <imbacen@gmail.com> wrote: > Given a basebackup base.tar.gz and an archive of WAL files, is there any way > to find out which .backup WAL file is associated with the basebackup from > command line? Not from what Postgres ships directly. Without any custom meta data save with each one of your backups, say something that you write after calling pg_basebackup, you would need to untar base.tar to look for the backup_label file. > My use case is for a retention policy bash script which: > -deletes all basebackups older than X days > -runs pg_archivecleanup for the oldest basebackup > > I just don't know how to find out which WAL to feed to pg_archivecleanup at > this point. Recalling something I know about, pg_rman uses its own meta data to do this decision making with dedicated folder names that use a structure and names based on timestamps, and this meta data is written and saved when each backup is taken. This saves future lookups at all tarballs when doing cleanup of past backups. I am not sure about the more popular barman and pgBackrest since I know them less, but I would imagine they handle retention policies similarly. -- Michael
That is really unfortunate. It seems it would be a nice feature for pg_basebackup to simply create a .metadata file in basebackup output directory or something along those lines.
Non tarballed/compressed basebackup is fine since I can read the label, but most people probably want to always compress. I'll probably try to get the WAL file by getting last modified .backup right after basebackup invoke.
I also found another "automation" problem looking at PITR recovery documentation, specifically:
"If you have unarchived WAL segment files that you saved in step 2, copy them into pg_xlog/. (It is best to copy them, not move them, so you still have the unmodified files if a problem occurs and you have to start over.)"
Seems like a lot of manual work to me, to automate it I'd basically have to diff both directories and then copy only the newest differences over to the recovery. So far I was unable to find a supersecret git repo with bash scripts accomplishing this tasks which is surprising.
On Thu, Jul 13, 2017 at 10:30 AM, cen <imbacen@gmail.com> wrote:Given a basebackup base.tar.gz and an archive of WAL files, is there any way to find out which .backup WAL file is associated with the basebackup from command line?Not from what Postgres ships directly. Without any custom meta data save with each one of your backups, say something that you write after calling pg_basebackup, you would need to untar base.tar to look for the backup_label file.My use case is for a retention policy bash script which: -deletes all basebackups older than X days -runs pg_archivecleanup for the oldest basebackup I just don't know how to find out which WAL to feed to pg_archivecleanup at this point.Recalling something I know about, pg_rman uses its own meta data to do this decision making with dedicated folder names that use a structure and names based on timestamps, and this meta data is written and saved when each backup is taken. This saves future lookups at all tarballs when doing cleanup of past backups. I am not sure about the more popular barman and pgBackrest since I know them less, but I would imagine they handle retention policies similarly.
Greetings, * cen (imbacen@gmail.com) wrote: > Seems like a lot of manual work to me, to automate it I'd basically > have to diff both directories and then copy only the newest > differences over to the recovery. So far I was unable to find a > supersecret git repo with bash scripts accomplishing this tasks > which is surprising. I would hazard to suggest that's because people have realized that bash scripting backup and recovery of PostgreSQL, and ensuring that all of the appropriate precautions are taken and checks are done, is next to impossible; were it to be done the scripts would be more complex and challenging to use than any of the existing solutions. I'd strongly suggest you consider one of the maintained backup solutions that have already been written instead of inventing yet another one. Thanks! Stephen