Thread: pg_basebackup issues
Greetings, I'm running postgresql-9.1.3 on a Linux-x86_64 (Fedora16, if it matters) system. I noticed the existence of pg_basebackup starting in 9.1, and figured I'd try it out and see if it would simplify our backup & management processes. I setup a test system (same OS & postgresql version as production) with a fairly recent snapshot of our production database, invoked it, and saw the following output: ###### # pg_basebackup -P -v -D backups -Ft -z -U postgres 135717206/135717230 kB (100%), 1/1 tablespace pg_basebackup: could not get WAL end position from server ###### I wasn't sure what that error meant, so after googling a bit, turns out that it really means that there were one or more files not owned by the postgres user (see http://serverfault.com/questions/312205/pg-basebackup-could-not-get-wal-end-position-from-server ). Sure enough, the file that wasn't owned by the postgres user was the backup tarball that pg_basebackup was creating, since I had been running it as root. That error is rather cryptic, and it would be helpful if it was improved to suggest the real cause of the failure. Anyway, lesson learned, I need to either invoke pg_basebackup as the same user that runs the database (or is specified with the -U parameter ?), or write the backup somewhere outside of the directory structure that is being backed up. I eventually also found the following entries in the postgresql server log: FATAL: could not open directory "./backups": Permission denied FATAL: archive member "backups/base.tar.gz" too large for tar format What concerns me is the 2nd fatal error. The tarball that pg_basebackup created before erroring out is about 12GB: 12393094165 base.tar.gz I wasn't aware of any 12GB file size limit for tar, so this is a bit of a mystery to me. Regardless, I'd be happy to try some other archiving strategy, but the man page for pg_basebackup suggests that there are only two formats, tar and basically just copying the filesystem. If I copied the filesystem, I'd still have to find some way to archive them for easy management (copying elsewhere, etc). Has anyone come up with a good strategy on how to deal with it? thanks
On Fri, Apr 20, 2012 at 19:51, Lonni J Friedman <netllama@gmail.com> wrote: > Greetings, > I'm running postgresql-9.1.3 on a Linux-x86_64 (Fedora16, if it > matters) system. I noticed the existence of pg_basebackup starting in > 9.1, and figured I'd try it out and see if it would simplify our > backup & management processes. I setup a test system (same OS & > postgresql version as production) with a fairly recent snapshot of our > production database, invoked it, and saw the following output: > ###### > # pg_basebackup -P -v -D backups -Ft -z -U postgres > 135717206/135717230 kB (100%), 1/1 tablespace > pg_basebackup: could not get WAL end position from server > ###### > > I wasn't sure what that error meant, so after googling a bit, turns > out that it really means that there were one or more files not owned > by the postgres user (see > http://serverfault.com/questions/312205/pg-basebackup-could-not-get-wal-end-position-from-server > ). Sure enough, the file that wasn't owned by the postgres user was > the backup tarball that pg_basebackup was creating, since I had been > running it as root. That error is rather cryptic, and it would be > helpful if it was improved to suggest the real cause of the failure. Yeah, the error message comes from the fact that the backend gives up, and the real message is in the backend log. We should try to do something about that. > Anyway, lesson learned, I need to either invoke pg_basebackup as the > same user that runs the database (or is specified with the -U > parameter ?), or write the backup somewhere outside of the directory > structure that is being backed up. > > I eventually also found the following entries in the postgresql server log: > FATAL: could not open directory "./backups": Permission denied > FATAL: archive member "backups/base.tar.gz" too large for tar format > > What concerns me is the 2nd fatal error. The tarball that > pg_basebackup created before erroring out is about 12GB: > 12393094165 base.tar.gz Are you actually storing your backup files *inside* the data directory? You really shouldn't do that, you're creating a cyclic dependency where each new backup will include the old one inside it... You should store the resulting backup file somewhere outside the data directory. > I wasn't aware of any 12GB file size limit for tar, so this is a bit > of a mystery to me. Regardless, I'd be happy to try some other > archiving strategy, but the man page for pg_basebackup suggests that > there are only two formats, tar and basically just copying the > filesystem. If I copied the filesystem, I'd still have to find some > way to archive them for easy management (copying elsewhere, etc). Has > anyone come up with a good strategy on how to deal with it? The max file size of a single flie inside a standard tar file is 8Gb, see e.g. http://en.wikipedia.org/wiki/Tar_(file_format). I think there are extensions that let you store bigger files, but since PostgreSQL will never create files that big it's not implemented in the basebackup system. Because again, the root of your problem seems to be that you are trying to store the resulting backup inside the data directory. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Apr 20, 2012 at 12:31 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Apr 20, 2012 at 19:51, Lonni J Friedman <netllama@gmail.com> wrote: >> Anyway, lesson learned, I need to either invoke pg_basebackup as the >> same user that runs the database (or is specified with the -U >> parameter ?), or write the backup somewhere outside of the directory >> structure that is being backed up. >> >> I eventually also found the following entries in the postgresql server log: >> FATAL: could not open directory "./backups": Permission denied >> FATAL: archive member "backups/base.tar.gz" too large for tar format >> >> What concerns me is the 2nd fatal error. The tarball that >> pg_basebackup created before erroring out is about 12GB: >> 12393094165 base.tar.gz > > Are you actually storing your backup files *inside* the data > directory? You really shouldn't do that, you're creating a cyclic > dependency where each new backup will include the old one inside it... > You should store the resulting backup file somewhere outside the data > directory. yea, in hindsight that was silly. i just saw a convenient 'backups' subdirectory and figured that'd be a nice logical place. > >> I wasn't aware of any 12GB file size limit for tar, so this is a bit >> of a mystery to me. Regardless, I'd be happy to try some other >> archiving strategy, but the man page for pg_basebackup suggests that >> there are only two formats, tar and basically just copying the >> filesystem. If I copied the filesystem, I'd still have to find some >> way to archive them for easy management (copying elsewhere, etc). Has >> anyone come up with a good strategy on how to deal with it? > > The max file size of a single flie inside a standard tar file is 8Gb, > see e.g. http://en.wikipedia.org/wiki/Tar_(file_format). > > I think there are extensions that let you store bigger files, but > since PostgreSQL will never create files that big it's not > implemented in the basebackup system. Because again, the root of your > problem seems to be that you are trying to store the resulting backup > inside the data directory. You're right, that was indeed the issue. I've redone the process using a location external to $PGDATA, and it completed successfully: $ pg_basebackup -P -v -D /tmp/backup -x -Ft -z -U postgres xlog start point: C6/64000020 135733616/135733616 kB (100%), 1/1 tablespace xlog end point: C6/640000A0 pg_basebackup: base backup completed So after running through this, I tried to use (restore) the backup that was generated. While everything appears to be working ok from a functional perspective, in the server log I saw the following: ###### LOG: creating missing WAL directory "pg_xlog/archive_status" LOG: database system was not properly shut down; automatic recovery in progress LOG: redo starts at C6/66000078 LOG: could not open file "pg_xlog/00000001000000C600000067" (log file 198, segment 103): No such file or directory LOG: redo done at C6/660000A0 FATAL: the database system is starting up LOG: autovacuum launcher started LOG: database system is ready to accept connections ##### Just to be clear, here's what I did after pg_basebackup had completed successfully: 0) shutdown postgresql gracefully, and verified that it was fully shutdown 1) moved $PGDATA to $PGDATA.old 2) created $PGDATA as postgres user 3) extracted the basebackup tarball as postgres user cd $PGDATA && tar xzvpf /tmp/backup/base.tar.gz 4) started postgresql up I would have expected that I wouldn't have gotten the 'not properly shutdown' warning, or the 'could not open file' warning by following this process. Am I doing something wrong? thanks