Thread: [GENERAL] Using cp to back up a database?
v8.4.20
This is what the current backup script uses:
/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"
Should it use rsync or pg_dump instead?
Thanks
-- World Peace Through Nuclear Pacification
If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
From: <pgsql-general-owner@postgresql.org> on behalf of Ron Johnson <ron.l.johnson@cox.net>
Date: Monday, October 9, 2017 at 8:41 AM
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] Using cp to back up a database?
Hi,
v8.4.20
This is what the current backup script uses:
/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"
Should it use rsync or pg_dump instead?
Thanks
--
World Peace Through Nuclear Pacification
It is a matter of choice. What you are doing with the script is making a backup of the entire PostgreSQL data file directory. That includes all PostgreSQL system catalogs as well as user objects. A restore of your data directory would be faster than a full restore from pg_dump, since indexes and constraints would not have to be recreated. However, it does not give you the option to restore one or more selected objects.
pg_dump, on the other hand, gives you the flexibility of selecting what to back up and what to restore. FWIW, I much prefer pg_dump.
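As an illustration of that flexibility, a custom-format dump can be restored selectively. This is only a sketch, not anything from the thread -- the database name `mydb`, the table `orders`, and the paths are hypothetical placeholders:

```shell
# Take a custom-format (-Fc) logical dump of one database.
# "mydb" and the output path are placeholder names.
pg_dump -U postgres -Fc -f /backups/mydb.dump mydb

# Later, restore just one table from that dump into an existing database;
# its indexes and constraints are rebuilt as part of the restore.
pg_restore -U postgres -d mydb -t orders /backups/mydb.dump
```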
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

On 09/10/2017 16:51, Larry Rosenman wrote:
> If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.

In all the PG conferences I have been to, people scream: do not use pg_dump for backups :)
--
Achilleas Mantzios
IT DEV Lead, IT DEPT
Dynacom Tankers Mgmt
On Mon, Oct 9, 2017 at 11:09 PM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
> On 09/10/2017 16:51, Larry Rosenman wrote:
>> If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.
> In all pg conferences I have been, ppl scream : do not use pg_dump for backups :)

It depends on what you are trying to achieve; pg_dump can be fine for small-ish databases. Relying on both logical (pg_dump) and physical backups (base backups) brings more insurance in the face of a disaster.
--
Michael

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Sure I want a consistent database. Who doesn't?

But log shipping requires you to rsync /var/lib/pgsql/data to the remote server, and that's consistent, so why wouldn't rsync to a local directory also be consistent?
On 10/09/2017 08:51 AM, Larry Rosenman wrote:
If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.
On 09/10/2017 17:13, Michael Paquier wrote:
> On Mon, Oct 9, 2017 at 11:09 PM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
>> In all pg conferences I have been, ppl scream : do not use pg_dump for backups :)
> It depends on what you are trying to achieve, pg_dump can be fine for small-ish databases.

Basically yes, it depends on size. Where I work, a proper pg_dump would take ~ one week. Business can't wait for that long.
--
Achilleas Mantzios
IT DEV Lead, IT DEPT
Dynacom Tankers Mgmt
On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
> Hi,
> v8.4.20
> This is what the current backup script uses:
> /usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
> cp -r /var/lib/pgsql/data/* $dumpdir/data/
> /usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"

That's fine, as long as you have a WAL archive. Although I don't know what is "Incremental" about it.
From reading the docs, that field is just a label with no intrinsic meaning.
-- World Peace Through Nuclear Pacification
Ron:

Here is an explanation that may help a bit.

Your script is executing a PHYSICAL backup. A physical backup is simply a full copy of the cluster (instance) data directory ($PGDATA). A physical backup is your best option when you need to back up the cluster data as well as all configuration for the cluster. Essentially, if you had to rebuild the entire computer hosting the cluster, you could just reinstall the same version of Postgres, copy in the backed-up data directory, and the cluster would run exactly as it did before, with the same data. A physical backup is also necessary when the databases get very large.

In the backup script you posted, the 'pg_start_backup' and 'pg_stop_backup' commands fulfill two purposes. The first is to create a label for the point in time the backup was started - this is done by pg_start_backup. The second is to ensure that all WAL segments that have been written since the backup began have been safely archived. That is done by pg_stop_backup. This approach is necessary to accomplish an online physical backup.

As others have mentioned, pg_dump is a LOGICAL backup tool, similar to any SQL dump you've done with another DBMS. The pg_dump command will do a SQL dump to recreate everything within a single database. So, if you have multiple databases in your cluster, it's not the best option. pg_dumpall is the logical backup tool that will do a logical dump of all globals (schema + roles) along with all databases in the cluster. Because the pg_dump/pg_dumpall commands are not executing a physical backup, the pg_start_backup and pg_stop_backup commands do not apply.

As for times when you would elect to do a logical backup, as others have mentioned, this is the only valid option when you are restoring to a different version of Postgres. It is also a good option for backing up a single small database or several small databases. And, if for any reason the backup needs to be human-readable, this is the approach of choice as well.

Darren
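The online physical backup procedure described above can be sketched as a small script. This is a hedged sketch, not the poster's actual script: the label 'nightly' and the backup path are hypothetical, and it assumes WAL archiving is already configured so the copy can be made consistent at restore time:

```shell
#!/bin/sh
# Sketch of an online physical (base) backup; paths and label are placeholders.
BACKUP_DIR=/backups/base-$(date +%Y%m%d)
mkdir -p "$BACKUP_DIR"

# Enter backup mode; the label is arbitrary text with no intrinsic meaning.
psql -U postgres -c "SELECT pg_start_backup('nightly', true);" || exit 1

# Copy the data directory. rsync copes better than cp with files that
# change mid-copy, but the start/stop bracket plus archived WAL is what
# actually makes the copy restorable to a consistent state.
rsync -a /var/lib/pgsql/data/ "$BACKUP_DIR/"
status=$?

# Always leave backup mode, even if the copy failed.
psql -U postgres -c "SELECT pg_stop_backup();"
exit $status
```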
Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical backup using cp (as opposed to rsync)?
On Mon, Oct 9, 2017 at 1:19 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
> Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical backup using cp (as opposed to rsync)?

Yes -- however, you must configure WAL archiving first. If not, no backup tool -- cp, rsync, etc. -- will provide a good backup.

Oh, and BTW, the obligatory: you are on an ancient, EOL version of PG. Upgrade.

Make sure that these are set:
- archive_mode
- archive_command

Then, when you restore the backup, you need to create a recovery.conf and configure:
- restore_command
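Concretely, those settings look something like the following; the archive directory is a hypothetical example path, and the `test ! -f ... && cp ...` form is the style shown in the PostgreSQL docs (with the same plain-cp caveats raised later in this thread):

```
# postgresql.conf
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

# recovery.conf (placed in the data directory before starting the restore)
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
```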
This is good to know. Thanks.
-- World Peace Through Nuclear Pacification
Ron,

* Ron Johnson (ron.l.johnson@cox.net) wrote:
> Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical using cp (as opposed to rsync)?

Frankly, I'd say no. There's nothing to guarantee that the backup is actually sync'd out to disk. Further, you're probably in the same boat when it comes to WAL archiving (just using 'cp' there isn't good for the same reason), though that's arguably worse because once you return true from archive_command, that WAL file will be gone and you could end up with a hole in your WAL stream.

Next, this backup method has been deprecated because of issues with the backup_label file and what happens when the system crashes during a backup.

So, no, you shouldn't be just using 'cp', or 'rsync', or any of those similar, simple tools for doing a proper PG backup. Use a tool which has been developed specifically for PG, such as pgBackRest, barman, WAL-E, WAL-G, etc.

Thanks!

Stephen
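For reference, a purpose-built tool handles the start/stop bracket, WAL archiving, and syncing for you. A minimal sketch with pgBackRest, assuming a stanza named `main` has already been defined in pgbackrest.conf (the stanza name is a hypothetical placeholder):

```shell
# One-time initialization of the repository for the configured stanza.
pgbackrest --stanza=main stanza-create

# Take a full physical backup.
pgbackrest --stanza=main --type=full backup

# Confirm the backup (and its WAL range) is recorded.
pgbackrest --stanza=main info
```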