Thread: Performance issues during backup
Hi
We perform a nightly base backup of our production PostgreSQL instance. We have a script that basically puts the instance into backup mode, backs up (tar) the /Data directory, and then takes it out of backup mode.
Ie,
psql -c "SELECT pg_start_backup('${DATE}');"
tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
psql -c "SELECT pg_stop_backup();"
The size of our database is about 250GB and it usually takes about 1 hour to back up.
During this time we have performance issues where queries can take up to 15 seconds to return, when they normally take 2 to 3 seconds.
During this time (1:30am) usage is low (less than 10 users) on the system.
Has anyone experienced the same problem, and are there any suggestions on where to look to resolve it?
Thanks.
Dylan
Dylan Luong wrote:
> We perform a nightly base backup of our production PostgreSQL instance. We have a script that basically puts the instance
> into backup mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"
>
> The size of our database is about 250GB and it usually takes about 1 hour to back up.
> During this time we have performance issues where queries can take up to 15 seconds to return, when they normally take 2 to 3 seconds.
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experienced the same problem, and are there any suggestions on where to look to resolve it?
The "tar" is probably taking up too much I/O bandwidth.
Assuming this is Linux, you could run it with
ionice -c 2 -n 7 tar ...
or
ionice -c 3 tar ...
Of course then you can expect the backup to take more time.
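Applied to the script from the original post, that would look roughly like this (a sketch only, reusing the same variables; it assumes a Linux I/O scheduler such as CFQ/BFQ that actually honours I/O priorities):

ionice -c 2 -n 7 tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}

Note that this only lowers the priority of the tar reads; the split process writing into ${BACKUP_DIR} is not affected, so if the backup target is on the same disks you may want to run split under ionice as well.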
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Greetings,

* Dylan Luong (Dylan.Luong@unisa.edu.au) wrote:
> We perform a nightly base backup of our production PostgreSQL instance. We have a script that basically puts the instance into backup mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"

This doesn't do anything to ensure that the backup files were actually written out to disk, meaning that you might lose files or portions of files if the system crashed shortly after this completed.

You don't mention anything about archive_command or how you are handling WAL. Note that you must have the WAL generated between the start backup and the stop backup for the backup to be valid.

If you do have an archive_command and it looks anything like what you have for your backup command, you likely have the same issue there: the WAL isn't being synced out to disk, so you might lose some WAL on a crash and not be able to do PITR or possibly not be able to use a given backup.

You also don't mention anything about checking that you have all of the WAL needed between the start/stop before considering the backup valid.

I'd strongly suggest you look into a designed-for-PostgreSQL backup solution instead of trying to roll your own and fix all of these issues. There are a few of them out there that I'd suggest considering: pgBackRest being my favorite (but I'm biased), barman, or wal-e (maybe wal-g).

> The size of our database is about 250GB and it usually takes about 1 hour to back up.
> During this time we have performance issues where queries can take up to 15 seconds to return, when they normally take 2 to 3 seconds.
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experienced the same problem, and are there any suggestions on where to look to resolve it?

Some of the backup tools for PostgreSQL support throttling to slow down the backup, but my recommendation here would be to stand up a replica and have the bulk of the data pulled from the replica for a given backup instead of doing it from the primary. As mentioned down-thread, you're likely saturating either your I/O bandwidth or your network bandwidth (if you're going to an NFS mount or similar).

Thanks!

Stephen
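For illustration only, if a hand-rolled archive_command is kept, the kind of "sync before reporting success" described above looks roughly like the sketch below; the script name, the /backup/wal path, and the use of GNU coreutils sync with a file argument (coreutils 8.24 or later) are assumptions, not part of the recommendation, which was to use a purpose-built tool instead:

#!/bin/bash
# hypothetical /usr/local/bin/archive_wal.sh, wired into postgresql.conf as:
#   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
set -e
WAL_PATH="$1"                                     # %p: path of the WAL segment
WAL_NAME="$2"                                     # %f: name of the WAL segment
ARCHIVE_DIR="/backup/wal"                         # assumed archive location
test ! -f "${ARCHIVE_DIR}/${WAL_NAME}"            # never overwrite an existing segment
cp "${WAL_PATH}" "${ARCHIVE_DIR}/${WAL_NAME}.tmp"
sync "${ARCHIVE_DIR}/${WAL_NAME}.tmp"             # flush the copy to disk
mv "${ARCHIVE_DIR}/${WAL_NAME}.tmp" "${ARCHIVE_DIR}/${WAL_NAME}"
sync "${ARCHIVE_DIR}"                             # and the directory entry for the rename

PostgreSQL only considers a WAL segment archived when the command exits 0 and retries it otherwise, which is why the sync has to happen before the script returns.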
Hi
Thanks for the reply.
I have trialed the ionice -c 2 -n 7 tar ... change to our backup script, and it appears to have helped, but not by much.
The affected queries are mostly update/delete/insert queries. Could pg_start_backup be causing locking of some sort?
Regards
Dylan
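One way to check this directly (a generic diagnostic, not something suggested elsewhere in the thread; the wait_event columns assume PostgreSQL 9.6 or later) is to look at pg_stat_activity while the backup is running:

psql -c "SELECT pid, state, wait_event_type, wait_event,
                now() - query_start AS runtime, query
         FROM pg_stat_activity
         WHERE state <> 'idle'
         ORDER BY runtime DESC;"

If wait_event_type shows Lock for the slow statements, something really is blocking them; if not, that points back at I/O contention rather than pg_start_backup.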
From: Rene Romero Benavides [mailto:rene.romero.b@gmail.com]
Sent: Wednesday, 21 February 2018 1:37 AM
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Dylan Luong <Dylan.Luong@unisa.edu.au>; pgsql-general@lists.postgresql.org
Subject: Re: Performance issues during backup
What about sending the backup to a different server? through ssh / rsync or something, that would save lots of IO activity
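For example (a sketch only: "backuphost" and /backups are placeholders, and it assumes key-based ssh from the database server), the existing pipeline could write its output on the remote machine instead of the local disks:

psql -c "SELECT pg_start_backup('${DATE}');"
tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | ssh backuphost "split -d -b $TAR_SPLIT_SIZE - /backups/${BACKUP_NAME}"
psql -c "SELECT pg_stop_backup();"

The reads of the data directory still happen locally, so this mainly helps when the backup target shares disks or bandwidth with the database.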
2018-02-20 2:02 GMT-06:00 Laurenz Albe <laurenz.albe@cybertec.at>:
Dylan Luong wrote:
> We perform a nightly base backup of our production PostgreSQL instance. We have a script that basically puts the instance
> into backup mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"
>
> The size of our database is about 250GB and it usually takes about 1 hour to back up.
> During this time we have performance issues where queries can take up to 15 seconds to return, when they normally take 2 to 3 seconds.
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experienced the same problem, and are there any suggestions on where to look to resolve it?
The "tar" is probably taking up too much I/O bandwidth.
Assuming this is Linux, you could run it with
ionice -c 2 -n 7 tar ...
or
ionice -c 3 tar ...
Of course then you can expect the backup to take more time.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
--
Genius is 1% inspiration and 99% perspiration.
Thomas Alva Edison
http://pglearn.blogspot.mx/
On Fri, 23 Feb 2018 01:46:22 +0000
Dylan Luong <Dylan.Luong@unisa.edu.au> wrote:

> Hi
> Thanks for the reply.
> I have trialed the ionice -c 2 -n 7 tar ... change to our backup script, and it appears to have helped, but not by much.

If "ionice -c 2 -n 7" helped some, maybe try "ionice -c 3" to set the IO priority to the lowest possible level.

However, I've used this technique many times on heavily loaded systems without issue. If you're having to squirm this much to avoid problems, you probably need to get better hardware, or investigate the possibility that your hardware is faulty or that there is some other IO-related issue.

> The affected queries are mostly update/delete/insert queries. Could pg_start_backup be causing locking of some sort?

Not in my experience. And the fact that they are write queries having trouble makes me theorize that you're saturating the write capacity of your disks.

--
Bill Moran
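If the disks really are write-saturated, one concrete way to combine the two earlier suggestions (throttling the backup, and taking it from a replica as Stephen proposed) is pg_basebackup with its built-in rate limit. This is a sketch only; the standby host name, replication user, target directory, and the 50M rate are assumed values:

# -Ft -z        tar-format, gzip-compressed output
# -X stream     also fetches the WAL needed to make the backup consistent
# --max-rate    caps the transfer rate so the backup cannot saturate the I/O
# -P            reports progress
pg_basebackup -h standby-host -U replication_user \
    -D /backup/base/"$(date +%Y%m%d)" \
    -Ft -z -X stream --max-rate=50M -P

Because pg_basebackup issues the start/stop backup calls itself and collects the required WAL, it also replaces most of the hand-rolled pieces discussed earlier in the thread.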