Thread: Confused about how to enable backups (e.g. Write Ahead Log).
I have been attempting, without success, to create a Write Ahead Log (WAL) backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. It might be that I am doing the wrong thing and my method will never work. In which case: what is the best way to achieve my 'Goal' (below)? Goal: I want to backup a database every 15 minutes. Ideally, I want to backup only if the database has changed (perhaps this isn't possible and backups happen whether or not the database has changed)? Attempts: I edited /etc/postgresql/10/main/postgresql.conf as follows: wal_level = replica archive_mode = on archive_command = 'cp %p /test/%f' archive_timeout = 900 This should - in my opinion - every 15 minutes backup (cp) the database (referenced by %p) to the /test/ directory using the filename (referenced by %f) of the database. I then: /etc/init.d/postgresql stop /etc/init.d/postgresql start And checked that PostgreSQL is running with /etc/init.d/postgresql status: postgresql.service - PostgreSQL RDBMS Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled) Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) Main PID: 60534 (code=exited, status=0/SUCCESS) Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset devices.list: Operation not permitted Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. However, the /test/ directory did not show any backup files I then read that I needed to use pg_basebackup to make an initial backup (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup. This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup but I'm not sure how this relates to or effects (if it does) the Write Ahead Log. I still don't have any backups in /test/ (or anywhere). Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether using a modification of this method or something different (but superior). Many thanks!
Hi, Indeed Continuous Archiving and Point-in-Time Recovery (https://www.postgresql.org/docs/10/continuous-archiving.html) is the best way to achieve your goal. Have you checked the archiver process ? ('ps -ef |grep postgres |grep archiver'). It should report success or fail. You could also check that with the system view : 'SELECT * FROM pg_stat_archiver;'. Best option is to have a look at the PostgreSQL logs. Is your /test directory writable by the postgres linux user ? To help you, there's a few PITR tools that handles backups, restores and even backup retention. Have a look at https://pgbackrest.org/ ;-) Kind regards, On 2/21/20 12:18 PM, mimble9@danwin1210.me wrote: > I have been attempting, without success, to create a Write Ahead Log (WAL) > backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. > > It might be that I am doing the wrong thing and my method will never work. > In which case: what is the best way to achieve my 'Goal' (below)? > > Goal: > > I want to backup a database every 15 minutes. Ideally, I want to backup > only if the database has changed (perhaps this isn't possible and backups > happen whether or not the database has changed)? > > Attempts: > > I edited /etc/postgresql/10/main/postgresql.conf as follows: > > wal_level = replica > archive_mode = on > archive_command = 'cp %p /test/%f' > archive_timeout = 900 > > This should - in my opinion - every 15 minutes backup (cp) the database > (referenced by %p) to the /test/ directory using the filename (referenced > by %f) of the database. > > I then: > > /etc/init.d/postgresql stop > /etc/init.d/postgresql start > > And checked that PostgreSQL is running with /etc/init.d/postgresql status: > > postgresql.service - PostgreSQL RDBMS > Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor > preset: enabled) > Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago > Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) > Main PID: 60534 (code=exited, status=0/SUCCESS) > > Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset > devices.list: Operation not permitted > Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... > Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. > > However, the /test/ directory did not show any backup files > > I then read that I needed to use pg_basebackup to make an initial backup > (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup. > This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup > but I'm not sure how this relates to or effects (if it does) the Write > Ahead Log. > > I still don't have any backups in /test/ (or anywhere). > > Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether > using a modification of this method or something different (but superior). > Many thanks! > > > > > -- Stefan FERCOT http://dalibo.com - http://dalibo.org
Greetings, * mimble9@danwin1210.me (mimble9@danwin1210.me) wrote: > I have been attempting, without success, to create a Write Ahead Log (WAL) > backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. There isn't really any such thing as a "WAL" backup. The WAL is where PG writes before it writes to the actual heap, but you don't typically keep WAL since the start of the database and so you need to have a copy of the heap files to start from and then you use WAL to replay up to a specific point in time. > It might be that I am doing the wrong thing and my method will never work. The method you've outlined below won't work, no. > In which case: what is the best way to achieve my 'Goal' (below)? There's a few options. > Goal: > > I want to backup a database every 15 minutes. Ideally, I want to backup > only if the database has changed (perhaps this isn't possible and backups > happen whether or not the database has changed)? This is what's known as an 'incremental backup' and there's a few tools out there which support that- but I suggest you first consider if your goal is actually to backup every 15 minutes, or if your goal is to be able to minimize data loss to, at most, 15 minutes, or if your goal is to recover to any specific point in time. The reason this matters is that PG supports Point-in-time-Recovery (PITR), by using file-level backups combined with the write ahead log. This means that you can restore a backup of PG and then play forward to *any* point in time- 9:00, 9:15, 9:17, 10:20, whatever. Taking more frequent incremental backups will mean that there is less WAL to have to go through to get to a specific point in time, but unless you have a very write-heavy system, it won't take long to play through an hour of WAL, so you might consider taking backups once an hour, or even less frequently. Reasonably common setups include daily incremental backups and then regular differential and/or full backups. > Attempts: > > I edited /etc/postgresql/10/main/postgresql.conf as follows: > > wal_level = replica > archive_mode = on > archive_command = 'cp %p /test/%f' > archive_timeout = 900 I strongly recommend against trying to write your own backup software for PG. In particular, 'cp' isn't really safe to use as an archive_command as it doesn't 'fsync' the WAL file after copying it, meaning you might be missing WAL if the system crashes. When using archive_command, hopefully with a proper tool, it can make sense to have an archive_timeout set, which will cause PG to kick out a WAL segment after that long, if it hasn't already, and there's been changes since the last WAL segment. This avoids the risk that a WAL segment sits with data in it for a long and then the system crashes in an unrecoverable way and those changes are then lost. Note that this primarily makes sense when you're actually archiving the WAL to a different system. PG, as part of each commit, will fsync the data to the WAL and so as long as the filesystem recovers, no data will be lost. > This should - in my opinion - every 15 minutes backup (cp) the database > (referenced by %p) to the /test/ directory using the filename (referenced > by %f) of the database. No, that's not how archive_command works at all. The archive_command simply copies the *WAL* file (each of which are 16MB by default) to another location (ideally, a remote one, where the file is fsync'd, prior to returning to PG). The database itself is *not* copied. > I then: > > /etc/init.d/postgresql stop > /etc/init.d/postgresql start > > And checked that PostgreSQL is running with /etc/init.d/postgresql status: > > postgresql.service - PostgreSQL RDBMS > Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor > preset: enabled) > Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago > Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) > Main PID: 60534 (code=exited, status=0/SUCCESS) > > Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset > devices.list: Operation not permitted > Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... > Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. > > However, the /test/ directory did not show any backup files You would just end up with WAL files in that directory, per the discussion above regarding the purpose of archive_command. > I then read that I needed to use pg_basebackup to make an initial backup > (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup. > This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup > but I'm not sure how this relates to or effects (if it does) the Write > Ahead Log. pg_basebackup is one tool to create a backup, but it doesn't (currently, at least) support incremental backups, so each backup is a full backup of the system. The WAL can be used to take that backup up to a specific point in time. > I still don't have any backups in /test/ (or anywhere). Based on what you wrote above, the contents of '/etc/backup' should actually be a backup of PG, with the necessary WAL to bring it up consistently. Note that it's absolutely essential that all of the WAL generated during a backup be kept, otherwise the backup is not valid. I strongly recommend using a tool which ensures that all WAL from a backup is verified to have been stored properly. Thanks, Stephen
Attachment
Thank you for your input, > > You could also check that with the system view : 'SELECT * FROM > pg_stat_archiver;'. > 5 | 000000010000000000000004 | 2020-02-21 01:29:56.77076+00 | 0 | | | 2020-02-21 00:31:12.898026+00 So there is some kind of archive but not one that is regularly updating. > Is your /test directory writable by the postgres linux user ? Permissions are: drwxr-xr-x 1 root root 278 Feb 21 01:50 test > > To help you, there's a few PITR tools that handles backups, restores and > even backup retention. Have a look at https://pgbackrest.org/ ;-) > I've had a look at the FAQ for this tool and it seems suitable. Have you used it? I like the step-by-step instructions plus screenshots of the results. And it uses the WAL. My impression is that this is a user-friendly way for novices to backup their databases hopefully without having the convoluted messes that I'm currently having editing postgresql.conf, etc. Is that correct? > Kind regards, > > > On 2/21/20 12:18 PM, mimble9@danwin1210.me wrote: > >> I have been attempting, without success, to create a Write Ahead Log >> (WAL) >> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. >> >> It might be that I am doing the wrong thing and my method will never >> work. In which case: what is the best way to achieve my 'Goal' (below)? >> >> >> Goal: >> >> >> I want to backup a database every 15 minutes. Ideally, I want to backup >> only if the database has changed (perhaps this isn't possible and >> backups happen whether or not the database has changed)? >> >> Attempts: >> >> >> I edited /etc/postgresql/10/main/postgresql.conf as follows: >> >> >> wal_level = replica archive_mode = on archive_command = 'cp %p /test/%f' >> archive_timeout = 900 >> >> This should - in my opinion - every 15 minutes backup (cp) the database >> (referenced by %p) to the /test/ directory using the filename >> (referenced >> by %f) of the database. >> >> I then: >> >> >> /etc/init.d/postgresql stop >> /etc/init.d/postgresql start >> >> >> And checked that PostgreSQL is running with /etc/init.d/postgresql >> status: >> >> >> postgresql.service - PostgreSQL RDBMS Loaded: loaded >> (/lib/systemd/system/postgresql.service; enabled; vendor >> preset: enabled) >> Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago >> Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) >> Main PID: 60534 (code=exited, status=0/SUCCESS) >> >> >> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset >> devices.list: Operation not permitted >> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... >> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. >> >> >> However, the /test/ directory did not show any backup files >> >> >> I then read that I needed to use pg_basebackup to make an initial >> backup (before editing postgresql.conf) so I tried pg_basebackup -D >> /etc/backup. >> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup >> but I'm not sure how this relates to or effects (if it does) the Write >> Ahead Log. >> >> >> I still don't have any backups in /test/ (or anywhere). >> >> >> Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether >> using a modification of this method or something different (but >> superior). Many thanks! >> >> >> >> >> >> > -- > Stefan FERCOT > http://dalibo.com - http://dalibo.org > > > > >
Hi, On 2/22/20 12:07 AM, mimble9@danwin1210.me wrote: > Thank you for your input, > >> You could also check that with the system view : 'SELECT * FROM >> pg_stat_archiver;'. >> > 5 | 000000010000000000000004 | 2020-02-21 01:29:56.77076+00 | 0 > | | | 2020-02-21 00:31:12.898026+00 > > > So there is some kind of archive but not one that is regularly updating. > >> Is your /test directory writable by the postgres linux user ? > Permissions are: drwxr-xr-x 1 root root 278 Feb 21 01:50 test That's a problem. /test has to be writable by the postgres linux user. -> "sudo -u postgres touch /test/some_file" should work. Below commands should solve that : chown postgres: /test chmod 750 /test Of course, /test isn't really suitable in long term to store backups. Don't let that stored locally, and give your directory a representative name like pgbackups or something like that. Furthermore, if the PostgreSQL cluster has a specific name or meaning (like an application name), try not to store files in the main directory of a mount point. It would be safer to use something like /pgbackups/app-or-cluster-representative-name. >> To help you, there's a few PITR tools that handles backups, restores and >> even backup retention. Have a look at https://pgbackrest.org/ ;-) >> > I've had a look at the FAQ for this tool and it seems suitable. Have you > used it? I like the step-by-step instructions plus screenshots of the > results. And it uses the WAL. > > My impression is that this is a user-friendly way for novices to backup > their databases hopefully without having the convoluted messes that I'm > currently having editing postgresql.conf, etc. Is that correct? > In fact, as Stephen explained, WAL "backup" (archives) isn't enough. Point-in-time recovery can be achieved by combining a file-system-level copy and WAL archiving. Then, re-apply WAL archives on top of the copy to restore the data to a specific target (time, transaction id,...). pg_basebackup is one way to take a copy. Then indeed, you need to handle WAL archives and pg_basebackup copies retention,... pgBackRest is a tool that will help you for that. It will provide a specific command for WAL archiving ( PostgreSQL archive_command) and commands to take "backups" (file-system-level copies). Those backups can be full, differential or even incremental. You'll still need to configure postgresql.conf to configure WAL archiving. pgBackRest will also help with the "restore" command. (And yes, I already used it a lot. I'm like a "super fan" : https://pgstef.github.io/pitr/) >> Kind regards, >> >> >> On 2/21/20 12:18 PM, mimble9@danwin1210.me wrote: >> >>> I have been attempting, without success, to create a Write Ahead Log >>> (WAL) >>> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. >>> >>> It might be that I am doing the wrong thing and my method will never >>> work. In which case: what is the best way to achieve my 'Goal' (below)? >>> >>> >>> Goal: >>> >>> >>> I want to backup a database every 15 minutes. Ideally, I want to backup >>> only if the database has changed (perhaps this isn't possible and >>> backups happen whether or not the database has changed)? >>> >>> Attempts: >>> >>> >>> I edited /etc/postgresql/10/main/postgresql.conf as follows: >>> >>> >>> wal_level = replica archive_mode = on archive_command = 'cp %p /test/%f' >>> archive_timeout = 900 >>> >>> This should - in my opinion - every 15 minutes backup (cp) the database >>> (referenced by %p) to the /test/ directory using the filename >>> (referenced >>> by %f) of the database. >>> >>> I then: >>> >>> >>> /etc/init.d/postgresql stop >>> /etc/init.d/postgresql start >>> >>> >>> And checked that PostgreSQL is running with /etc/init.d/postgresql >>> status: >>> >>> >>> postgresql.service - PostgreSQL RDBMS Loaded: loaded >>> (/lib/systemd/system/postgresql.service; enabled; vendor >>> preset: enabled) >>> Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago >>> Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) >>> Main PID: 60534 (code=exited, status=0/SUCCESS) >>> >>> >>> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset >>> devices.list: Operation not permitted >>> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... >>> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. >>> >>> >>> However, the /test/ directory did not show any backup files >>> >>> >>> I then read that I needed to use pg_basebackup to make an initial >>> backup (before editing postgresql.conf) so I tried pg_basebackup -D >>> /etc/backup. >>> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup >>> but I'm not sure how this relates to or effects (if it does) the Write >>> Ahead Log. >>> >>> >>> I still don't have any backups in /test/ (or anywhere). >>> >>> >>> Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether >>> using a modification of this method or something different (but >>> superior). Many thanks! >>> -- Stefan FERCOT http://dalibo.com - http://dalibo.org