Thread: PITR backup history files with identical 2nd part file names

Hello

After upgrading to 8.3.9 and moving some of our PostgreSQL clusters to a
new server yesterday, we experienced a strange thing this past night.

All PITR backup history files created when running a PITR base backup on
all PostgreSQL clusters running on this new server (at different hours
during the night) got an identical 2nd part of the file name.

<24 digits>.00000020.backup
e.g. 000000020000020800000060.00000020.backup

All of them got 00000020.backup as the 2nd part of the file name.

After checking old logfiles from other servers, we can say that this is
the first time this has happened. The values used to be very different
each time we ran a PITR base backup.

We found this because of a 'bug' in the script we use to create the PITR
base backup. This script could not work out which WAL files to delete
after one of the base backups had finished, because 00000020 could be
found in all WAL file names in one of the clusters. This 'bug' has
already been fixed in our script.

After some more testing, it looks like we get the same 2nd part of the
file name *every time* we run a PITR base backup, i.e. '.00000020.backup'.
This happens on 10 different PostgreSQL clusters running on the same
server.

Is it normal to get the same 2nd part of the file name all the time? How
is this value generated?

This behavior is strange and I wonder if there is anything wrong with
this new server. Everything else looks ok and works without problems.

Thanks in advance.

--
Rafael Martinez, <r.m.guerrero@usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/

Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> After upgrading to 8.3.9 and moving some of our PostgreSQL clusters to a
> new server yesterday, we have experienced a strange thing this past night.

> All PITR backup history files created when running a PITR base backup on
> all PostgreSQL clusters running in this new server (at different hours
> during the night) got an identical 2nd part file name.

> <24 digits>.00000020.backup e.g.000000020000020800000060.00000020.backup

I think this is normal behavior now, if you have an unloaded server.
pg_start_backup now forces a segment switch, so if nothing much else is
happening it's quite likely that the recorded start point will be the
beginning of the WAL segment (plus the page header size).

> Is it normal to get the same 2nd part of the file name all the time? How
> is this value generated?

It's just the current offset within the current WAL segment.

regards, tom lane
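
Editor's note: the naming scheme discussed above can be sketched as a small parser. This is only an illustration in Python, not code from PostgreSQL or from the script mentioned in the thread. The pattern (24 hex digits, a dot, 8 hex digits, ".backup") is taken from the example file name in the thread; the names parse_backup_history_name and BackupHistoryName are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed pattern, based on the example in the thread:
# <24 hex digits>.<8 hex digits>.backup
_HISTORY_RE = re.compile(r"^([0-9A-F]{24})\.([0-9A-F]{8})\.backup$")


class BackupHistoryName(NamedTuple):
    wal_segment: str  # name of the WAL segment file in which the backup started
    offset: int       # byte offset of the start point within that segment


def parse_backup_history_name(filename: str) -> BackupHistoryName:
    """Split a backup history file name into its two parts."""
    m = _HISTORY_RE.match(filename)
    if m is None:
        raise ValueError(f"not a backup history file name: {filename!r}")
    # The 2nd part is a hexadecimal offset, so convert it with base 16.
    return BackupHistoryName(m.group(1), int(m.group(2), 16))


info = parse_backup_history_name("000000020000020800000060.00000020.backup")
print(info.wal_segment, info.offset)  # prints: 000000020000020800000060 32
```

Read this way, '.00000020' is hexadecimal 0x20, i.e. a start point 32 bytes into the segment, which fits the explanation of "beginning of the WAL segment (plus the page header size)".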

Tom Lane wrote:
> Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
>
>> All PITR backup history files created when running a PITR base backup on
>> all PostgreSQL clusters running in this new server (at different hours
>> during the night) got an identical 2nd part file name.
>
>> <24 digits>.00000020.backup e.g.000000020000020800000060.00000020.backup
>
> I think this is normal behavior now, if you have an unloaded server.
> pg_start_backup now forces a segment switch, so if nothing much else is
> happening it's quite likely that the recorded start point will be the
> beginning of the WAL segment (plus the page header size).
>

The strange thing is that a lot is happening. These clusters generate
several hundred WAL files a day.

How is it possible to get the same value *every time* we run a base
backup, with many WAL files generated between runs and on different
clusters?

I trust what you say on the subject :-) It is only that in all the
years we have been using PITR, we have never seen identical values in
the 2nd part of the backup history file name (not one).

--
Rafael Martinez, <r.m.guerrero@usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/

Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> Tom Lane wrote:
>> I think this is normal behavior now, if you have an unloaded server.
>> pg_start_backup now forces a segment switch, so if nothing much else is
>> happening it's quite likely that the recorded start point will be the
>> beginning of the WAL segment (plus the page header size).

> The strange thing is that a lot is happening. These clusters generate
> several hundred WAL files a day.

Well, by "loaded server" I meant "something sufficiently busy to generate
another WAL record in the extremely narrow time window between when
pg_start_backup advances the WAL pointer and when it copies the WAL
pointer". It might even be that those two things happen within a single
acquisition of WALInsertLock and thus there isn't any window at all ---
I didn't dig into the code closely enough to be sure about that.

> I trust what you say on the subject :-) .... is only that in all the
> years we have been using PITR, we have never seen identical values in
> the 2nd part of the backup history file name (not one)

Well, it was a pretty recent change that made pg_start_backup force a
segment switch. Before that you'd have seen values ranging throughout
the size of a WAL segment.

regards, tom lane
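
Editor's note: since the offset part of the name can now be the same for every backup, a cleanup script cannot rely on it to tell files apart, and a plain substring search for it can match ordinary WAL file names. A minimal sketch of one alternative follows, in Python. It is not the script from the thread, which was never posted. It assumes that WAL segment names are 24 uppercase hex digits whose first 8 digits identify the timeline, and that within one timeline the names sort in WAL order. The function name wal_files_before_backup is hypothetical, and the result is only a list of candidates, not a statement about what is safe to delete.

```python
import re

# Assumed naming, based on the example in the thread.
_WAL_RE = re.compile(r"^[0-9A-F]{24}$")
_HISTORY_RE = re.compile(r"^([0-9A-F]{24})\.[0-9A-F]{8}\.backup$")


def wal_files_before_backup(archive_listing, history_name):
    """Return WAL segment names that sort before the backup's start segment.

    Only the 24-digit segment name is compared; the offset part of the
    history file name is ignored on purpose.
    """
    m = _HISTORY_RE.match(history_name)
    if m is None:
        raise ValueError(f"not a backup history file name: {history_name!r}")
    start_segment = m.group(1)
    timeline = start_segment[:8]
    return sorted(
        name
        for name in archive_listing
        if _WAL_RE.match(name)          # skip .backup files and anything else
        and name[:8] == timeline        # assumption: stay on one timeline
        and name < start_segment        # fixed-width hex sorts like numbers
    )


listing = [
    "000000020000020700000020",  # contains "00000020" but is just an old segment
    "00000002000002080000005E",
    "00000002000002080000005F",
    "000000020000020800000060",
    "000000020000020800000060.00000020.backup",
    "000000020000020800000061",
]
older = wal_files_before_backup(
    listing, "000000020000020800000060.00000020.backup"
)
print(older)
```

In this example the three segments before 000000020000020800000060 are returned, and the segment in which the backup started is kept along with everything after it.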