Re: block-level incremental backup - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: block-level incremental backup |
Date | |
Msg-id | CAA4eK1JeaiGQCSA4QsokRUmDxqBq9PRcC8CSem0GsvKQ9S5JFA@mail.gmail.com |
In response to | Re: block-level incremental backup (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: block-level incremental backup |
List | pgsql-hackers |
On Mon, Sep 16, 2019 at 7:22 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Sep 12, 2019 at 9:13 AM Jeevan Chalke
> <jeevan.chalke@enterprisedb.com> wrote:
> > I had a look over this issue and observed that when the new database is
> > created, the catalog files are copied as-is into the new directory
> > corresponding to a newly created database. And as they are just copied,
> > the LSN on those pages are not changed. Due to this incremental backup
> > thinks that its an existing file and thus do not copy the blocks from
> > these new files, leading to the failure.
>
> *facepalm*
>
> Well, this shoots a pretty big hole in my design for this feature. I
> don't know why I didn't think of this when I wrote out that design
> originally. Ugh.
>
> Unless we change the way that CREATE DATABASE and any similar
> operations work so that they always stamp pages with new LSNs, I think
> we have to give up on the idea of being able to take an incremental
> backup by just specifying an LSN.
>

This seems to be a blocking problem for the LSN-based design. Can we
think of using the file creation time? Basically, if the file creation
time is later than the backup_label's "START TIME:", then include that
file entirely. I think one big point against this is clock skew, for
example if somebody tinkers with the clock. Also, this can cover cases
like the one Jeevan has pointed out, but might not cover the other cases
which we found problematic.

> We'll instead need to get a list of
> files from the server first, and then request the entirety of any that
> we don't have, plus the changed blocks from the ones that we do have.
> I guess that will make Stephen happy, since it's more like the design
> he wanted originally (and should generalize more simply to parallel
> backup).
>
> One question I have is: is there any scenario in which an existing
> page gets modified after the full backup and before the incremental
> backup but does not end up with an LSN that follows the full backup's
> start LSN?
>

I think the operations covered by the WAL flag XLR_SPECIAL_REL_UPDATE
will have similar problems.

One related point is how incremental backups handle the case where
vacuum truncates the relation partially. Basically, with the current
patch/design, it doesn't appear that such information can be passed via
an incremental backup. I am not sure if this is a problem, but it would
be good if we could somehow handle it.

Won't operations that skip WAL and directly call heap_sync at the end
have a similar problem as well? Similarly, it is not clear whether
unlogged relations are handled in some way; if not, that could be
documented.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
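
For illustration only, the per-file decision discussed in this message can be sketched in Python. This is not code from the patch. The names FileInfo, plan_incremental and created_at, and the example paths, are hypothetical, and the sketch assumes that the server can report each block's page LSN and a file creation time. It combines the file-list rule quoted above (request any file the client does not have in full) with the creation-time check (take the whole file if it was created after the previous backup started), and falls back to page LSNs otherwise.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class FileInfo:
    """Hypothetical per-file metadata reported by the server."""
    path: str
    created_at: float      # assumed creation time, in epoch seconds
    page_lsns: List[int]   # page LSN of each block, in block order


def plan_incremental(
    server_files: List[FileInfo],
    prior_backup_paths: Set[str],
    start_lsn: int,
    backup_start_time: Optional[float] = None,
) -> Dict[str, List[int]]:
    """Return, for each file, the block numbers to copy."""
    plan: Dict[str, List[int]] = {}
    for f in server_files:
        all_blocks = list(range(len(f.page_lsns)))
        if f.path not in prior_backup_paths:
            # File-list rule: the client does not have this file, so
            # request all of it, whatever its page LSNs say.
            plan[f.path] = all_blocks
        elif backup_start_time is not None and f.created_at > backup_start_time:
            # Creation-time rule: only as reliable as the system clock.
            plan[f.path] = all_blocks
        else:
            # LSN rule; >= is an assumption chosen to err on the side
            # of copying a block that did not strictly need it.
            plan[f.path] = [
                i for i, lsn in enumerate(f.page_lsns) if lsn >= start_lsn
            ]
    return plan


# A catalog file copied by CREATE DATABASE keeps its old page LSNs, so
# the LSN rule alone would copy nothing from it.
existing = FileInfo("base/1/1259", 100.0, [10, 250, 30])
copied = FileInfo("base/16400/1259", 500.0, [10, 20, 30])
plan = plan_incremental([existing, copied], {"base/1/1259"}, start_lsn=200)
print(plan)
```

In this example the existing file contributes only block 1, while the newly copied file is taken in full even though none of its page LSNs reach the threshold. The sketch does not address the other cases raised above: relation truncation by vacuum, operations that skip WAL, and unlogged relations.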