Re: [BUG] pg_basebackup from disconnected standby fails - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [BUG] pg_basebackup from disconnected standby fails |
Date | |
Msg-id | CAA4eK1LxTo=mN6CRKsMs2Bk4K9UOd7U24Lb3nPk2MOT4e-Rw3w@mail.gmail.com Whole thread Raw |
In response to | Re: [BUG] pg_basebackup from disconnected standby fails (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [BUG] pg_basebackup from disconnected standby fails
|
List | pgsql-hackers |
On Tue, Oct 25, 2016 at 10:43 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Oct 24, 2016 at 12:26 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> You are right that it will include additional WAL than strictly >> necessary, but that can happen today as well because minrecoverypoint >> could be updated after you have established restart point for >> do_pg_start_backup(). Do you want to say that even without patch in >> some cases we are including additional WAL in backup than what is >> strictly necessary, so it is better to improve the situation? > > In that case, including additional WAL is *necessary*. If the minimum > recovery point is updated after do_pg_start_backup(), pages bearing > LSNs newer than the old minimum recovery point could conceivably have > been copied as part of the backup, > I might be missing something here, but I think this theory doesn't hold true with respect to current code because we always update minimum recovery point to last record being replayed not till till the LSN of the page being flushed. This is done to avoid repeated update of control file. Considering that, it seems to me that the patch by Kyotaro-San is a reasonable fix provided we update comments at all appropriate places. I can try to update the comments, if the proposed fix is acceptable to you. > and therefore we need to replay up > through the new minimum recovery point to guarantee consistency. > >>> I can think of two solutions that would be "tighter": >>> >>> 1. When performing a restartpoint, update the minimum recovery point >>> to just beyond the checkpoint record. I think this can't hurt anyone >>> who is actually restarting recovery on the same machine, because >>> that's exactly the point where they are going to start recovery. A >>> minimum recovery point which precedes the point at which they will >>> start recovery is no better than one which is equal to the point at >>> which they will start recovery. >> >> I think this will work but will cause an unnecessary control file >> update for the cases when buffer flush will anyway do it. It can work >> if we try to do only when required (minrecoverypoint is lesser than >> lastcheckpoint) after flush of buffers. > > Sure, that doesn't sound too hard to implement. > If you are inclined towards this solution, then I think what we need to do is to change the API UpdateMinRecoveryPoint() such that it's second parameter can take three values. 0 means update minRecoveryPoint to passed lsn if minRecoveryPoint < lsn; 1 means update minRecoveryPoint to latest replayed point if minRecoveryPoint < lsn, same as currently false for *force*; 2 indicates same behaviour as current *force* as true. Also we need to pass currentTLI parameter (lastCheckPoint.ThisTimeLineID) to this API to update minRecoveryPointTLI. I have not tried this, but I think something on these lines should work. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: