Thread: standalone backend PANICs during recovery
While investigating a problem on a streaming hot standby instance at a
customer site, I got the following when using a standalone backend:

PANIC: btree_xlog_delete_get_latestRemovedXid: cannot operate with
inconsistent data
CONTEXT: xlog redo delete: index 1663/65588/65625; iblk 11, heap
1663/65588/65613;

The responsible PANIC is triggered here:

src/backend/access/nbtree/nbtxlog.c:555

btree_xlog_delete_get_latestRemovedXid(XLogReaderState *record)
{
    [...]
    if (!reachedConsistency)
        elog(PANIC, "btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data");
    [...]
}

There's already an "optimization" just before that point, exiting with
InvalidTransactionId in case a hot standby doesn't count any active
backends. In standalone mode, however, CountDBBackends() will always
return 1, AFAIK.

It looks like btree_xlog_delete_get_latestRemovedXid() needs an
additional check for standalone backends, so I came up with the attached
patch. This allowed the standalone backend to recover without any
problems.

--
Thanks
Bernd
Attachment
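[Editor's note: the attachment itself is not reproduced in the archive. A purely hypothetical sketch of what such a check could look like, inferred from the description above (the actual patch may differ), is an early exit for standalone backends alongside the existing no-active-backends shortcut:]

```c
/*
 * Hypothetical sketch only -- not the actual attached patch.
 * In a standalone (--single) backend there are no hot-standby sessions
 * that could conflict, so latestRemovedXid need not be computed and the
 * reachedConsistency PANIC is never reached.
 */
static TransactionId
btree_xlog_delete_get_latestRemovedXid(XLogReaderState *record)
{
	/*
	 * No postmaster means no hot-standby backends to kill; bail out
	 * early, just as is already done when CountDBBackends() reports
	 * no active backends.
	 */
	if (!IsUnderPostmaster)
		return InvalidTransactionId;

	/* ... existing logic, including the reachedConsistency check ... */
}
```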
Bernd Helmle wrote:

> While investigating a problem on a streaming hot standby instance at a
> customer site, i got the following when using a standalone backend:
>
> PANIC: btree_xlog_delete_get_latestRemovedXid: cannot operate with
> inconsistent data
> CONTEXT: xlog redo delete: index 1663/65588/65625; iblk 11, heap
> 1663/65588/65613;
>
> The responsible PANIC is triggered here:
>
> src/backend/access/nbtree/nbtxlog.c:555
>
> btree_xlog_delete_get_latestRemovedXid(XLogReaderState *record)

This PANIC was introduced in the commit below. AFAICS your proposed
patch and rationale are correct.

commit f786e91a75b2f64527dcf321e754b6448fcad7fe
Author:     Tom Lane <tgl@sss.pgh.pa.us>
AuthorDate: Fri Aug 3 15:41:18 2012 -0400
CommitDate: Fri Aug 3 15:41:18 2012 -0400

    Improve underdocumented btree_xlog_delete_get_latestRemovedXid() code.

    As noted by Noah Misch, btree_xlog_delete_get_latestRemovedXid is
    critically dependent on the assumption that it's examining a consistent
    state of the database. This was undocumented though, so the
    seemingly-unrelated check for no active HS sessions might be thought to
    be merely an optional optimization. Improve comments, and add an
    explicit check of reachedConsistency just to be sure.

    This function returns InvalidTransactionId (thereby killing all HS
    transactions) in several cases that are not nearly unlikely enough for
    my taste. This commit doesn't attempt to fix those deficiencies, just
    document them.

    Back-patch to 9.2, not from any real functional need but just to keep
    the branches more closely synced to simplify possible future
    back-patching.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Apr 2, 2016 at 5:57 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Bernd Helmle wrote:
>> While investigating a problem on a streaming hot standby instance at a
>> customer site, i got the following when using a standalone backend:
>>
>> PANIC: btree_xlog_delete_get_latestRemovedXid: cannot operate with
>> inconsistent data
>> CONTEXT: xlog redo delete: index 1663/65588/65625; iblk 11, heap
>> 1663/65588/65613;
>>
>> The responsible PANIC is triggered here:
>>
>> src/backend/access/nbtree/nbtxlog.c:555
>>
>> btree_xlog_delete_get_latestRemovedXid(XLogReaderState *record)
>
> This PANIC was introduced in the commit below. AFAICS your proposed
> patch and rationale are correct.

If nobody's going to commit this right away, this should be added to
the next CommitFest so we don't forget it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--On 5. April 2016 21:50:17 -0400 Robert Haas <robertmhaas@gmail.com> wrote:

> If nobody's going to commit this right away, this should be added to
> the next CommitFest so we don't forget it.

Done.

--
Thanks
Bernd
[ got around to looking at this finally ]

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Bernd Helmle wrote:
>> While investigating a problem on a streaming hot standby instance at a
>> customer site, i got the following when using a standalone backend:
>>
>> PANIC: btree_xlog_delete_get_latestRemovedXid: cannot operate with
>> inconsistent data
>> CONTEXT: xlog redo delete: index 1663/65588/65625; iblk 11, heap
>> 1663/65588/65613;
>>
>> The responsible PANIC is triggered here:
>> src/backend/access/nbtree/nbtxlog.c:555

> This PANIC was introduced in the commit below. AFAICS your proposed
> patch and rationale are correct.

I'm not very convinced by this. What seems to me to be the core issue
is how it can make sense for InHotStandby to read as TRUE in a
standalone backend. The best-case situation if that's true is that the
standalone backend will do a lot of unnecessary work on the baseless
assumption that there are other backends it has to keep things valid
for. The worst-case situation is that you trip over some bug and are
unable to complete recovery, even though you would have if the
hot-standby support hadn't been activated.

In short, I don't think control should have been here at all. My
proposal for a fix is to force EnableHotStandby to remain false in a
standalone backend.

			regards, tom lane
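[Editor's note: a minimal sketch of that proposal, with hypothetical placement (the message does not say where in the startup code it would go), would simply keep the flag off whenever there is no postmaster:]

```c
/*
 * Hypothetical sketch of the proposal: never let hot standby be
 * enabled in a standalone (--single) backend, regardless of the
 * hot_standby GUC setting.
 */
if (!IsUnderPostmaster)
	EnableHotStandby = false;
```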
I wrote:
> In short, I don't think control should have been here at all. My proposal
> for a fix is to force EnableHotStandby to remain false in a standalone
> backend.

I tried to reproduce Bernd's problem by starting a standalone backend in
a data directory that was configured as a hot standby slave, and soon
found that there are much bigger issues than this. The startup sequence
soon tries to wait for WAL to arrive, which in HEAD uses

    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
              5000L);

which immediately elog(FATAL)s because a standalone backend has no
parent postmaster and so postmaster_alive_fds[] isn't set. But if it
didn't do that, it'd wait forever because of course there is no active
WAL receiver process that would ever provide more WAL.

The only way that you'd ever get to a command prompt is if somebody made
a promotion trigger file, which would cause the startup code to promote
the cluster into master status, which does not really seem like
something that would be a good idea in Bernd's proposed use case of
"investigating a problem".

Alternatively, if we were to force standby_mode off in a standalone
backend, it would come to the command prompt right away but again it
would have effectively promoted the cluster to master. That is
certainly not something we should ever do automatically.

So at this point I'm pretty baffled as to what the actual use-case is
here. I am tempted to say that a standalone backend should refuse to
start at all if a recovery.conf file is present. If we do want to
allow the case, we need some careful thought about what it should do.

			regards, tom lane
On Sun, Aug 21, 2016 at 1:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So at this point I'm pretty baffled as to what the actual use-case is
> here.

Isn't it easier to attach a debugger in this case to analyze the
problem?

> I am tempted to say that a standalone backend should refuse to
> start at all if a recovery.conf file is present. If we do want to
> allow the case, we need some careful thought about what it should do.

+1 for preventing an instance in --single mode from starting if
recovery.conf is present. It is better to put safeguards in place than
to get unwanted behaviors.

--
Michael
--On 20. August 2016 12:41:48 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> So at this point I'm pretty baffled as to what the actual use-case is
> here. I am tempted to say that a standalone backend should refuse to
> start at all if a recovery.conf file is present. If we do want to
> allow the case, we need some careful thought about what it should do.

Well, the "use case" was a crashed hot standby at a customer site (it
kept PANICing after restarting), where I decided to have a look through
the recovery process with gdb. The exact PANIC was

2016-03-29 15:12:40 CEST [16097]: [26-1] user=,db=,xid=0,vxid=1/0 FATAL:
unexpected GIN leaf action: 0

I had the idea that it would be quick and dirty to use a single backend.
I was surprised that this time it PANIC'ed differently...

That said, I'm okay if --single is not intended to bring up a hot
standby. There are many other ways to debug such problems.

--
Thanks
Bernd
On Wed, Aug 24, 2016 at 5:07 PM, Bernd Helmle <mailings@oopsware.de> wrote:
> That said, i'm okay if --single is not intended to bring up a hot standby.
> There are many other ways to debug such problems.

This patch is still on the CF app:
https://commitfest.postgresql.org/10/610/

Even after thinking about it, my opinion is still the same. Let's
prevent a standalone backend from starting if it sees recovery.conf.
Attached is a patch, and the CF entry has been switched to "ready for
committer" to move on.

--
Michael
Attachment
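[Editor's note: this attachment is also not reproduced here. Judging from the discussion that follows, it adds a stat() check in StartupXLOG; a hypothetical sketch of such a check (RECOVERY_COMMAND_FILE is the real macro in xlog.c that expands to "recovery.conf", but the message wording is invented) might be:]

```c
/*
 * Hypothetical sketch -- not the actual attached patch.  At the top of
 * StartupXLOG(), refuse to run a standalone backend at all if a
 * recovery.conf file exists, before anything touches PGDATA.
 */
struct stat st;

if (!IsUnderPostmaster && stat(RECOVERY_COMMAND_FILE, &st) == 0)
	ereport(FATAL,
			(errmsg("standalone backend cannot be used while \"%s\" exists",
					RECOVERY_COMMAND_FILE)));
```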
Michael Paquier <michael.paquier@gmail.com> writes:
> On Wed, Aug 24, 2016 at 5:07 PM, Bernd Helmle <mailings@oopsware.de> wrote:
>> That said, i'm okay if --single is not intended to bring up a hot standby.
>> There are many other ways to debug such problems.

> This patch is still on the CF app:
> https://commitfest.postgresql.org/10/610/
> Even after thinking about it, my opinion is still the same. Let's
> prevent a standalone backend to start if it sees recovery.conf.
> Attached is a patch and the CF entry is switched as ready for
> committer to move on.

Hm, StartupXLOG seems like a pretty random place to check that,
especially since doing it there requires an extra stat() call. Why
didn't you just make readRecoveryCommandFile() error out?

			regards, tom lane
On Tue, Aug 30, 2016 at 9:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Wed, Aug 24, 2016 at 5:07 PM, Bernd Helmle <mailings@oopsware.de> wrote:
>>> That said, i'm okay if --single is not intended to bring up a hot standby.
>>> There are many other ways to debug such problems.
>
>> This patch is still on the CF app:
>> https://commitfest.postgresql.org/10/610/
>> Even after thinking about it, my opinion is still the same. Let's
>> prevent a standalone backend to start if it sees recovery.conf.
>> Attached is a patch and the CF entry is switched as ready for
>> committer to move on.
>
> Hm, StartupXLOG seems like a pretty random place to check that, especially
> since doing it there requires an extra stat() call. Why didn't you just
> make readRecoveryCommandFile() error out?

Well, the idea is to do the check before doing anything on PGDATA and
leave it intact -- particularly before the post-crash fsync().

--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
> On Tue, Aug 30, 2016 at 9:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm, StartupXLOG seems like a pretty random place to check that, especially
>> since doing it there requires an extra stat() call. Why didn't you just
>> make readRecoveryCommandFile() error out?

> Well, the idea is to do the check before doing anything on PGDATA and
> leave it intact, particularly the post-crash fsync().

I don't see anything very exciting between the beginning of StartupXLOG
and readRecoveryCommandFile. In particular, doing the fsync seems like
a perfectly harmless and maybe-good thing. If there were some operation
with potentially bad side-effects in that range, it would be dangerous
anyway because of the risk of readRecoveryCommandFile erroring out due
to invalid contents of recovery.conf.

This might be an argument for re-ordering what we're doing in
StartupXLOG, but that seems like an independent discussion.

			regards, tom lane
On Tue, Aug 30, 2016 at 10:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Tue, Aug 30, 2016 at 9:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Hm, StartupXLOG seems like a pretty random place to check that, especially
>>> since doing it there requires an extra stat() call. Why didn't you just
>>> make readRecoveryCommandFile() error out?
>
>> Well, the idea is to do the check before doing anything on PGDATA and
>> leave it intact, particularly the post-crash fsync().
>
> I don't see anything very exciting between the beginning of StartupXLOG
> and readRecoveryCommandFile. In particular, doing the fsync seems like
> a perfectly harmless and maybe-good thing. If there were some operation
> with potentially bad side-effects in that range, it would be dangerous
> anyway because of the risk of readRecoveryCommandFile erroring out due
> to invalid contents of recovery.conf.

Does the attached suit better, then?

--
Michael
Attachment
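[Editor's note: the revised attachment is likewise not shown. Following Tom's remark, it presumably moves the error into readRecoveryCommandFile() itself, along these purely hypothetical lines (the error wording is invented):]

```c
/*
 * Hypothetical sketch -- not the actual attached patch.  Error out
 * from readRecoveryCommandFile() once recovery.conf is found in a
 * standalone backend, instead of stat()ing the file separately in
 * StartupXLOG().
 */
fd = AllocateFile(RECOVERY_COMMAND_FILE, "r");
if (fd == NULL)
{
	if (errno == ENOENT)
		return;				/* no recovery.conf, nothing to do */
	/* ... existing error handling ... */
}

if (!IsUnderPostmaster)
	ereport(FATAL,
			(errmsg("recovery is not supported by standalone backends")));
```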
Michael Paquier <michael.paquier@gmail.com> writes:
> Does the attached suit better then?

Thinking about it some more ... what we actually need to prevent,
AFAICS, is standby_mode becoming true in a standalone backend. If you
don't set that, then the process will do PITR recovery, but I'm not
aware that there is any problem with running that standalone, and maybe
it's useful to allow it for debugging purposes. So I'm thinking more
like

    /*
     * Check for recovery control file, and if so set up state for offline
     * recovery
     */
    readRecoveryCommandFile();

+   /*
+    * We don't support standby_mode in standalone backends; that
+    * requires other processes such as WAL receivers to be alive.
+    */
+   if (StandbyModeRequested && !IsUnderPostmaster)
+       ereport(FATAL, ...);
+
    /*
     * Save archive_cleanup_command in shared memory so that other processes
     * can see it.
     */

or we could put the check near the bottom of readRecoveryCommandFile.
Not sure which placement is clearer.

			regards, tom lane
On Tue, Aug 30, 2016 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Does the attached suit better then?
>
> Thinking about it some more ... what we actually need to prevent, AFAICS,
> is standby_mode becoming true in a standalone backend. If you don't set
> that, then the process will do PITR recovery, but I'm not aware that there
> is any problem with running that standalone, and maybe it's useful to
> allow it for debugging purposes. So I'm thinking more like
>
>     /*
>      * Check for recovery control file, and if so set up state for offline
>      * recovery
>      */
>     readRecoveryCommandFile();
>
> +   /*
> +    * We don't support standby_mode in standalone backends; that
> +    * requires other processes such as WAL receivers to be alive.
> +    */
> +   if (StandbyModeRequested && !IsUnderPostmaster)
> +       ereport(FATAL, ...);
> +
>     /*
>      * Save archive_cleanup_command in shared memory so that other processes
>      * can see it.
>      */
>
> or we could put the check near the bottom of readRecoveryCommandFile.
> Not sure which placement is clearer.

I have spent some time playing with that and you are right. Only
standby_mode = on is able to trigger a failure here, and the first one
is in WaitForWALToBecomeAvailable(). I'd just put the check at the end
of readRecoveryCommandFile(); this will avoid extra thinking should
readRecoveryCommandFile() be moved around. That's unlikely to happen,
but it is cheap insurance.

I have added a test on that using 001_stream_rep.pl. Now, it's not
actually a good idea to push this test: if it passed, the execution of
postgres would just hang and prevent the test suite from moving on. But
it makes it easier for anybody looking at this patch to test the
pattern prevented here.

--
Michael
Attachment
Michael Paquier <michael.paquier@gmail.com> writes:
> On Tue, Aug 30, 2016 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thinking about it some more ... what we actually need to prevent, AFAICS,
>> is standby_mode becoming true in a standalone backend.

> I have spent some time playing with that and you are right. Only
> standby_mode = on is able trigger a failure here, and the first one is
> in WaitForWALToBecomeAvailable(). I'd just put the check at the end of
> readRecoveryCommandFile(), this will avoid extra thinking should
> readRecoveryCommandFile() be moved around. That's unlikely to happen
> but it is a cheap insurance.

Pushed with minor adjustments --- I moved the check to be closer to the
existing sanity checks in readRecoveryCommandFile, tweaked the wording
(the man page for postgres refers to --single as "single-user server"
mode) and added an errcode.

			regards, tom lane
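[Editor's note: the committed check in readRecoveryCommandFile looks approximately like the following -- reconstructed from the description above, so treat the exact comment and message wording as an approximation of the real commit:]

```c
	/*
	 * We don't support standby_mode in standalone backends; that
	 * requires other processes such as the WAL receiver to be alive.
	 */
	if (StandbyModeRequested && !IsUnderPostmaster)
		ereport(FATAL,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("standby mode is not supported by single-user servers")));
```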
Bernd Helmle <mailings@oopsware.de> writes:
> Well, the "use case" was a crashed hot standby at a customer site (it kept
> PANICing after restarting), where i decided to have a look through the
> recovery process with gdb. The exact PANIC was

> 2016-03-29 15:12:40 CEST [16097]: [26-1] user=,db=,xid=0,vxid=1/0 FATAL:
> unexpected GIN leaf action: 0

BTW, that didn't happen to be on big-endian hardware, did it?

			regards, tom lane
--On 3. September 2016 at 02:05:00 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> Well, the "use case" was a crashed hot standby at a customer site (it
>> kept PANICing after restarting), where i decided to have a look through
>> the recovery process with gdb. The exact PANIC was
>
>> 2016-03-29 15:12:40 CEST [16097]: [26-1] user=,db=,xid=0,vxid=1/0 FATAL:
>> unexpected GIN leaf action: 0
>
> BTW, that didn't happen to be on big-endian hardware did it?

You're right, this was RHEL7 on a POWER7 machine.

--
Thanks
Bernd