Re: Postgres abort found in 9.3.11 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Postgres abort found in 9.3.11
Date
Msg-id 22750.1472737756@sss.pgh.pa.us
In response to Postgres abort found in 9.3.11  ("K S, Sandhya (Nokia - IN/Bangalore)" <sandhya.k_s@nokia.com>)
Responses Re: Postgres abort found in 9.3.11
List pgsql-hackers
"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya.k_s@nokia.com> writes:
> Our setup is a hot-standby architecture. This crash is occurring only on stand-by node. Postgres continues to run
> without any issues on active node.
>
> Postmaster is waiting for a start and is throwing this message.
>
> Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another postmaster already running on port 5433?
> If not, wait a few seconds and retry.
>
> Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  btree_xlog_delete_get_latestRemovedXid: cannot
> operate with inconsistent data
> Aug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog redo delete:
> index 1663/16386/17378; iblk 1, heap 1663/16386/16518;

Hmm, that HINT seems to be the tail end of a message indicating that the
postmaster is refusing to start because of an existing postmaster.  Why
is that appearing?  If you've got some script that's overeagerly launching
and killing postmasters, maybe that's the ultimate cause of problems.

The only method I've heard of for getting that get_latestRemovedXid
error is to try to launch a standalone backend (postgres --single)
in a standby server directory.  We don't support that, cf
https://www.postgresql.org/message-id/flat/00F0B2CEF6D0CEF8A90119D4%40eje.credativ.lan
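
As an illustration of guarding against that case, here is a minimal sketch of a wrapper that refuses to build a "postgres --single" command line for a directory that looks like a standby. It assumes a pre-v12 layout such as 9.3, where a data directory that is still in recovery contains a recovery.conf file. The helper names (looks_like_standby, single_user_command) are hypothetical and not part of Postgres.

```python
import os


def looks_like_standby(data_dir):
    """Heuristic: on pre-v12 servers, a recovery.conf file in the data
    directory means the server would start in recovery (e.g. as a standby)."""
    return os.path.isfile(os.path.join(data_dir, "recovery.conf"))


def single_user_command(data_dir, dbname="postgres"):
    """Build the argv for a standalone backend, refusing standby directories.

    This only builds the command; it does not run it.
    """
    if looks_like_standby(data_dir):
        raise RuntimeError(
            "refusing to start a standalone backend in %s: "
            "recovery.conf is present" % data_dir
        )
    return ["postgres", "--single", "-D", data_dir, dbname]
```

A launcher script could call single_user_command() and pass the result to subprocess only when no exception is raised. The check is a heuristic and does not detect every way a directory can be in recovery.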

BTW, I'm curious about the "err-3:" part.  That would not be expected
in any standard build of Postgres ... is this something custom modified?
        regards, tom lane


