
Re: VM corruption on standby - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: VM corruption on standby
Date	August 17 17:33:46
Msg-id	168715.1755441226@sss.pgh.pa.us
In response to	Re: VM corruption on standby (Kirill Reshke <reshkekirill@gmail.com>)
List	pgsql-hackers


Kirill Reshke <reshkekirill@gmail.com> writes:
> [ v1-0001-Do-not-exit-on-postmaster-death-ever-inside-CRIT-.patch ]

I do not like this patch one bit: it will replace one set of problems
with another set, namely systems that fail to shut down.

I think the actual bug here is the use of proc_exit(1) after
observing postmaster death.  That is what creates the hazard,
because it releases the locks that are preventing other processes
from observing the inconsistent state in shared memory.
Compare this to what we do, for example, on receipt of SIGQUIT:

    /*
     * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
     * because shared memory may be corrupted, so we don't want to try to
     * clean up our transaction.  Just nail the windows shut and get out of
     * town.  The callbacks wouldn't be safe to run from a signal handler,
     * anyway.
     *
     * Note we do _exit(2) not _exit(0).  This is to force the postmaster into
     * a system reset cycle if someone sends a manual SIGQUIT to a random
     * backend.  This is necessary precisely because we don't clean up our
     * shared memory state.  (The "dead man switch" mechanism in pmsignal.c
     * should ensure the postmaster sees this as a crash, too, but no harm in
     * being doubly sure.)
     */
    _exit(2);

So I think the correct fix here is s/proc_exit(1)/_exit(2)/ in the
places that are responding to postmaster death.  There might be
more than just WaitEventSetWaitBlock; I didn't look.

            regards, tom lane
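
The difference between the two exit paths discussed above can be sketched outside PostgreSQL in plain POSIX C. This is only an illustration under stated assumptions, not PostgreSQL code: atexit() stands in for the callbacks that proc_exit() runs, and run_child, cleanup_callback, and the pipe used for reporting are hypothetical names made up for the demo. The child registers a cleanup callback and then leaves either through exit(1) or through _exit(2); the parent reports the exit status and whether the cleanup ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the reporting pipe; only set in the child (hypothetical). */
static int report_fd = -1;

/*
 * Hypothetical stand-in for the cleanup done by exit-time callbacks,
 * e.g. releasing locks.  It writes one byte so the parent can tell it ran.
 */
static void
cleanup_callback(void)
{
    const char marker = 'C';

    (void) write(report_fd, &marker, 1);
}

/*
 * Fork a child that registers cleanup_callback and then terminates with
 * exit(1) (callbacks run) or _exit(2) (callbacks skipped).
 * Returns the child's exit status, or -1 on failure; *cleanup_ran is set
 * to 1 if the callback ran and 0 otherwise.
 */
static int
run_child(int use_underscore_exit, int *cleanup_ran)
{
    int     fds[2];
    int     status;
    char    buf;
    pid_t   pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: keep only the write end and register the callback. */
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(cleanup_callback) != 0)
            _exit(3);

        if (use_underscore_exit)
            _exit(2);           /* immediate exit, no callbacks */
        exit(1);                /* runs atexit callbacks first */
    }

    /* Parent: a byte means the callback ran; EOF means it did not. */
    close(fds[1]);
    *cleanup_ran = (read(fds[0], &buf, 1) == 1);
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this sketch, run_child(0, &ran) returns 1 with ran set, while run_child(1, &ran) returns 2 with ran clear. The second case mirrors the behavior the message argues for: the process leaves without running any cleanup, and its nonzero status is there for the parent to treat as a crash.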
