Re: why not kill -9 postmaster - Mailing list pgsql-general

From Tom Lane
Subject Re: why not kill -9 postmaster
Msg-id 15818.1161359667@sss.pgh.pa.us
In response to Re: why not kill -9 postmaster  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general

Martijn van Oosterhout <kleptog@svana.org> writes:
> Well, if you kill -9 the postmaster all the connections stay alive and
> stay processing tuples and writing to disk, except the coordination is
> gone.

The postmaster isn't involved in any critical inter-backend coordination.
If you kill -9 the postmaster *and then kill or wait out all the
backends*, you won't lose data.  This is not a desirable long-term
operating mode, because it cripples autovacuum and some other things,
but it's not dangerous.
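
As a rough illustration of "wait out all the backends", here is a minimal Python sketch for a POSIX system. It is not part of PostgreSQL: the helper names `pid_alive` and `wait_out_backends` are hypothetical, and the caller is assumed to have already collected the PIDs of the surviving backends (from `ps`, for instance).

```python
import errno
import os
import time


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence; nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # no such process
            return False
        if exc.errno == errno.EPERM:  # process exists but belongs to another user
            return True
        raise
    return True


def wait_out_backends(backend_pids, poll_seconds=1.0, timeout=None):
    """Poll until none of backend_pids is alive.

    backend_pids is a caller-supplied list (an assumption of this sketch).
    Returns the set of PIDs still alive when the timeout expires, or an
    empty set once every backend has exited.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    remaining = set(backend_pids)
    while remaining:
        remaining = {pid for pid in remaining if pid_alive(pid)}
        if not remaining:
            break
        if deadline is not None and time.monotonic() >= deadline:
            break
        time.sleep(poll_seconds)
    return remaining
```

Only when `wait_out_backends` returns an empty set is it reasonable to think about starting a new postmaster. A non-empty result means old backends are still attached to the old shared memory segment.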

The only really serious risk I'm aware of in this scenario is:

1. DBA does "kill -9" postmaster, but some backends are still alive and
processing.

2. DBA tries to start new postmaster, gets message about "shared memory
segment still in use".

3. DBA does "rm postmaster.pid" (this is the step that qualifies him
as an idiot).

4. DBA starts new postmaster.  Since the interlock file is gone, it
starts up without any awareness that there are old backends still alive.

At this point, you have two separate sets of backends that are not
communicating (they're using two different shared memory segments)
but they are munging the same data files.  It will not take long
to turn the data files into irrecoverable hash --- for just one
reason, transaction numbering will diverge between the two sets of
backends.
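
A toy model may make the transaction-numbering problem concrete. The Python sketch below is only an assumption-laden illustration, not PostgreSQL's actual data structures: `SharedMemory` is a hypothetical stand-in that holds nothing but a next-XID counter, and both segments are assumed to start from the same counter value.

```python
class SharedMemory:
    """Toy stand-in for one shared memory segment: just a next-XID counter."""

    def __init__(self, next_xid):
        self.next_xid = next_xid

    def assign_xid(self):
        """Hand out the next transaction ID and advance the counter."""
        xid = self.next_xid
        self.next_xid += 1
        return xid


def simulate(start_xid=1000, old_txns=3, new_txns=2):
    """Return the XIDs that both sets of backends stamped into the same file."""
    old_set = SharedMemory(start_xid)  # segment still used by orphaned backends
    new_set = SharedMemory(start_xid)  # segment created by the new postmaster
    data_file = []                     # the data file both sets write to

    for i in range(old_txns):
        data_file.append((old_set.assign_xid(), "old", f"row-{i}"))
    for i in range(new_txns):
        data_file.append((new_set.assign_xid(), "new", f"row-{i}"))

    # Collect which set of backends used each XID.
    writers_by_xid = {}
    for xid, writer, _payload in data_file:
        writers_by_xid.setdefault(xid, set()).add(writer)

    # An XID used by both sets no longer identifies a single transaction.
    return sorted(xid for xid, writers in writers_by_xid.items() if len(writers) > 1)
```

In this model each set of backends advances its own counter without seeing the other, so the two counters drift apart while the same XIDs end up attached to unrelated rows in the shared file.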

            regards, tom lane
