Re: kill -KILL: What happens? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: kill -KILL: What happens? |
Date | |
Msg-id | AANLkTimqrUtme9L2TDQr-gBViT5pG=kCjNJ+us=_w22Z@mail.gmail.com |
In response to | Re: kill -KILL: What happens? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: kill -KILL: What happens?
Re: kill -KILL: What happens? |
List | pgsql-hackers |
On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use.
>>> It sucks because you don't get a signal on parent death. With the
>>> arrival of the latch code, having to check for PostmasterIsAlive
>>> frequently is the only reason for an idle background process to consume
>>> CPU at all.
>
>> What we really need is SIGPARENT. I wonder if the Linux folks would
>> consider adding such a thing. Might be useful to others as well.
>
> That's pretty much a dead-end idea unfortunately; it would never be
> portable enough to let us change our system structure to rely on it.
> Even more to the point, "go away when the postmaster does" isn't
> really the behavior we want anyway. "Go away when the last backend
> does" is what we want.

I'm not convinced. I was thinking that we could simply treat it like
SIGQUIT, if it's available. I doubt there's a real use case for
continuing to run queries after the postmaster and all the background
processes are dead. Expedited death seems like much better behavior.
Even checking PostmasterIsAlive() once per query would be reasonable,
except that it'd add a system call to check for a condition that
almost never holds, which I'm not eager to do.

> I wonder whether we could have some sort of latch-like counter that
> would count the number of active backends and deliver signals when the
> count went to zero. However, if the goal is to defend against random
> applications of SIGKILL, there's probably no way to make this reliable
> in userspace.

I don't think you can get there 100%. We could, however, make a rule
that when a background process fails a PostmasterIsAlive() check, it
sends SIGQUIT to everyone it can find in the ProcArray, which would at
least ensure a timely exit in most real-world cases.
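To make that last rule concrete, here is a minimal sketch in C. It is not PostgreSQL code. The names parent_is_alive(), signal_all() and check_parent_and_sweep() are hypothetical, a plain array of PIDs stands in for the ProcArray, and comparing getppid() against a remembered parent PID is only an assumed stand-in for whatever PostmasterIsAlive() does internally. That comparison only works for a direct child of the process being watched.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for PostmasterIsAlive().  A direct child is
 * re-parented when its parent exits, so getppid() stops matching the
 * PID remembered at startup.  Costs one system call per check.
 */
static bool
parent_is_alive(pid_t expected_parent)
{
    return getppid() == expected_parent;
}

/*
 * Hypothetical stand-in for walking the ProcArray: send "sig" to every
 * PID in the list, skipping invalid entries and the caller itself.
 * Returns the number of processes successfully signalled.
 */
static int
signal_all(const pid_t *pids, size_t n, int sig)
{
    int         sent = 0;

    assert(pids != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (pids[i] <= 0 || pids[i] == getpid())
            continue;           /* never signal a process group or ourselves */
        if (kill(pids[i], sig) == 0)
            sent++;
    }
    return sent;
}

/*
 * One iteration of a background process's periodic check.  Returns -1
 * while the parent is still alive; otherwise sends SIGQUIT to everyone
 * in the list and returns how many processes were signalled.
 */
static int
check_parent_and_sweep(pid_t expected_parent, const pid_t *pids, size_t n)
{
    if (parent_is_alive(expected_parent))
        return -1;
    return signal_all(pids, n, SIGQUIT);
}
```

A background process would record getppid() once at startup and call check_parent_and_sweep() each time it wakes up. As noted above, this cannot be made 100% reliable: a process that is not in the list, or a sweep that never runs because the checker itself was killed, is simply missed.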
> Another idea is to have a "postmaster minder" process that respawns the
> postmaster when it's killed. The hard part of that is that the minder
> can't be connected to shared memory (else its OOM cross-section is just
> as big as the postmaster's), and that makes it difficult for it to tell
> when all the children have gone away. I suppose it could be coded to
> just retry every few seconds until success. This doesn't improve the
> behavior of background processes at all, though.

It hardly seems worth it. Given a reliable interlock against multiple
postmasters, the real concern is making sure that a half-dead
postmaster gets itself all-dead quickly so that the DBA can start up a
new one before he gets fired.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company