Re: How to shoot yourself in the foot: kill -9 postmaster - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: How to shoot yourself in the foot: kill -9 postmaster |
Date | |
Msg-id | 6536.983902247@sss.pgh.pa.us |
In response to | Re: How to shoot yourself in the foot: kill -9 postmaster (Alfred Perlstein <bright@wintelcom.net>) |
Responses |
Re: How to shoot yourself in the foot: kill -9 postmaster
RE: How to shoot yourself in the foot: kill -9 postmaster |
List | pgsql-hackers |
Alfred Perlstein <bright@wintelcom.net> writes:
> I'm sure some sort of encoding of the PGDATA directory along with
> the pids stored in the shm segment...

I thought about this too, but it strikes me as not very trustworthy. The problem is that there's no guarantee that the new postmaster will even notice the old shmem segment: it might select a different shmem key. (The 7.1 coding of shmem key selection makes this more likely than it used to be, but even under 7.0, it will certainly fail to work if I choose to start the new postmaster using a different port number than the old one had. The shmem key is driven primarily by port number not data directory ...)

The interlock has to be tightly tied to the PGDATA directory, because what we're trying to protect is the files in and under that directory. It seems that something based on file(s) in that directory is the way to go.

The best idea I've seen so far is Hiroshi's idea of having all the backends hold fcntl locks on the same file (probably postmaster.pid would do fine). Then the new postmaster can test whether any backends are still alive by trying to lock the old postmaster.pid file. Unfortunately, I read in the fcntl man page:

    Locks are not inherited by a child process in a fork(2) system call.

This makes the idea much less attractive than I originally thought: a new backend would not automatically inherit a lock on the postmaster.pid file from the postmaster, but would have to open/lock it for itself. That means there's a window where the new backend exists but would be invisible to a hypothetical new postmaster.

We could work around this with the following, very ugly protocol:

1. Postmaster normally maintains fcntl read lock on its postmaster.pid file. Each spawned backend immediately opens and read-locks postmaster.pid, too, and holds that file open until it dies. (Thus wasting a kernel FD per backend, which is one of the less attractive things about this.) If the backend is unable to obtain read lock on postmaster.pid, then it complains and dies. We must use read locks here so that all these processes can hold them separately.

2. If a newly started postmaster sees a pre-existing postmaster.pid file, it tries to obtain a *write* lock on that file. If it fails, conclude that an old postmaster or backend is still alive; complain and quit. If it succeeds, sit for say 1 second before deleting the file and creating a new one. (The delay here is to allow any just-started old backends to fail to acquire read lock and quit. A possible objection is that we have no way to guarantee 1 second is enough, though it ought to be plenty if the lock acquisition is just after the fork.)

One thing that worries me a little bit is that this means an fcntl read-lock request will exist inside the kernel for each active backend. Does anyone know of any performance problems or hard kernel limits we might run into with large numbers of backends (lots and lots of fcntl locks)? At least the locks are on a file that we don't actually touch in the normal course of business.

A small savings is that the backends don't actually need to open new FDs for the postmaster.pid file; they can use the one they inherit from the postmaster, even though they do need to lock it again. I'm not sure how much that saves inside the kernel, but at least something.

There are also the usual set of concerns about portability of flock, though this time we're locking a plain file and not a socket, so it shouldn't be as much trouble as it was before.

Comments? Does anyone see a better way to do it?

regards, tom lane
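
The two-step protocol described in the message can be sketched roughly as below. This is only an illustration in Python, not PostgreSQL code: the function names (`take_read_lock`, `old_instance_alive`, `demo`) and the temporary stand-in for the PGDATA directory are hypothetical, and the 1-second grace period from step 2 is not modelled. It relies on `fcntl.lockf`, which uses fcntl-style locks on POSIX systems, and it forks a child because these locks belong to a process: a write-lock probe from the same process that holds the read lock would not conflict with it.

```python
import fcntl
import os
import tempfile


def take_read_lock(fd):
    """Step 1 (old postmaster or backend): non-blocking shared lock."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        return True
    except OSError:
        # Someone holds a write lock: the caller should complain and die.
        return False


def old_instance_alive(pid_path):
    """Step 2 (new postmaster): probe with a non-blocking write lock."""
    fd = os.open(pid_path, os.O_RDWR)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True   # some other process still holds a read lock
        return False      # nobody else has the file locked
    finally:
        os.close(fd)      # closing the descriptor drops our probe lock


def demo():
    """Return (alive_while_backend_runs, alive_after_backend_exits)."""
    with tempfile.TemporaryDirectory() as pgdata:  # stand-in for PGDATA
        pid_path = os.path.join(pgdata, "postmaster.pid")
        with open(pid_path, "w") as f:
            f.write(str(os.getpid()) + "\n")

        ready_r, ready_w = os.pipe()  # child -> parent: "lock is held"
        done_r, done_w = os.pipe()    # parent -> child: "you may exit"
        pid = os.fork()
        if pid == 0:
            # Child plays an old backend: it must lock the file itself,
            # because fcntl locks are not inherited across fork().
            status = 1
            try:
                os.close(ready_r)
                os.close(done_w)
                fd = os.open(pid_path, os.O_RDWR)
                if take_read_lock(fd):
                    os.write(ready_w, b"1")
                    os.read(done_r, 1)  # hold the lock until told to quit
                    status = 0
            finally:
                os._exit(status)

        # Parent plays the newly started postmaster.
        os.close(ready_w)
        os.close(done_r)
        os.read(ready_r, 1)             # wait until the backend holds its lock
        while_running = old_instance_alive(pid_path)
        os.write(done_w, b"x")
        os.waitpid(pid, 0)              # backend exit releases its lock
        after_exit = old_instance_alive(pid_path)
        os.close(ready_r)
        os.close(done_w)
        return while_running, after_exit
```

On a POSIX system with working fcntl locks, `demo()` should return `(True, False)`: the write-lock probe fails while the child still holds its read lock and succeeds once the child has exited. The window the message worries about is the interval between `os.fork()` and the child's `take_read_lock` call, during which the child exists but holds no lock.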