Better detection of staled postmaster.pid - Mailing list pgsql-hackers

From Pavel Raiskup
Subject Better detection of staled postmaster.pid
Date
Msg-id 1711927.hbtYs8Lf7C@nb.usersys.redhat.com
List pgsql-hackers
This is most likely just a request for brainstorming.

It has been reported [1] that postmaster fails to start against a stale
postmaster.pid after (e.g.) a power outage on Fedora.  This is due to init
system parallelism: some other newly started process can already have been
allocated the same PID as the old postmaster had, and in that case postmaster
refuses to delete the stale pidfile (which is expected, as we need to be
really careful).

Do you see some other check we could implement to guarantee that the PID
mentioned in postmaster.pid does not belong to a concurrent postmaster?
I can think of parsing /proc/<CONCURRENT_PID>/cmdline for an occurrence of
the '-D' option, but that is not terribly portable and it could be considered
racy, couldn't it?  Is there some acceptable hack we could use to tell other
processes that we are running on a particular data directory?
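
To make the /proc idea concrete, here is a rough sketch in Python (not
PostgreSQL code, and not a proposed patch).  The function names and the
proc_root parameter are made up for illustration.  It assumes a Linux-style
/proc where cmdline holds NUL-separated arguments, and it only handles the
"-D dir" and "-Ddir" spellings.  It cannot see a data directory passed via
the PGDATA environment variable, and the answer can be outdated as soon as
the file has been read, so it illustrates the weaknesses as much as the idea.

```python
import os
from typing import Optional


def data_dir_from_cmdline(raw: bytes) -> Optional[str]:
    """Return the value of the first -D option found in a cmdline blob.

    `raw` is the content of /proc/<pid>/cmdline: arguments separated
    (and usually terminated) by NUL bytes.  Returns None if no -D value
    is present.  Hypothetical helper, for illustration only.
    """
    args = raw.rstrip(b"\0").split(b"\0")
    # Skip argv[0], which is the program name.
    for i in range(1, len(args)):
        arg = args[i]
        if arg == b"-D":
            # Separate form: "-D", "<dir>"
            if i + 1 < len(args):
                return os.fsdecode(args[i + 1])
            return None
        if arg.startswith(b"-D"):
            # Attached form: "-D<dir>"
            return os.fsdecode(arg[2:])
    return None


def pid_may_use_data_dir(pid: int, data_dir: str,
                         proc_root: str = "/proc") -> Optional[bool]:
    """Guess whether process `pid` was started with -D pointing at data_dir.

    Returns True/False when a -D value was found, and None when nothing
    can be said (process gone, /proc unreadable, or no -D on the command
    line).  A relative -D value is resolved against *our* working
    directory, not the other process's, which is one more reason this is
    only a heuristic.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            raw = f.read()
    except OSError:
        return None
    found = data_dir_from_cmdline(raw)
    if found is None:
        return None
    return os.path.realpath(found) == os.path.realpath(data_dir)
```

Note that unrelated programs also use a -D flag with a different meaning, so
even a "match" on the option name alone proves little; only a mismatch of the
directory, or the absence of the option, gives a hint that the PID was reused.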

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1257334

Pavel



