Re: [HACKERS] possible self-deadlock window after badProcessStartupPacket - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: [HACKERS] possible self-deadlock window after badProcessStartupPacket |
Date | |
Msg-id | 20180720150323.v2uschqaedovm7ep@alap3.anarazel.de Whole thread Raw |
In response to | Re: [HACKERS] possible self-deadlock window after badProcessStartupPacket (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: [HACKERS] possible self-deadlock window after badProcessStartupPacket
|
List | pgsql-hackers |
On 2018-07-20 11:53:32 +0300, Heikki Linnakangas wrote: > ISTM that no-one has any great ideas on what to do about the ereport() in > quickdie(). I think we actually have some decent ideas how to make that less problematic in a number of cases. Avoid translation (thereby avoiding direct malloc()) and rely on the fact that the error message evaluation happens in a memory context with pre-allocated memory. That still leaves openssl, but is better than nothing. If we wanted, we additionally could force ssl allocations to happen through another context with preallocated memory. > But I think we have consensus on replacing the exit(2) calls > with _exit(2). If we do just that, it would be better than the status quo, > even if it doesn't completely fix the problem. This would prevent the case > that Asim reported, for starters. Right. You're planning to backpatch? > With _exit(), I think we wouldn't really need the "PG_SETMASK(&BlockSig);" > calls in the aux process *_quickdie handlers that don't do anything else > than call _exit(). But I didn't dare to remove them yet. I think we should remove it at least in master. > diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c > index 960d3de204..010d53a6ce 100644 > --- a/src/backend/postmaster/bgwriter.c > +++ b/src/backend/postmaster/bgwriter.c > @@ -404,22 +404,21 @@ bg_quickdie(SIGNAL_ARGS) > /* > * We DO NOT want to run proc_exit() callbacks -- we're here because > * shared memory may be corrupted, so we don't want to try to clean up our > - * transaction. Just nail the windows shut and get out of town. Now that > - * there's an atexit callback to prevent third-party code from breaking > - * things by calling exit() directly, we have to reset the callbacks > - * explicitly to make this work as intended. > - */ > - on_exit_reset(); > - > - /* > - * Note we do exit(2) not exit(0). This is to force the postmaster into a > - * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random > - * backend. This is necessary precisely because we don't clean up our > - * shared memory state. (The "dead man switch" mechanism in pmsignal.c > - * should ensure the postmaster sees this as a crash, too, but no harm in > - * being doubly sure.) > + * transaction. Just nail the windows shut and get out of town. > + * > + * There's an atexit callback to prevent third-party code from breaking > + * things by calling exit() directly. We don't want to trigger that, so > + * we use _exit(), which doesn't run atexit callbacks, rather than exit(). > + * And exit() wouldn't be safe to run from a signal handler, anyway. It's much less the exit() that's unsafe, than the callbacks themselves, right? Perhaps just restate that we wouldn't want to trigger atexit processing due to signal safety? > + * Note we use _exit(2) not _exit(0). This is to force the postmaster > + * into a system reset cycle if some idiot DBA sends a manual SIGQUIT to a > + * random backend. This is necessary precisely because we don't clean up > + * our shared memory state. (The "dead man switch" mechanism in > + * pmsignal.c should ensure the postmaster sees this as a crash, too, but > + * no harm in being doubly sure.) > */ > - exit(2); > + _exit(2); > } Greetings, Andres Freund
pgsql-hackers by date: