Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session |
Date | |
Msg-id | AANLkTikaaAR_g45N8xA5OHN9PjZUa+7AziQJotfvyqvJ@mail.gmail.com |
In response to | Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session |
List | pgsql-hackers |
On Tue, Aug 24, 2010 at 8:57 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Robert Haas wrote:
>> [moving to -hackers]
>>
>> On Thu, Aug 19, 2010 at 9:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > I suspect this is the same problem as bug #4897, and probably also the
>> > same problem as this:
>> > http://archives.postgresql.org/pgsql-bugs/2009-08/msg00114.php
>> >
>> > and maybe also this and this:
>> > http://archives.postgresql.org/pgsql-bugs/2010-02/msg00179.php
>> > http://archives.postgresql.org/pgsql-admin/2009-05/msg00105.php
>> >
>> > Unfortunately, it seems that no one has been able to get a stack trace yet.
>>
>> Bruce pointed out yet another report of this problem to me:
>>
>> http://archives.postgresql.org/pgsql-general/2010-08/msg00550.php
>>
>> After some discussion with Magnus, I think what is going on here is
>> that the postmaster kicks off a new child process, which terminates
>> before it actually starts running our code, either in OS-supplied code
>> or some sort of "filter" like anti-spam or anti-virus software. It's
>> presumably NOT dying in our code because - at least AFAICS - we don't
>> exit(128) anywhere. One way we could possibly improve the situation
>> is to not treat this as a child crash - that is, don't do a
>> crash-and-restart cycle; just treat that backend as having done
>> elog(FATAL). The trick is that you need a reliable way to distinguish
>> between a regular child crash and an "early" child crash. Magnus
>> suggested perhaps we could create a mutex that the child grabs before
>> mapping shared memory; the postmaster could check whether the mutex
>> had been taken. If so, we handle the crash normally; if not, we just
>> chalk it up to experience and continue on.
>>
>> This isn't really a "fix" for the bug in the sense that the nicest
>> thing of all would be to prevent the child from exiting abnormally in
>> the first place. But it's far from clear that we can control that.
>
> This URL has some interesting details on our problem:
>
> http://stackoverflow.com/questions/139090/getexitcodeprocess-returns-128
>
> Error code 128 is identified as:
>
> error code 128 ERROR_WAIT_NO_CHILDREN 128 0x80 There are no child
> processes to wait for
>
> and the suggested cause is:
>
> Have a look at Desktop Heap memory.
>
> Essentially the desktop heap issue comes down to exhausted resources (eg
> starting too many processes). When your app runs out of these resources,
> one of the symptoms is that you won't be able to start a new process,
> and the call to CreateProcess will fail with code 128.
>
> My guess is that at the time of CreateProcess(), there is enough desktop
> heap memory, but at some later time, perhaps caused by a logout, there
> isn't and the process never gets started.

Yeah, that seems very plausible, although exactly how to verify I don't know.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
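
For illustration, the "early" child crash check discussed in the message above might be sketched roughly as follows. This is only a minimal sketch in Python, not PostgreSQL code: the names `ChildExit`, `startup_mutex_taken`, `Action`, and `classify_child_exit` are hypothetical, and it assumes for simplicity that any nonzero exit code would otherwise count as a crash. It models only the postmaster-side decision, not the Windows mutex itself.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NORMAL_EXIT = "normal exit, nothing to do"
    TREAT_AS_FATAL = "treat like elog(FATAL), keep running"
    CRASH_RESTART = "crash-and-restart cycle"


@dataclass
class ChildExit:
    # Exit code reported for the dead child (128 in the reports above).
    exit_code: int
    # Hypothetical flag: did the child grab the startup mutex, i.e. did it
    # get far enough to be about to map shared memory?
    startup_mutex_taken: bool


def classify_child_exit(child: ChildExit) -> Action:
    """Decide how the parent should react to a dead child (sketch only)."""
    if child.exit_code == 0:
        return Action.NORMAL_EXIT
    if not child.startup_mutex_taken:
        # The child died before running any code that could have touched
        # shared memory, so shared state cannot have been corrupted by it.
        return Action.TREAT_AS_FATAL
    # The child had already reached shared memory: handle the crash normally.
    return Action.CRASH_RESTART


if __name__ == "__main__":
    for child in (ChildExit(128, False), ChildExit(128, True), ChildExit(0, True)):
        print(child, "->", classify_child_exit(child).value)
```

The point of the design is that the mutex state, rather than the exit code alone, separates the two cases: exit code 128 with the mutex untaken is ignored, while the same code with the mutex taken still triggers the normal crash handling.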