Thread: database died

database died

From

"Martin A. Marques"

Date:

14 March 2001, 08:33:09

I have a server running both Postgres and Squid. Squid is a real
CPU and memory hog, and during one of the inserts into the DB I got the
following in the log file, after which the postmaster died.

/dbs/postgres/bin/postmaster: reaping dead processes...
/dbs/postgres/bin/postmaster: CleanupProc: pid 27317 exited with status 0
2001-03-13 18:34:03 DEBUG:  proc_exit(0)
2001-03-13 18:34:03 DEBUG:  shmem_exit(0)
2001-03-13 18:34:03 DEBUG:  exit(0)
/dbs/postgres/bin/postmaster: reaping dead processes...
/dbs/postgres/bin/postmaster: CleanupProc: pid 27557 exited with status 0
CheckPoint Data Base: fork failed: Not enough space
invoking IpcMemoryCreate(size=1245184)
FindExec: found "/dbs/postgres/bin/postmaster" using argv[0] 

I'm running PostgreSQL 7.1beta5 on Solaris 7, compiled with gcc.

I think today we are going to reconfigure Squid so it doesn't eat so much
memory (other things on that server don't work well either).

Any ideas on this? I think the postmaster shouldn't die; at least that's
what I first thought.

-- 
System Administration: It's a dirty job, 
but someone told me I had to do it.
-----------------------------------------------------------------
Martín Marqués            email:     martin@math.unl.edu.ar
Santa Fe - Argentina        http://math.unl.edu.ar/~martin/
System administrator at math.unl.edu.ar
-----------------------------------------------------------------

Re: database died

From

Tom Lane

Date:

14 March 2001, 10:30:36

"Martin A. Marques" <martin@math.unl.edu.ar> writes:
> CheckPoint Data Base: fork failed: Not enough space
> [ whereupon postmaster quits ]

> Any ideas on this? I think the postmaster shouldn't die; at least that's
> what I first thought.

I agree.  Dying if the startup subjob fails is one thing, but dying
because a routine checkpoint fails is another.  The code is treating
those two cases alike however ... will change it.
        regards, tom lane
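
[Editor's illustration] The distinction drawn above, that a failed startup subjob may justify shutting down while a failed routine checkpoint should not, can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the code in postmaster.c: the names SubjobKind, ParentAction, on_fork_failure and launch_subjob are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; none are taken from postmaster.c. */
typedef enum { JOB_STARTUP, JOB_CHECKPOINT } SubjobKind;
typedef enum { ACTION_CONTINUE, ACTION_SHUTDOWN } ParentAction;

/*
 * Decide what the parent process does when fork() fails.
 * Assumption: a failed startup subjob is fatal, while a failed routine
 * checkpoint is only logged so that a later checkpoint can try again.
 */
ParentAction on_fork_failure(SubjobKind kind, int fork_errno)
{
    fprintf(stderr, "%s: fork failed: %s\n",
            kind == JOB_STARTUP ? "Startup" : "CheckPoint",
            strerror(fork_errno));

    if (kind == JOB_STARTUP)
        return ACTION_SHUTDOWN;   /* cannot run without a completed startup */

    return ACTION_CONTINUE;       /* skip this checkpoint, keep serving */
}

/*
 * Fork a child that runs work() and exits.
 * Returns the child's pid, or -1 with *action telling the caller
 * whether to keep running or shut down.
 */
pid_t launch_subjob(SubjobKind kind, void (*work)(void), ParentAction *action)
{
    pid_t pid;

    /* Flush stdio so buffered output is not duplicated in the child. */
    fflush(stdout);
    fflush(stderr);

    pid = fork();
    if (pid == 0)
    {
        work();
        _exit(0);
    }
    if (pid < 0)
    {
        *action = on_fork_failure(kind, errno);
        return -1;
    }

    *action = ACTION_CONTINUE;
    return pid;
}
```

With this split, a parent loop that gets -1 back from launch_subjob(JOB_CHECKPOINT, ...) would see ACTION_CONTINUE and simply wait for the next checkpoint interval, while the same failure for JOB_STARTUP would yield ACTION_SHUTDOWN.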

Re: database died

From

"Martin A. Marques"

Date:

14 March 2001, 16:16:28

On Wed 14 Mar 2001 12:02, Tom Lane wrote:
> "Martin A. Marques" <martin@math.unl.edu.ar> writes:
> > CheckPoint Data Base: fork failed: Not enough space
> > [ whereupon postmaster quits ]
> >
> > Any ideas on this? I think the postmaster shouldn't die; at least that's
> > what I first thought.
>
> I agree.  Dying if the startup subjob fails is one thing, but dying
> because a routine checkpoint fails is another.  The code is treating
> those two cases alike however ... will change it.

It just happened again. At the moment the Postgres on that machine is not in
production, but it should be in the near future. Is there a chance of getting
some kind of patch, or maybe of changing some configuration parameters of the
OS or the postmaster?
I'm starting to feel the pain in the back that one can get with Solaris.

Thanks for the feedback.

Regards... :-)

Re: Re: database died

From

Tom Lane

Date:

14 March 2001, 17:15:54

"Martin A. Marques" <martin@math.unl.edu.ar> writes:
>> I agree.  Dying if the startup subjob fails is one thing, but dying
>> because a routine checkpoint fails is another.  The code is treating
>> those two cases alike however ... will change it.

> It just happened again. At the moment the Postgres on that machine is not in
> production, but it should be in the near future. Is there a chance of getting
> some kind of patch, or maybe of changing some configuration parameters of the
> OS or the postmaster?

The fix is in CVS, pull it out if you need it:
http://www.postgresql.org/cgi/cvsweb.cgi/pgsql/src/backend/postmaster/postmaster.c
        regards, tom lane