Re: Further info : Very high load average but no cpu utilization ? - Mailing list pgsql-sql
From | Rajesh Kumar Mallah. |
---|---|
Subject | Re: Further info : Very high load average but no cpu utilization ? |
Date | |
Msg-id | 200205121116.30681.mallah@trade-india.com |
In response to | Re: Further info : Very high load average but no cpu utilization ? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Further info : Very high load average but no cpu utilization ? |
List | pgsql-sql |
Hi there,

I have observed that it is nearly impossible to get rid of the postmaster
or its backends with any signal once it decides not to quit. Even the OS
(Linux rh62) refuses to reboot in such a situation, and my system admin
had to power off the system, then run fsck and so on.

But this only happens when the postmaster is stuck for some reason. I
suspect that the postmaster's log file filling up was the reason my
postmaster got stuck.

regds
mallah.

On Saturday 11 May 2002 09:29 pm, Tom Lane wrote:
> "Rajesh Kumar Mallah." <mallah@trade-india.com> writes:
> > [root@linux10320 root2]# ps auxwww| grep post
> > postgres  1131  0.0  0.0 139424    4 ?  D  May1004/usr/local/pgsql/bin/postmaster
> > postgres  1132  0.0  0.0 140412    4 ?  D  May10  0:13 postgres: stats buffer process
> > postgres  1133  0.0  0.0 139576    4 ?  S  May10  0:18 postgres: stats collector process
> > postgres  8046  0.0  0.0 238712    4 ?  D  00:25  0:13 postgres: tradein tradein_clients 130.94.20.27 SELECT
> > postgres  8089  0.0  0.0 139812    4 ?  D  00:26  0:00 postgres: checkpoint subprocess
> > postgres 11442  0.0  0.0 218152    4 ?  D  04:25  0:03 postgres: tradein tradein_clients 130.94.20.27 SELECT
> > postgres 15453  0.1  0.0      0    0 ?  Z  08:17  0:09 [postmaster <defunct>]
> > postgres 15455  0.0  0.0      0    0 ?  Z  08:17  0:00 [postmaster <defunct>]
> > postgres 15456  0.0  0.0      0    0 ?  Z  08:18  0:00 [postmaster <defunct>]
> > postgres 15457  0.0  0.0      0    0 ?  Z  08:19  0:00 [postmaster <defunct>]
> > postgres 15462  0.0  0.0      0    0 ?  Z  08:20  0:01 [postmaster <defunct>]
>
> I think your postmaster is stuck; it should have reaped those defunct
> subprocesses instantly. Given that you also seem to have a stuck
> checkpoint process (8 hours to run a checkpoint?) there is probably
> something hosed in the interprocess communication logic, but it's hard
> to guess what from this amount of info.
>
> At this point probably your best bet is to kill all the running postgres
> processes (try SIGTERM first, then SIGKILL if that doesn't work) and
> launch a postmaster from a fresh start. Don't forget the ulimit this
> time.
>
> regards, tom lane

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L), 9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
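
For anyone following the "SIGTERM first, then SIGKILL" advice quoted above, here is a minimal Python sketch of that escalation. It is not from the thread: the names stop_process, is_running and grace_seconds are hypothetical, and the 10 second grace period is an arbitrary assumption. A False result means the process is still listed even after SIGKILL, which is consistent with the D (uninterruptible sleep) and Z (defunct) states in the ps listing: a D-state process does not act on any signal until its kernel wait ends, and a zombie has already exited and only disappears once its parent reaps it.

```python
import os
import signal
import time


def is_running(pid):
    """Return True if the pid still exists (this includes zombies)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def stop_process(pid, grace_seconds=10.0):
    """Send SIGTERM, then SIGKILL if the process outlives the grace period.

    Returns True once the pid is gone, False if it is still present
    after SIGKILL plus another grace period.
    """
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return True  # already gone
        deadline = time.monotonic() + grace_seconds
        while time.monotonic() < deadline:
            if not is_running(pid):
                return True
            time.sleep(0.2)
    return False
```

Calling stop_process(pid) for each backend pid, and for the postmaster last, would mirror the suggested order. If it returns False, more signals are unlikely to help; the ps state column should show whether the process is in D or Z.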