Cannot vacuum! Stops on first table, pg_type - Mailing list pgsql-admin
From | Palle Girgensohn |
---|---|
Subject | Cannot vacuum! Stops on first table, pg_type |
Date | |
Msg-id | 874s44sxdi.fsf@localhost.palle.se |
Responses | Re: Cannot vacuum! Stops on first table, pg_type |
List | pgsql-admin |
Hi!

This is really strange. Starting from the beginning: I was running postmaster with its logs redirected to a file (on a different filesystem). This had been going OK for quite some time. Suddenly the log filled up quickly, and I had to rm it and restart the server this Friday. Today it happened again; the log file area was filled up completely by postgres' 700 Mbyte log file. I threw it away immediately, so unfortunately I don't have these logs handy (I think I can find some on tape, though...umm...). I have now set up this machine to use rotatelogs(8) (from apache), like on most of my newer machines :)

Also, when this happened, two postgres processes ran amok, filling up the data directory with pg_temp.$$* files: about 25000 files (more or less empty, if memory serves me) before I realized this had happened (maybe a minute or two...?). The machine ran out of file descriptors; it was really hot for a while :) I shut down postmaster and removed the files, since nothing worked anymore while they were there (I guess postgres couldn't create more tempfiles in that dir :)

Now, $PGDATA/pg_log is about 480 Meg. Hardly normal, heh?

  # ls -l /usr/local/pgsql/data/pg_log
  -rw-------  1 pgsql  pgsql  471556096 Aug 29 02:09 /usr/local/pgsql/data/pg_log

But:

  # df -k
  Filesystem   1K-blocks     Used    Avail Capacity  Mounted on
  ...
  /dev/da0s1h    1986495   120813  1706763     7%    /usr/local/pgsql/data

Used 120 MB, but the file is 480 MB? One of those files with "holes" in them, I presume?

Anyway, please note that the PGDATA file system was never filled up, only the log FS.

Long story, sorry, but I thought it best if no details were left out...

The problem now: after the last incident, I cannot vacuum one database (the big one)... The other DBs are all right, but that doesn't help me... :-/

vacuum stops on the very first table to vacuum, pg_type:

  StartTransactionCommand
  query: vacuum;
  ProcessUtility: vacuum pg_type;
  DEBUG: --Relation pg_type--
  DEBUG: Pages 6: Changed 0, Reapped 2, Empty 0, New 0; Tup 354: Vac 22,
    Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 105, MaxLen109; Re-using:
    Free/Avail. Space 8160/176; EndEmpty/Avail. Pages 0/1. Elapsed 0/0 sec.

Here it can sit for hours (believe me, I've tried!) eating CPU time (top(1) reports 95-100%).

Killing off the vacuuming psql (kill -TERM) gives:

  proc_exit(0) [#0]
  shmem_exit(0) [#0]
  exit(0)

It seems the database and its application are working OK, but my guess is that some indices are broken, so things may fail that I cannot see immediately, and I can't vacuum to fix this...

Can I do anything besides pg_dump; dropdb; createdb; psql < dump.sql? This server is live, and I'd rather not take it down for more than seconds, if possible. Also, this should really be recoverable, right?

What about the large pg_log file?

I use this combo:

  FreeBSD-3.5 RELEASE
  Postgres 6.5.2

--
Palle
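
The "holes" guess about pg_log can be checked by comparing a file's apparent size (what ls -l prints) with the space actually allocated for it. Below is a minimal Python sketch, assuming a Unix-like system where os.stat() reports st_blocks in 512-byte units. The function names are illustrative only, and the comparison is a heuristic: filesystem compression or delayed allocation can also make a file look smaller on disk than its length.

```python
import os
import sys


def apparent_and_allocated(path):
    """Return (apparent size, allocated size) of a file, both in bytes."""
    st = os.stat(path)
    # st_size is the length that ls -l reports; st_blocks counts the
    # 512-byte blocks actually allocated on disk (Unix only).
    return st.st_size, st.st_blocks * 512


def looks_sparse(apparent, allocated):
    """Heuristic: True if the file claims more bytes than are allocated."""
    return allocated < apparent


if __name__ == "__main__":
    # Usage (hypothetical script name): python sparse_check.py FILE...
    for path in sys.argv[1:]:
        apparent, allocated = apparent_and_allocated(path)
        print(path, apparent, allocated, looks_sparse(apparent, allocated))
```

With the figures quoted above, pg_log is 471556096 bytes long while the whole filesystem has only 120813 1K-blocks in use, so looks_sparse(471556096, 120813 * 1024) is true. That is consistent with a file containing holes.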