Thread: Fwd: Stalled post to pgsql-es-ayuda
Hello everyone, I have a problem.
First, some basic data:
OS: Red Hat
PostgreSQL version: 8.3
Database size: 15 GB
The partition containing PGDATA (the data directory) filled up, so I moved the entire data directory to another partition and created a symbolic link back at the original PGDATA location.
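For clarity, the relocation was done roughly like this, with the server not running (a sketch; the paths here are placeholders, not the real ones):

$ mv /var/lib/pgsql/data /new_partition/pgdata        # move the whole data directory to the partition with free space
$ ln -s /new_partition/pgdata /var/lib/pgsql/data     # symbolic link back at the original PGDATA location
$ chown -h postgres:postgres /var/lib/pgsql/data      # keep the link owned by the postgres user
$ pg_ctl -D /var/lib/pgsql/data start                 # then try to start the server again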
Now, when I try to start the server, it stays stuck on "startup process recovering ...".
When I try to connect, this is the message:
psql: FATAL: the database system is starting up
and it never gets past that point.
I am attaching some additional information that may be relevant:
---------------------------------------------------------------------------------------------------------------
$ ps aux | grep postgres
root 24586 0.0 0.1 4920 1236 pts/2 S 10:52 0:00 su - postgres
postgres 24587 0.0 0.1 4540 1440 pts/2 S+ 10:52 0:00 -bash
root 27444 0.0 0.1 4920 1240 pts/1 S 14:40 0:00 su - postgres
postgres 27445 0.0 0.1 4536 1436 pts/1 S+ 14:40 0:00 -bash
root 27593 0.0 0.1 4920 1244 pts/3 S 14:44 0:00 su - postgres
postgres 27594 0.0 0.1 4540 1468 pts/3 S 14:44 0:00 -bash
postgres 27725 4.1 0.8 283372 7236 pts/3 S 14:47 0:00 /usr/bin/postgres
postgres 27726 0.0 0.0 13952 696 ? Ss 14:47 0:00 postgres: logger process
postgres 27727 2.0 0.5 283640 4828 ? Ds 14:47 0:00 postgres: startup process recovering 000000010000001A00000074
postgres 27730 0.0 0.1 4260 940 pts/3 R+ 14:48 0:00 ps aux
postgres 27731 0.0 0.0 3920 696 pts/3 S+ 14:48 0:00 grep postgres
---------------------------------------------------------------------------------------------------------------
$ pg_ctl status
pg_ctl: server is running (PID: 27725)
/usr/bin/postgres
---------------------------------------------------------------------------------------------------------------
$ pg_controldata
pg_control version number: 843
Catalog version number: 200904091
Database system identifier: 5409925551577885669
Database cluster state: in crash recovery
pg_control last modified: Wed 23 Jun 2010 11:20:42 AM CLT
Latest checkpoint location: 1A/73839208
Prior checkpoint location: 1A/73839208
Latest checkpoint's REDO location: 1A/7382D978
Latest checkpoint's TimeLineID: 1
Latest checkpoint's NextXID: 0/47874035
Latest checkpoint's NextOID: 51407
Latest checkpoint's NextMultiXactId: 1647347
Latest checkpoint's NextMultiOffset: 3431468
Time of latest checkpoint: Sun 30 May 2010 11:10:30 AM CLT
Minimum recovery ending location: 0/0
Maximum data alignment: 4
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 2000
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by reference
---------------------------------------------------------------------------------------------------------------
On the last day with transactions, the log ends as follows:
$ tail $PGDATA/pg_log/postgresql-Sun.log
ERROR: null value in column "protocolo" violates not-null constraint
STATEMENT: INSERT INTO trx_5(nodo, puerto, dnis, ani, protocolo, time, status, service, fecha, trama) VALUES (2, 1, 900, 0, (SELECT pk FROM protocolos WHERE data = 78), 247, (SELECT pk FROM status WHERE status = 55), (SELECT pk FROM services WHERE service = 0),now(), (SELECT pk FROM tramas WHERE equipo = 2 AND puerto = 1))
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_st
---------------------------------------------------------------------------------------------------------------
$ psql
psql: FATAL: the database system is starting up
---------------------------------------------------------------------------------------------------------------
I don't have lsof on the server, but this is the information I was able to gather:
With vmstat, %wa stays above 50%, while CPU usage is very low; postgres never goes above 2%.
Under /proc/27725/ (the postgres process; pid 27727 is the "postgres: startup process recovering 000000010000001A00000074" shown above), the io file shows an intensive write rate: roughly 1 GB every 20 minutes.
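For reference, those numbers come from commands roughly like these (a sketch; the pids are the ones from the ps output above):

$ vmstat 5                  # the wa column stays above 50
$ cat /proc/27725/io        # write_bytes keeps growing, roughly 1 GB every 20 minutes
$ cat /proc/27727/io        # the same check for the startup process doing the recovery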
Any suggestions?
--
Sebastian Machuca
Estudiante Ingeniería Civil en Computación
+56 9 77449117
http://twitter.com/serroba
Sebastian Machuca <serroba@gmail.com> writes:
> The partition containing PGDATA (the data directory) filled up, so I moved
> the entire data directory to another partition and created a symbolic link
> back at the original PGDATA location.
> Now, when I try to start the server, it stays stuck on "startup process
> recovering ...".
> When I try to connect, this is the message:
> psql: FATAL: the database system is starting up
> and it never gets past that point.

I think you just need to wait.

The thing that jumps out at me from your details is

> Time of latest checkpoint:            Sun 30 May 2010 11:10:30 AM CLT

If it hasn't checkpointed since May, then either you've got some seriously silly checkpoint settings or there was something blocking checkpoints. I'm guessing that it was the latter, and the reason you ran out of disk space was that WAL files kept on accumulating (since they can't be recycled until a checkpoint finishes).

So my theory is that you have got a *whole lot* of WAL to churn through, and there is nothing to do but wait for that to happen. You could get an idea of how long it will take by noting how fast the startup process is advancing through WAL segments (watch its display in "ps") and looking to see what the last WAL segment in $PGDATA/pg_xlog is.

Once you have a working database again, you should look into why checkpoints were blocked. Is there anything unusual in the postmaster log from before you had the out-of-space condition?

			regards, tom lane
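Concretely, checking that progress could look roughly like this (a sketch; 27727 is the startup process pid from the ps output above, and $PGDATA is assumed to point at the data directory):

$ watch -n 60 'ps -o cmd= -p 27727'                   # shows which WAL segment recovery is currently replaying
$ ls $PGDATA/pg_xlog | grep '^0' | sort | tail -n 1   # the last WAL segment that has to be replayed before startup finishes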