Re: production server down - Mailing list pgsql-hackers

From	Joe Conway
Subject	Re: production server down
Date	December 15, 2004 05:50:13
Msg-id	41BFD08A.5000501@joeconway.com
In response to	Re: production server down (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: production server down
List	pgsql-hackers

Tom Lane wrote:
>>...
>>pg_control last modified:             Tue Dec 14 15:39:26 2004
>>...
>>Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
> 
> [ blink... ]  That seems like an unreasonable gap between checkpoints,
> especially for a production server.  Can you see an explanation?

Hmmm, this is even more scary. We have two database clusters on this 
server, one on /replica/pgdata, and one on /production/pgdata (ignore 
the names -- /replica is actually the "production" instance at the moment).

# pg_controldata /replica/pgdata
pg_control version number:            72
Catalog version number:               200310211
Database cluster state:               shutting down
pg_control last modified:             Tue Dec 14 15:39:26 2004
Current log file ID:                  0
Next log file segment:                1
Latest checkpoint location:           0/9B0B8C
Prior checkpoint location:            0/9AA1B4
Latest checkpoint's REDO location:    0/9B0B8C
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        12
Latest checkpoint's NextXID:          536
Latest checkpoint's NextOID:          17142
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
Database block size:                  8192
Blocks per segment of large relation: 131072
Maximum length of identifiers:        64
Maximum number of function arguments: 32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C

# pg_controldata /production/pgdata
pg_control version number:            72
Catalog version number:               200310211
Database cluster state:               shutting down
pg_control last modified:             Tue Nov  2 21:57:49 2004
Current log file ID:                  0
Next log file segment:                1
Latest checkpoint location:           0/9B0B8C
Prior checkpoint location:            0/9AA1B4
Latest checkpoint's REDO location:    0/9B0B8C
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        12
Latest checkpoint's NextXID:          536
Latest checkpoint's NextOID:          17142
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
Database block size:                  8192
Blocks per segment of large relation: 131072
Maximum length of identifiers:        64
Maximum number of function arguments: 32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C
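
To check field by field which values differ between the two dumps above, a small script along these lines could help. This is only a sketch: `parse_controldata` and `changed_fields` are hypothetical helper names, and the two sample strings are short excerpts of the output pasted above rather than the full dumps.

```python
def parse_controldata(text):
    """Parse pg_controldata-style output into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only: timestamp values contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(text_a, text_b):
    """Return {field: (value_a, value_b)} for every field whose value differs."""
    a, b = parse_controldata(text_a), parse_controldata(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Excerpts copied from the two dumps above (not the complete output).
REPLICA = """\
Database cluster state:               shutting down
pg_control last modified:             Tue Dec 14 15:39:26 2004
Latest checkpoint location:           0/9B0B8C
Latest checkpoint's NextXID:          536
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
"""

PRODUCTION = """\
Database cluster state:               shutting down
pg_control last modified:             Tue Nov  2 21:57:49 2004
Latest checkpoint location:           0/9B0B8C
Latest checkpoint's NextXID:          536
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
"""

if __name__ == "__main__":
    for field, (a, b) in changed_fields(REPLICA, PRODUCTION).items():
        print(f"{field}: {a!r} vs {b!r}")
```

On these excerpts the only field reported is "pg_control last modified"; the checkpoint location, NextXID and checkpoint time all match.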

I have no idea how this happened, but those look identical except for 
the "last modified" date. The space used is quite what I'd expect:

# du -h --max-depth=1 /replica
403G    /replica/pgdata

# du -h --max-depth=1 /production
201G    /production/pgdata

The "/production/pgdata" cluster has not been in use since Nov 2. But 
we've been loading data aggressively into "/replica/pgdata".

Any theories on how we screwed up?

Joe
