Thread: db corruption/recovery help
Someone flipped a breaker switch, and evidently triggered corruption in
one of our major clusters:

$ cat server_log.Mon
postmaster successfully started
2005-06-06 14:31:11.950 [20124] LOG: database system was interrupted being in recovery at 2005-06-06 14:29:01 EDT
	This probably means that some data blocks are corrupted
	and you will have to use the last backup for recovery.
2005-06-06 14:31:11.950 [20124] LOG: checkpoint record is at EF/EBB7AFC8
2005-06-06 14:31:11.950 [20124] LOG: redo record is at EF/EBA91EF0; undo record is at 0/0; shutdown FALSE
2005-06-06 14:31:11.950 [20124] LOG: next transaction id: 577477594; next oid: 89750885
2005-06-06 14:31:11.951 [20124] LOG: database system was not properly shut down; automatic recovery in progress
2005-06-06 14:31:11.952 [20124] LOG: redo starts at EF/EBA91EF0
2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398
2005-06-06 14:31:12.275 [20121] LOG: startup process (pid 20124) was terminated by signal 6
2005-06-06 14:31:12.275 [20121] LOG: aborting startup due to startup process failure

We have backups from 10 hours earlier, but the obvious question: what,
if anything, can I do now to salvage those 10 hours of data from this?

I guess I could zero the block?  I'm a little uncertain since I don't
know what I'm zeroing (pg_database?  pg_class?), and can't start up to
see what that relfilenode maps to...

Going to look at it with pg_filedump, maybe oid2filename or whatever
that utility is...

Thanks,
Ed
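P.S. If I do end up zeroing the block, here's roughly what I have in
mind, once I know which relation 79189398 is.  Just a sketch, untested;
it assumes the default 8k block size, and the pg_class query only works
connected to the owning database on a postmaster that will actually
start (e.g. a copy restored elsewhere).  The <dboid> below is a
placeholder for whatever database subdirectory the file lives in:

    -- map the relfilenode back to a relation name
    SELECT relname FROM pg_class WHERE relfilenode = 79189398;

    # back up the file first, then zero out just block 22376
    dd if=/dev/zero of=$PGDATA/base/<dboid>/79189398 \
       bs=8192 seek=22376 count=1 conv=notrunc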
On Monday June 6 2005 2:16 pm, Ed L. wrote:
> Someone flipped a breaker switch, and evidently triggered
> corruption in one of our major clusters:

BTW, this is a 7.3.4 cluster ...

Ed
On Mon, 2005-06-06 at 15:16, Ed L. wrote:
> Someone flipped a breaker switch, and evidently triggered
> corruption in one of our major clusters:
>
> $ cat server_log.Mon
> postmaster successfully started
> 2005-06-06 14:31:11.950 [20124] LOG: database system was interrupted being in recovery at 2005-06-06 14:29:01 EDT
> 	This probably means that some data blocks are corrupted
> 	and you will have to use the last backup for recovery.
> 2005-06-06 14:31:11.950 [20124] LOG: checkpoint record is at EF/EBB7AFC8
> 2005-06-06 14:31:11.950 [20124] LOG: redo record is at EF/EBA91EF0; undo record is at 0/0; shutdown FALSE
> 2005-06-06 14:31:11.950 [20124] LOG: next transaction id: 577477594; next oid: 89750885
> 2005-06-06 14:31:11.951 [20124] LOG: database system was not properly shut down; automatic recovery in progress
> 2005-06-06 14:31:11.952 [20124] LOG: redo starts at EF/EBA91EF0
> 2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398
> 2005-06-06 14:31:12.275 [20121] LOG: startup process (pid 20124) was terminated by signal 6
> 2005-06-06 14:31:12.275 [20121] LOG: aborting startup due to startup process failure
>
> We have backups from 10 hours earlier, but the obvious
> question: what, if anything, can I do now to salvage those
> 10 hours of data from this?
>
> I guess I could zero the block?  I'm a little uncertain since
> I don't know what I'm zeroing (pg_database?  pg_class?), and
> can't start up to see what that relfilenode maps to...
>
> Going to look at it with pg_filedump, maybe oid2filename or
> whatever that utility is...

OK, if postgresql is running on hardware that does NOT lie about
fsyncing, and it is set to fsync, this should NEVER happen.  The fact
that it happened means either A: the hardware is lying / broken (like
default IDE drives or a RAID controller with a non-battery-backed
cache) or B: the database was set to not fsync.  Since the default IS
to fsync, this would require effort on your part to turn it off.

Now, once the database comes up like this, your data may or may not be
salvageable, and may or may not be coherent.  I.e. you may have
orphaned records where you shouldn't, or duplicate entries in some
unique index or something like that.

Look for pg_resetxlog.  There's a man page for it on my machine, but
I'm not sure about 7.3.x.

Before doing anything, make a file system level backup of all your data
so you have a pristine set to play about with.
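Something along these lines -- just a sketch, with the postmaster shut
down, and I'm going from memory on the pg_resetxlog invocation for 7.3,
so check the docs before running it.  Adjust the data directory path
for your install:

    # pristine copy of the whole cluster before touching anything
    tar cf - /path/to/data | gzip > /backup/pgdata-pristine.tar.gz

    # last resort: throw away the WAL so the postmaster can start.
    # This can lose or leave inconsistent any recently committed data,
    # so only do it after the backup above.
    pg_resetxlog /path/to/data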
On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:
> On Mon, 2005-06-06 at 15:16, Ed L. wrote:
> > Someone flipped a breaker switch, and evidently triggered
> > corruption in one of our major clusters:
>
> OK, if postgresql is running on hardware that does NOT lie
> about fsyncing, and it is set to fsync, this should NEVER
> happen.

This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box with
fsync = TRUE, built with gcc 3.2.2.  Database is entirely on a SAN.

We got very lucky: the corrupted database was expendable (essentially a
log database).  I was able to just move the data/base/NNNN directory
off to the side, restart, drop the corrupted db, and recreate schema...

Thanks,
Ed
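P.S. For the archives, the salvage sequence was essentially the
following.  NNNN is the OID-named subdirectory of the corrupted
database, and "logdb" / "schema.sql" are stand-in names here, not the
real ones.  Only sane because we could afford to lose that database:

    # postmaster stopped; keep the damaged files for later inspection
    mv $PGDATA/base/NNNN /elsewhere/base.NNNN.corrupt

    pg_ctl -D $PGDATA start
    dropdb logdb
    createdb logdb
    psql logdb < schema.sql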
On Monday June 6 2005 3:29 pm, Ed L. wrote:
> On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:
> > On Mon, 2005-06-06 at 15:16, Ed L. wrote:
> > > Someone flipped a breaker switch, and evidently triggered
> > > corruption in one of our major clusters:
> >
> > OK, if postgresql is running on hardware that does NOT lie
> > about fsyncing, and it is set to fsync, this should NEVER
> > happen.
>
> This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box
> with fsync = TRUE, built with gcc 3.2.2.  Database is entirely
> on a SAN.
>
> We got very lucky: the corrupted database was expendable
> (essentially a log database).  I was able to just move the
> data/base/NNNN directory off to the side, restart, drop the
> corrupted db, and recreate schema...

The SAN never lost power, only the system itself.  I'd really like to
chase this to the root if possible.  Ideas?

Ed
On Mon, 2005-06-06 at 16:39, Ed L. wrote:
> On Monday June 6 2005 3:29 pm, Ed L. wrote:
> > On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:
> > > On Mon, 2005-06-06 at 15:16, Ed L. wrote:
> > > > Someone flipped a breaker switch, and evidently triggered
> > > > corruption in one of our major clusters:
> > >
> > > OK, if postgresql is running on hardware that does NOT lie
> > > about fsyncing, and it is set to fsync, this should NEVER
> > > happen.
> >
> > This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box
> > with fsync = TRUE, built with gcc 3.2.2.  Database is entirely
> > on a SAN.
> >
> > We got very lucky: the corrupted database was expendable
> > (essentially a log database).  I was able to just move the
> > data/base/NNNN directory off to the side, restart, drop the
> > corrupted db, and recreate schema...
>
> The SAN never lost power, only the system itself.  I'd really
> like to chase this to the root if possible.  Ideas?

It sounds like somewhere between postgresql and the SAN connector going
out the back, something is lying about fsync.  I'm not that familiar
with lots of different SAN setups, so you might want to describe how
things are set up and see if anyone else knows more about them than me.
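In the meantime it's cheap to confirm the running server agrees with
the config file about fsync, even though you already said it's TRUE --
something like:

    psql -c "show fsync;" template1
    grep -i fsync /path/to/data/postgresql.conf

Beyond that, the usual suspect is a write cache somewhere in the path
(FC adapter or array controller) acknowledging writes before they're
really on stable storage, same idea as the non-battery-backed RAID
cache I mentioned.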
"Ed L." <pgsql@bluepolka.net> writes: > Someone flipped a breaker switch, and evidently triggered > corruption in one of our major clusters: > 2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398 It's possible that you are one minor release short of having dodged this problem, as I see in the 7.3.5 CVS log 2003-12-01 11:53 tgl * src/backend/storage/buffer/bufmgr.c (REL7_3_STABLE): Force zero_damaged_pages to be effectively ON during recovery from WAL, since there is no need to worry about damaged pages when we are going to overwrite them anyway from the WAL. Per recent discussion. This doesn't really address the question of how the page header got clobbered in the first place, though. Did you by any chance make a dump to see what data was in there? regards, tom lane
On Monday June 6 2005 11:15 pm, Tom Lane wrote:
> It's possible that you are one minor release short of having
> dodged this problem, as I see in the 7.3.5 CVS log
>
> 	* src/backend/storage/buffer/bufmgr.c (REL7_3_STABLE): Force
> 	zero_damaged_pages to be effectively ON during recovery from
> 	WAL, since there is no need to worry about damaged pages when
> 	we are going to overwrite them anyway from the WAL.  Per
> 	recent discussion.

I remember all too well.

> This doesn't really address the question of how the page
> header got clobbered in the first place, though.  Did you by
> any chance make a dump to see what data was in there?

Well, I do have a copy of the corrupted database/file, if that's what
you mean.  I rebuilt the 7.3.4 source (which I don't normally delete)
so I could build pg_filedump 2.0, but then pg_filedump wouldn't build.
Not sure why; I didn't have the luxury of digging deeper.  Maybe I can
find a pg_filedump lying around somewhere here, or sort out the build
issue...

Ed
On Monday June 6 2005 11:15 pm, Tom Lane wrote:
> This doesn't really address the question of how the page
> header got clobbered in the first place, though.  Did you by
> any chance make a dump to see what data was in there?

I couldn't start the postmaster at all with that data in the cluster,
so no dump.  But I do have the corrupted file.  Haven't gotten free yet
to set up pg_filedump on it.

The hardware setup is an HP RP5470 4-way running 11.00, with dual HP
Tachyon XL2 Fibre Channel Mass Storage Adapters going into dual Cisco
9509 SAN switches to dual EVA 5000 disk arrays, with CPQswsp
A.3.0B.01F.00F (patch upgrade for the Sanworks Secure Path device
driver and utilities).  Not sure that reveals much of the problem...

Ed
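P.S. When I do get pg_filedump built, the plan is roughly the
following, run against the saved copy of the file.  I'm going from
memory on the switches, so treat it as a sketch:

    # dump just the damaged block, formatted hex plus item interpretation
    pg_filedump -i -f -R 22376 22376 /elsewhere/base.NNNN.corrupt/79189398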