Re: Corrupt index - Mailing list pgsql-general
From | Amir Becher |
---|---|
Subject | Re: Corrupt index |
Date | |
Msg-id | 20030410203247.63208.qmail@web13902.mail.yahoo.com |
In response to | Re: Corrupt index (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Corrupt index
|
List | pgsql-general |
The source of the problem was the VERITAS Backup Exec. I was able to replicate the index corruption in under a minute by running a backup and updating the database at the same time. I had never been able to replicate the problem before because I was testing during the day, when the backup was not running.

As far as backups are concerned, we will no longer back up the data directory itself - that was clearly a dumb thing to do in the first place. We actually have been backing up the data using pg_dumpall as well (so there is still hope for us).

Thanks for all the help - I greatly appreciate it.

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amir Becher <abecher@yahoo.com> writes:
> > I don't know if this may have something to do with it,
> > but we do backup the data every night using VERITAS
> > Backup Exec. We are not restoring anything, though
> > (the data is backed up to tape). The VERITAS software
> > runs on Windows, but there is an agent that runs on
> > our Linux box where the PostgreSQL data is stored. I
> > should also mention that the backup is running while
> > the database is being modified (we modify the database
> > 24/7).
>
> You're wasting your time making such a backup --- if you ever have to
> use it, it'll be corrupt, because the individual files in the database
> won't be in sync. But that's not the immediate problem.
>
> > There is another unexpected behavior that I noticed
> > for the first time this morning (so I am not sure if
> > it's recurring, related or relevant). The database
> > "blinked" in the sense that all database connections
> > were lost - but new connections could be obtained
> > immediately after the "blink". The error message that
> > I got said something about possible "corrupted shared
> > memory" and I guess the shutting down of the
> > connections was a precautionary measure.
>
> That sounds like a backend crash, all right. Given that, I'm thinking
> that you have more extensive problems than just this one symptom. The
> odds are good that it's a hardware issue, because we haven't heard any
> reports of comparable misbehavior from anyone else.
>
> I'd recommend running some hardware diagnostics --- memtest86 and
> badblocks seem to be the most widely used, although they aren't always
> able to find problems.
>
> It would also be a good idea to start taking some *real* backups, using
> pg_dump or pg_dumpall. You will be lucky if you don't find any more
> serious corruption in the database, if I'm right that there's hardware
> flakiness involved. You may find yourself forced to initdb and restore
> from a backup, so you'd better have one.
>
> regards, tom lane
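For readers of the archive, the fix described in this message (stop copying the live data directory, take a logical dump with pg_dumpall, and let the tape software pick up the dump file) might be sketched roughly as follows. This is only an illustration, not the poster's actual setup: the helper names (`dump_command`, `dump_path`, `run_dump`), the backup directory, and the default `postgres` user are all assumptions. The only external tool it relies on is `pg_dumpall` with its `-U` and `-h` connection options, with the dump written to standard output.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_command(user="postgres", host=None):
    """Build the pg_dumpall argument list. The user and host values are
    assumptions for illustration; pg_dumpall writes the dump to stdout."""
    cmd = ["pg_dumpall", "-U", user]
    if host is not None:
        cmd += ["-h", host]
    return cmd


def dump_path(backup_dir, now=None):
    """Return a timestamped file name inside backup_dir (hypothetical layout)."""
    now = now or datetime.now(timezone.utc)
    return Path(backup_dir) / f"pg_dumpall-{now:%Y%m%dT%H%M%SZ}.sql"


def run_dump(backup_dir, user="postgres", host=None):
    """Write a logical dump to a .partial file, then rename it into place,
    so a file-level backup tool never picks up a half-written dump."""
    target = dump_path(backup_dir)
    partial = target.with_name(target.name + ".partial")
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(partial, "wb") as out:
        # check=True raises CalledProcessError if pg_dumpall exits non-zero;
        # in that case the .partial file is left behind and never renamed.
        subprocess.run(dump_command(user, host), stdout=out, check=True)
    partial.replace(target)
    return target
```

A nightly job could call `run_dump("/var/backups/pgsql")` (a made-up path) before the tape backup starts, and the tape software would then be pointed at that directory rather than at the PostgreSQL data directory.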