Database corruption in 7.0.3 - Mailing list pgsql-hackers
From | Tim Allen |
---|---|
Subject | Database corruption in 7.0.3 |
Date | |
Msg-id | Pine.LNX.4.21.0103151916550.16580-100000@bee.proximity.com.au |
Responses |
Re: Database corruption in 7.0.3
Re: Database corruption in 7.0.3 |
List | pgsql-hackers |
We have an application that we were running quite happily using pg6.5.3 in various customer sites. Now we are about to roll out a new version of our application, and we are going to use pg7.0.3. However, in testing we've come across a couple of isolated incidents of database corruption. They are sufficiently rare that I can't reproduce the problem, nor can I put my finger on just what application behaviour causes the problems.

The symptoms most often involve some sort of index corruption, which is reported by vacuum and it seems that vacuum can fix it. On occasion vacuum reports "invalid OID" or similar (sorry, don't have exact wording of message). On one occasion the database has been corrupted to the point of unusability (ie vacuum admitted that it couldn't fix the problem), and a dump/restore was required (thankfully that at least worked). The index corruption also occasionally manifests itself in the form of spurious uniqueness constraint violation errors.

The previous version of our app using 6.5.3 has never shown the slightest symptom of database misbehaviour, to the best of my knowledge, despite fairly extensive use. So our expectations are fairly high :-).

One thing that is different about the new version of our app is that we now use multiple connections to the database (previously we only had one). We can in practice have transactions in progress on several connections at once, and it is possible for some transactions to be rolled back under application control (ie explicit ROLLBACK; statement).

I realise I haven't really provided an awful lot of information that would help identify the problem, so I shall attempt to be understanding if no-one can offer any useful suggestions. But I hope someone can :-). Has anyone seen this sort of problem before? Are there any known database-corrupting bugs in 7.0.3? I don't recall anyone mentioning any in the mailing lists. Is using multiple connections likely to stimulate any known areas of risk?

BTW we are using plain vanilla SQL, no triggers, no new types defined, no functions, no referential integrity checks, nothing more ambitious than a multi-column primary key. The platform is x86 Red Hat Linux 6.2. Curiously enough, on one of our testing boxes and on my development box we have never seen this, but we have seen it several times on our other test box and at least one customer site, so there is some possibility it's related to dodgy hardware. The customer box with the problem is a multi-processor box, all the other boxes we've tested on are single-processor.

TIA for any help,

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/
  http://www4.tpg.com.au/users/rita_tim/