Thread: can't find files in pg_clog and invalid page headers
I am running postgres 7.4Beta1 and have found a problem; I'm not sure if it's a bug or something I did wrong. I ran ANALYZE on a large table while the database was receiving data. After running it on just my large table, I decided to run it on my entire database. When I issued the database-wide analyze I got an error:

data=# analyze;
ERROR: could not access status of transaction 739760128
DETAIL: open of file "/usr/local/pgsql/data/pg_clog/02C1" failed: No such file or directory

There is only one file in pg_clog; it's called 0000. I ran it again and got:

data=# analyze;
ERROR: could not access status of transaction 3942652937
DETAIL: open of file "/usr/local/pgsql/data/pg_clog/0EB0" failed: No such file or directory

Then I tried running analyze just on my large table again and got:

data=# ANALYZE "DetailedMeters";
ERROR: invalid page header in block 10836 of "DetailedMeters"

Now whenever I try to read all of "DetailedMeters" I get that error message, though the block number can change (10832, 10834, 10836).

Anyone know what I did to cause this? Or is this some sort of bug or hardware failure?

---- Adam Kavan
---- akavan@cox.net
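A note on the file names in those errors: each pg_clog file holds the commit status for a fixed range of transaction IDs, and the file name is that range's index in hex. The sketch below reproduces the mapping for the two transaction IDs reported above. It assumes the default 8 kB block size, 2 status bits per transaction, and 32 pages per segment file; the constant and function names are illustrative and not taken from the poster's installation.

```python
# Sketch: map a transaction ID to the pg_clog segment file that would hold
# its commit status. The constants are assumptions (default 8 kB block size,
# 2 status bits per transaction, 32 pages per segment file).
BLOCK_SIZE = 8192
XACTS_PER_BYTE = 4            # 2 status bits per transaction
PAGES_PER_SEGMENT = 32
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT  # 1,048,576


def clog_segment_name(xid: int) -> str:
    """Return the four-hex-digit segment file name for a transaction ID."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


if __name__ == "__main__":
    # The two transaction IDs from the error messages in this post.
    for xid in (739760128, 3942652937):
        print(xid, "->", clog_segment_name(xid))
```

Under these assumptions the two IDs map to 02C1 and 0EB0, the files named in the errors, whereas a cluster whose only segment is 0000 has used fewer than about a million transaction IDs. That would be consistent with the IDs being garbage read from a damaged page rather than with clog files having gone missing.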
Adam Kavan <akavan@cox.net> writes:
> [unstable failure messages]
> Anyone know what I did to cause this? Or is this some sort of bug or
> hardware failure?

I think you've got a hardware problem. Have you run memtest86 or
badblocks lately?

regards, tom lane
Adam Kavan <akavan@cox.net> writes:
> At 03:19 PM 8/14/03 -0400, Tom Lane wrote:
>> I think you've got a hardware problem. Have you run memtest86 or
>> badblocks lately?

> My machine has passed badblocks and memtest86 with flying colors. To fix
> the table I did a TRUNCATE on it which seems to have worked, but I'm left
> with a couple of questions. First, will a TRUNCATEd page wind up in
> another table and break it or is that bit fixed for good.

TRUNCATE would just return the page back to the filesystem, which would
eventually reallocate it to some other file (or even the same file). If
it was a one-time problem then it's gone, but I still suspect that your
hardware dropped some bits, and if so it may do so again.

FWIW, I have heard reports of disk problems that were not found by
badblocks unless you used the "destructive test" option.

regards, tom lane
At 03:19 PM 8/14/03 -0400, Tom Lane wrote:
>Adam Kavan <akavan@cox.net> writes:
> > [unstable failure messages]
> > Anyone know what I did to cause this? Or is this some sort of bug or
> > hardware failure?
>
>I think you've got a hardware problem. Have you run memtest86 or
>badblocks lately?
>
> regards, tom lane

My machine has passed badblocks and memtest86 with flying colors. To fix the table I did a TRUNCATE on it which seems to have worked, but I'm left with a couple of questions. First, will a TRUNCATEd page wind up in another table and break it or is that bit fixed for good. Secondly, I have been unable to recreate the error, so I feel it was probably just a fluke. Does anyone disagree?

--- Adam Kavan
--- akavan@cox.net