TOAST table repeatedly corrupted - Mailing list pgsql-bugs
From | Niles Oien |
---|---|
Subject | TOAST table repeatedly corrupted |
Date | |
Msg-id | CANQ3m6Nhq8VqtZ8SnC3ZLVdQf_1Upb1R+hsPpbFsS7fJOw+Jqg@mail.gmail.com Whole thread Raw |
Responses |
TOAST table repeatedly corrupted
Re: TOAST table repeatedly corrupted Re: TOAST table repeatedly corrupted |
List | pgsql-bugs |
I have what I think is a bug in postgres. I'd like to help track it down, if that
is indeed what's going on.
I am running a reasonably recent version of postgres :
data=# select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
(1 row)
About every six months, the logs show that vacuuming of
one of our tables (the same table every time) starts to fail :
grep ERROR postgresql-09.csv
2018-05-09 16:14:03.834 GMT,,,27018,,5af31e4b.698a,1,,2018-05-09 16:14:03 GMT,12/611211,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:15:03.865 GMT,,,27696,,5af31e87.6c30,1,,2018-05-09 16:15:03 GMT,12/611219,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:16:03.897 GMT,,,31666,,5af31ec3.7bb2,1,,2018-05-09 16:16:03 GMT,10/1330957,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
So it looks like a TOAST table has been corrupted. I can see which
table the TOAST table is supporting :
data=# select relname from pg_class where reltoastrelid = 'pg_toast.pg_toast_36298637'::regclass;
relname
---------------
rdvtrack_fd05
(1 row)
And sure enough, I now cannot dump that table :
linux> pg_dump -p5432 -Upostgres -t hmi.rdvtrack_fd05 data > hmi_rdvtrack_fd05.sql
pg_dump: Dumping the contents of table "rdvtrack_fd05" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: compressed data is corrupted
pg_dump: The command was: COPY hmi.rdvtrack_fd05 (recnum, sunum, slotnum, sessionid, sessionns, cparms_sg000, v_bzero, v_bscale, carrrot, cmlon, lonhg, lathg, loncm, midtime, crval1, crval2, crval3, t_start, t_stop, lonspan, coverage, zonalvel, mai, module, source, input, created, bld_vers, backgrnd, rejectlist, cmlon_index, loncm_index, lathg_index, lonhg_index, sg_000_file, sg_001_file, history, datavals, missvals, datamin, datamax, datamean, datarms, dataskew, datakurt, meanmu, meanlat) TO stdout;
I think I can see the disk files where the tables are stored :
data=# select pg_relation_filepath('hmi.rdvtrack_fd05');
pg_relation_filepath
----------------------
base/16384/36298637
(1 row)
data=# select pg_relation_filepath('pg_toast.pg_toast_36298637');
pg_relation_filepath
----------------------
base/16384/36298640
(1 row)
I can run pg_filedump on those two files, in both cases all
the flags seem unremarkable (although I may not be knowing
what to look for) and it runs to the end of the file, giving
me a message like :
*** End of File Encountered. Last Block Read: 131071 ***
I think this is probably a bug? Every time it happens
it affects the same table, hmi.rdvtrack_fd05. The hardware
we're running on is fairly new, disks and memory are
not throwing errors, and we've never had problems with any
other tables. We're running on CentOS 6.8 - again, not brand
new but hardly decrepit, either. I've rebuilt the table
several times. This time I'd like to give the PG community
a chance to chase it down before I do that.
I've put the two data files, the postgres config and
essentially what I've said here online at :
If it helps, the TOAST table looks like this :
data=# \d pg_toast.pg_toast_36298637
TOAST table "pg_toast.pg_toast_36298637"
Column | Type
------------+---------
chunk_id | oid
chunk_seq | integer
chunk_data | bytea
And the table the TOAST table is supporting looks like this :
data=# \d hmi.rdvtrack_fd05
Table "hmi.rdvtrack_fd05"
Column | Type | Modifiers
--------------+------------------+-----------
recnum | bigint | not null
sunum | bigint |
slotnum | integer |
sessionid | bigint |
sessionns | text |
cparms_sg000 | text |
v_bzero | double precision |
v_bscale | double precision |
carrrot | integer |
cmlon | real |
lonhg | real |
lathg | real |
loncm | real |
midtime | double precision |
crval1 | real |
crval2 | real |
crval3 | double precision |
t_start | double precision |
t_stop | double precision |
lonspan | real |
coverage | real |
zonalvel | real |
mai | real |
module | text |
source | text |
input | text |
created | double precision |
bld_vers | text |
backgrnd | text |
rejectlist | text |
cmlon_index | bigint |
loncm_index | bigint |
lathg_index | bigint |
lonhg_index | bigint |
sg_000_file | text |
sg_001_file | text |
history | text |
datavals | integer |
missvals | integer |
datamin | real |
datamax | real |
datamean | double precision |
datarms | double precision |
dataskew | double precision |
datakurt | double precision |
meanmu | real |
meanlat | real |
Indexes:
"rdvtrack_fd05_pkey" PRIMARY KEY, btree (recnum)
"rdvtrack_fd05_carrrot" btree (carrrot)
"rdvtrack_fd05_cmlon_index" btree (cmlon_index)
"rdvtrack_fd05_lathg_index" btree (lathg_index)
"rdvtrack_fd05_loncm_index" btree (loncm_index)
"rdvtrack_fd05_lonhg_index" btree (lonhg_index)
Niles Oien, noien@nso.edu
Niles Oien, National Solar Observatory, Boulder Colorado USA
pgsql-bugs by date: