Thread: missing chunk number 0 for toast value 25693266 in pg_toast_25497233
I'm running into this issue again:

psql --version
psql (PostgreSQL) 8.3.7

COPY items_extended TO '/dev/null';
ERROR: missing chunk number 0 for toast value 25693266 in pg_toast_25497233

Unfortunately, I do not know where these are coming from, and I cannot
replicate the data in at least one of my tables (which has 20 million
records) because of this. I've already found 10 bad records. There are
more.

I have four tables with 20 million records (and a fifth which has even
more) that may have this problem. Right now, checking every record one
by one is going to take *DAYS* to complete, and I'm in the middle of
adding additional nodes to our cluster because we are already stretched
to the limit.

I have a few options, such as checking batches of records and spinning
up multiple checkers in parallel, each looking at a different subset of
the data on WAL-shipped spares (assuming the WAL-shipped spares would
suffer the same problem, which is a big assumption), but this is a lot
of effort to get going.

Help!

Thanks,
Bryan
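One concrete form of the batch-checking idea is a range checker that reads
the table in fixed-size primary-key windows and records which windows fail
to detoast; several copies can then be pointed at disjoint ranges in
parallel. The sketch below is illustrative only: it assumes psycopg2 and an
integer primary key (hypothetically named "id"), and the DSN and table
names are placeholders.

# Sketch only: the DSN, table, and "id" key column are assumptions.
import psycopg2

def check_range(dsn, table, pk, lo, hi, batch=10000):
    """Return the (start, stop) key windows whose rows fail to detoast."""
    bad = []
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    start = lo
    while start < hi:
        stop = min(start + batch, hi)
        try:
            # Returning the rows to the client forces every TOASTed column
            # to be detoasted, which is what trips the "missing chunk" error.
            cur.execute(
                "SELECT * FROM {0} WHERE {1} >= %s AND {1} < %s".format(table, pk),
                (start, stop))
            cur.fetchall()
        except psycopg2.DatabaseError:
            conn.rollback()            # clear the aborted transaction, move on
            bad.append((start, stop))
        start = stop
    conn.close()
    return bad

# e.g. run one worker per key slice, each on a different range in parallel:
# check_range("dbname=mydb", "items_extended", "id", 0, 5000000)

Only the windows reported bad then need to be rechecked row by row.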
On Fri, May 7, 2010 at 01:24, Bryan Murphy <bmurphy1976@gmail.com> wrote:
> I'm running into this issue again:
>
> psql --version
> psql (PostgreSQL) 8.3.7
>
> COPY items_extended TO '/dev/null';
> ERROR: missing chunk number 0 for toast value 25693266 in pg_toast_25497233
>
> Unfortunately, I do not know where these are coming from, and I cannot
> replicate the data in at least one of my tables (which has 20 million
> records) because of this. I've already found 10 bad records. There are
> more.
>
> I have four tables with 20 million records (and a fifth which has even
> more) that may have this problem. Right now, checking every record one
> by one is going to take *DAYS* to complete, and I'm in the middle of
> adding additional nodes to our cluster because we are already stretched
> to the limit.
>
> I have a few options, such as checking batches of records and spinning
> up multiple checkers in parallel, each looking at a different subset of
> the data on WAL-shipped spares (assuming the WAL-shipped spares would
> suffer the same problem, which is a big assumption), but this is a lot
> of effort to get going.

Try doing a binary search with LIMIT. E.g., if you have 20M records, do
a SELECT * FROM ... LIMIT 10M and throw away the results. If that
breaks, the bad record is in the lower half, so narrow down there; if
not, check the upper half (with OFFSET).

If you have a serial primary key or something, you can use WHERE on it,
which will likely be a lot faster than using LIMIT, but the same idea
applies - do a binary search. Should take a lot less than days, and is
reasonably easy to script.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
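A rough sketch of that LIMIT/OFFSET binary search, assuming psycopg2 and a
stable scan order for the table; the DSN, table name, and row count are
placeholders, not part of the original suggestion:

# Sketch only: narrows a failing full-table read down to one row position.
import psycopg2

def probe(cur, table, offset, limit):
    """True if the `limit` rows starting at `offset` can be read (detoasted)."""
    try:
        cur.execute(
            "SELECT * FROM {} OFFSET %s LIMIT %s".format(table), (offset, limit))
        cur.fetchall()
        return True
    except psycopg2.DatabaseError:
        cur.connection.rollback()   # clear the aborted transaction before retrying
        return False

def find_bad_offset(dsn, table, total_rows):
    """Binary-search the offset of the first unreadable row.
    Assumes the full-table read is already known to fail."""
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    lo, hi = 0, total_rows              # invariant: a bad row lies in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(cur, table, lo, mid - lo):
            lo = mid                    # rows [lo, mid) are clean; look higher
        else:
            hi = mid                    # failure reproduced below mid; look lower
    conn.close()
    return lo

# find_bad_offset("dbname=mydb", "items_extended", 20000000)

Using WHERE ranges on a serial key instead of OFFSET, as suggested above,
avoids rescanning from the start of the table on every probe.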
On Fri, May 7, 2010 at 9:02 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Try doing a binary search with LIMIT. E.g., if you have 20M records, do
> a SELECT * FROM ... LIMIT 10M and throw away the results. If that
> breaks, the bad record is in the lower half, so narrow down there; if
> not, check the upper half (with OFFSET).
>
> If you have a serial primary key or something, you can use WHERE on it,
> which will likely be a lot faster than using LIMIT, but the same idea
> applies - do a binary search. Should take a lot less than days, and is
> reasonably easy to script.

That's my next step if I can't find a quicker/simpler method. My math
tells me that my current script is going to take 24 days to test every
record. Obviously, there are ways I can speed that up if I have no
choice, but I'm hoping for a simpler solution.

I'd prefer to run a COPY TABLE-like command and have it skip the bad
records.

Thanks,
Bryan
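There is no COPY option in 8.3 that silently skips unreadable rows, but a
client-side approximation is to dump in batches and fall back to one key at
a time when a batch hits the missing-chunk error, logging the keys that get
skipped. A sketch along those lines, again assuming psycopg2 and a
hypothetical integer key "id" (all names are placeholders):

# Sketch only: write readable rows as CSV, skipping rows that fail to detoast.
import csv
import sys
import psycopg2

def dump_skipping_bad(dsn, table, pk, max_id, out=sys.stdout, batch=10000):
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    writer = csv.writer(out)
    sql = "SELECT * FROM {0} WHERE {1} >= %s AND {1} < %s".format(table, pk)
    lo = 0
    while lo <= max_id:
        hi = lo + batch
        try:
            cur.execute(sql, (lo, hi))
            writer.writerows(cur.fetchall())
        except psycopg2.DatabaseError:
            conn.rollback()
            # The batch contains at least one bad row: retry one key at a time
            # so that only the genuinely unreadable rows are dropped.
            for key in range(lo, hi):
                try:
                    cur.execute(sql, (key, key + 1))
                    writer.writerows(cur.fetchall())
                except psycopg2.DatabaseError:
                    conn.rollback()
                    sys.stderr.write("skipping %s = %d\n" % (pk, key))
        lo = hi
    conn.close()

# dump_skipping_bad("dbname=mydb", "items_extended", "id", 25000000)

The stderr log doubles as the list of bad keys, so the skipped rows can be
restored later from a backup or from the application side.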