Thread: duplicated values on primary key field on reindex
Hi all! This was 8.1.3 and now is 8.1.4 running on Debian Sarge, locally compiled without any fancy options. Since Tuesday we experience strange diconnects from our database used as a source code respository for the developers. First indicators of trouble are log lines like :ERROR: invalid memory alloc request size 2003127913 Later on follow some disconnects due to process termination :LOG: server process (PID 8276) was terminated by signal 11 :LOG: terminating any other active server processes :WARNING: terminating connection because of crash of another server process :DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. :HINT: In a moment you should be able to reconnect to the database and repeat your command. Trying to dump the DB results in an error for one table named tw_blob (given by the source code management tool). The table is: Table "public.tw_blob" Column | Type | Modifiers ------------+---------+----------- primarykey | integer | not null blobtype | integer | blobdata | bytea | Indexes: "tw_blob_pkey" PRIMARY KEY, btree (primarykey) After some crawling I found a row to be damaged # select * from tw_blob order by primarykey limit 1 offset 1636022; primarykey | blobtype | blobdata ------------+----------+--------- 1637694 | 2 | dXBkYXRl (1 row) (last field is truncated for the sake of this email) trying to read this and the next row # select * from tw_blob order by primarykey limit 2 offset 1636022; ERROR: invalid memory alloc request size 2003127913 The next query is answered in two different flavors # select primarykey from tw_blob order by primarykey limit 2 offset 1636022; First answer: primarykey ------------ 1636694 1636695 (2 rows) Second answer: primarykey ------------ 1637694 216305 (2 rows) While the first answer seems much more valid (the primarkey is an artificially created number), the second answer seems to be the one being presented for all further invocations of the above query. I noted, that the second row does not fit the "order by" clause, so I tried a reindex of the db, but that led to a duplicate value error: # reindex index tw_blob_pkey; ERROR: could not create unique index DETAIL: Table contains duplicated values. Now that is something I don't understand at all. Since the backup for said server went postal too long ago unnoticed, I would prefer a "repair" solution. Any ideas? TIA Jan
On Thu, 2006-07-06 at 16:36, Weerts, Jan wrote: > Hi all! > > This was 8.1.3 and now is 8.1.4 running on Debian Sarge, locally > compiled without any fancy options. > > While the first answer seems much more valid (the primarkey is > an artificially created number), the second answer seems to > be the one being presented for all further invocations of the > above query. > > I noted, that the second row does not fit the "order by" clause, > so I tried a reindex of the db, but that led to a duplicate > value error: > # reindex index tw_blob_pkey; > ERROR: could not create unique index > DETAIL: Table contains duplicated values. > > Now that is something I don't understand at all. > > Since the backup for said server went postal too long ago > unnoticed, I would prefer a "repair" solution. Any ideas? Can you get set of fields in that row to uniquely identify it by? If so, see if you can update that column to something else and continue
Scott Marlowe wrote: > On Thu, 2006-07-06 at 16:36, Weerts, Jan wrote: >> Hi all! >> >> This was 8.1.3 and now is 8.1.4 running on Debian Sarge, locally >> compiled without any fancy options. > >> >> While the first answer seems much more valid (the primarkey is >> an artificially created number), the second answer seems to >> be the one being presented for all further invocations of the >> above query. >> >> I noted, that the second row does not fit the "order by" clause, >> so I tried a reindex of the db, but that led to a duplicate value >> error: # reindex index tw_blob_pkey; >> ERROR: could not create unique index >> DETAIL: Table contains duplicated values. >> >> Now that is something I don't understand at all. >> >> Since the backup for said server went postal too long ago >> unnoticed, I would prefer a "repair" solution. Any ideas? > > Can you get set of fields in that row to uniquely identify it by? > > If so, see if you can update that column to something else and > continue The only way would be to update by primarykey. But since the select on the primarykey field shows this "strange" ordering, I wonder, what effect an update would have. My guess would be, that the correct pk value should be 1636695, but seeing only 216305 on subsequent calls makes me think. I even have executed # select * from tw_blob where primarykey = 216305; and receive a single row, which I don't really trust to be the same one producing the error. Jan
On Thu, 2006-07-06 at 17:30, Weerts, Jan wrote: > Scott Marlowe wrote: > > On Thu, 2006-07-06 at 16:36, Weerts, Jan wrote: > >> Hi all! > >> > >> This was 8.1.3 and now is 8.1.4 running on Debian Sarge, locally > >> compiled without any fancy options. > > > >> > >> While the first answer seems much more valid (the primarkey is > >> an artificially created number), the second answer seems to > >> be the one being presented for all further invocations of the > >> above query. > >> > >> I noted, that the second row does not fit the "order by" clause, > >> so I tried a reindex of the db, but that led to a duplicate value > >> error: # reindex index tw_blob_pkey; > >> ERROR: could not create unique index > >> DETAIL: Table contains duplicated values. > >> > >> Now that is something I don't understand at all. > >> > >> Since the backup for said server went postal too long ago > >> unnoticed, I would prefer a "repair" solution. Any ideas? > > > > Can you get set of fields in that row to uniquely identify it by? > > > > If so, see if you can update that column to something else and > > continue > > The only way would be to update by primarykey. But since the > select on the primarykey field shows this "strange" ordering, > I wonder, what effect an update would have. My guess would be, > that the correct pk value should be 1636695, but seeing only > 216305 on subsequent calls makes me think. > > I even have executed > # select * from tw_blob where primarykey = 216305; > and receive a single row, which I don't really trust to be > the same one producing the error. If there are no other fields you can use to uniquely identify that row, then add a new row to the table and update it from a sequence... create sequence deargodsaveme; alter table brokentable add column emergencyint int; update brokentable set emergencyint=nextval('deargodsaveme'); then use that new column, emergencyint to select it for an update. Note that these kind of problem is generally a sign of faulty hardware, so you'll need to be looking at your machine's hardware (memory, CPU, etc..) and possible problems like faulty fsyncing etc... to make sure it doesn't happen again.
"Weerts, Jan" <j.weerts@i-views.de> writes: > Scott Marlowe wrote: >> Can you get set of fields in that row to uniquely identify it by? >> >> If so, see if you can update that column to something else and >> continue > The only way would be to update by primarykey. But since the > select on the primarykey field shows this "strange" ordering, > I wonder, what effect an update would have. CTID always works: SELECT ctid, otherstuff FROM table WHERE ... ; eyeball otherstuff to determine row you wish to hack UPDATE table SET ... WHERE ctid = '...'; Note: the act of UPDATE changes the row's ctid, don't be surprised. regards, tom lane