Re: Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE - Mailing list pgsql-hackers
From | Travis Cross |
---|---|
Subject | Re: Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE |
Date | |
Msg-id | 4485B422.1050604@crosswirecorp.com Whole thread Raw |
In response to | Re: Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE constraint (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE
|
List | pgsql-hackers |
Tom Lane wrote: > Travis Cross <travis@crosswirecorp.com> writes: >> I'm noticing that a handful (4-16) of rows with duplicate columns >> (uid,token) are sneaking into the table every day despite the >> primary key constraint. > > Corrupt index, looks like ... you might try reindexing the index. I probably should have mentioned that I have indeed done a REINDEX on the table a couple of times in the past, suspecting that issue, and having seen it resolve similar issues on this list. Upon your suggestion, I'm running one right now, and I will probably dump and reload the entire database after hours, unless anyone thinks that would be a bad idea (or unproductive in tracking this down). > I don't believe that the PANIC you show has anything directly to do > with duplicate entries. It is a symptom of corrupt index structure. > Now a corrupt index might also explain failure to notice duplications, > but changing your application isn't going to fix whatever is causing > it. You need to look for server-side causes. Indeed, you are correct. I should also mention that the problem seems to build over time, in the sense that everything will run fine for awhile (a few days), and then will crash repeatedly. Deleting the duplicate rows seems to reset the counter -- of course, I cannot run a successful REINDEX until I have deleted those duplicate rows. > Any database or system crashes on this server (before this problem > started)? No. In fact, this box, and a sister box running similar hardware, have been models of system stability. My uptimes are 46 and 87 days, respectively, representing the time since I've done a kernel upgrade and the time since I plugged the boxes into the rack. The sister box is running real-time voice services. > Do you *know* that the disk drive will not lie about write > complete? "Know" is such a strong word ;) Honestly, I have very little idea. I understand the nature of the problem this presents,as I've read the very fine PostgreSQL manual many times over the years. Because the drives I use are specifically designed to operate well in a RAID environment, I would 'hope' that the drives perform honest write operations. I wonder if there is a utility to perform a deterministic test of this... > What is the platform and storage system, anyway? The platform is: Linux 2.6.16.9 (w/o loadable modules) Supermicro PDSMi (a single processor P-D board) 2G ECC DDRII SDRAM The storage system is: On-board SATA ICH7R Controller 2 x WD3200SD hard drives running in a Linux RAID 1 configuration. That is to say: Western Digital 320G SATA 'enterprise' drives. The drives have a somewhat unique feature: time-limited error recovery, which is supposed to let the RAID controller/software deal with errors after a certain point (7 seconds), rather than continuing to block, and causing the drive to fall out of the array. The drive: http://www.westerndigital.com/en/products/products.asp?driveid=114&language=en I'll run file system consistency checks tonight to see if I can pick out a proximal cause for all this chaos. I really do appreciate the assistance. Cheers, -- Travis
pgsql-hackers by date: