Re: Corrupted Data ? - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Corrupted Data ? |
Date | |
Msg-id | 6499cfc7-2c89-4d3b-905d-18ceac71440d@aklaver.com |
In response to | Re: Corrupted Data ? (Ioana Danes <ioanadanes@gmail.com>) |
Responses | Re: Corrupted Data ? |
List | pgsql-general |
On 08/12/2016 08:30 AM, Ioana Danes wrote:
> On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
> > On 08/12/2016 08:10 AM, Ioana Danes wrote:
> >
> > > On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
> > > <folarte@peoplecall.com> wrote:
> > >
> > > > CCing to the list...
> > >
> > > Thanks
> > >
> > > > On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes
> > > > <ioanadanes@gmail.com> wrote:
> > > > > > given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
> > > > > > repeats sporadically, have you ruled out ( by having page checksums or
> > > > > > other mechanism ) a potential disk read/write error ?
> > > > > >
> > > > > > > Also the index is correct on db3 as the record in case (with drawid =
> > > > > > > 318216) is retrieved if I filter by drawid = 318220
> > > > > >
> > > > > > Specially if this happens, you may have some slightly bad disks/ram/
> > > > > > leading to this kind of problems.
> > > > >
> > > > > Could be. I also had some issues with an rsync between db3 and drdb a week
> > > > > ago that did not complete for bigger files (> 200MB) and gave me some
> > > > > corruption messages. Then the system was rebooted and everything seemed
> > > > > fine but apparently it is not.
> > > > > I am planning to drop & create the table from a good backup and if that does
> > > > > not fix the issue then I will rebuild the server.
> > > >
> > > > I would check whatever logs you can ( syslog or eventlog, smart log,
> > > > etc.. ) hunting for disk errors ( sometimes they are reported ). This
> > > > kind of problems, with programs as tested as postgres and rsync, tend
> > > > to indicate controller/RAM/disk going bad ( in your case it could be
> > > > caused by a single bit getting flipped in a sector for the data
> > > > portion of the table, and not being propagated either because it
> > > > happened after your sync of drdb or because it was synced from the WAL
> > > > and not the table, or because it was read from the disk cache ).
> > >
> > > I agree, unfortunately I did not find any clues about corruption or any
> > > anomalies in the logs.
> > > I will work tonight to rebuild that table and see where I go from there.
> >
> > The db3 database is on a different machine from all the other
> > databases you set up, correct?
>
> Yes, they are all different vms first 3 dbs are on the same cluster but
> drdb is a remote machine,

Aah, another player in the mix. What virtualization technology are you using?

> Thank you
>
> > > Thanks,
> > > ioana
> > >
> > > > Francisco Olarte.
> >
> > --
> > Adrian Klaver
> > adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com
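
For reference, the observation quoted above that 318216 and 318220 are "just a bit away" ( 4db08/4db0c ) can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `differing_bits` is made up for this note and does not come from the thread or from PostgreSQL.

```python
def differing_bits(a: int, b: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits where a and b differ."""
    diff = a ^ b  # XOR leaves a 1 only where the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# The two drawid values mentioned in the thread.
a, b = 318216, 318220

print(hex(a), hex(b))        # 0x4db08 0x4db0c
print(differing_bits(a, b))  # [2]
```

A single entry in the result means the two values differ in exactly one bit, which is consistent with the single bit flip suggested in the quoted text, although it does not by itself prove a hardware fault.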