Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 |
Date | |
Msg-id | 20131119142001.GA10498@alap2.anarazel.de Whole thread Raw |
In response to | Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Data corruption issues using streaming replication on
9.0.14/9.2.5/9.3.1
Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 |
List | pgsql-hackers |
Hi, On 2013-11-18 23:15:59 +0100, Andres Freund wrote: > Afaics it's likely a combination/interaction of bugs and fixes between: > * the initial HS code > * 5a031a5556ff83b8a9646892715d7fef415b83c3 > * f44eedc3f0f347a856eea8590730769125964597 Yes, the combination of those is guilty. Man, this is (to a good part my) bad. > But that'd mean nobody noticed it during 9.3's beta... It's fairly hard to reproduce artificially since a) there have to be enough transactions starting and committing from the start of the checkpoint the standby is starting from to the point it does LogStandbySnapshot() to cross a 32768 boundary b) hint bits often save the game by not accessing clog at all anymore and thus not noticing the corruption. I've reproduced the issue by having an INSERT ONLY table that's never read from. It's helpful to disable autovacuum. Imo something the attached patch should be done. The description I came up with is: Fix Hot-Standby initialization of clog and subtrans. These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read onlyqueries by marking committed transactions as uncommited. The likelihood of such corruptions is small unless the primaryhas a high transaction rate. 5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. Thisonly failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superflous since clog extensionsare actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the formercommit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZEDand not performing xid assignments before that - again missing the fact that ExtendCLOG is unneccessary- but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earliercommit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE(32768 with default settings) boundary was crossed between the start of the checkpoint recovery startedfrom and the first xl_running_xact record old transactions commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and byperforming subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Imo this warrants and expedited point release :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: