Re: Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records |
Date | |
Msg-id | AANLkTi=D+m36jh9oApnqc6P2U64ibP92s_cDHk_A2vY7@mail.gmail.com Whole thread Raw |
In response to | Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby
conflicts from never-visible records
|
List | pgsql-hackers |
On Sun, Dec 12, 2010 at 5:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Sat, 2010-12-11 at 22:03 +0100, Heikki Linnakangas wrote: >> (Moving to pgsql-hackers) >> >> On 10.12.2010 20:21, Tom Lane wrote: >> > Simon Riggs<simon@2ndQuadrant.com> writes: >> >> Reduce spurious Hot Standby conflicts from never-visible records. >> >> Hot Standby conflicts only with tuples that were visible at >> >> some point. So ignore tuples from aborted transactions or for >> >> tuples updated/deleted during the inserting transaction when >> >> generating the conflict transaction ids. >> > >> >> Following detailed analysis and test case by Noah Misch. >> >> Original report covered btree delete records, correctly observed >> >> by Heikki Linnakangas that this applies to other cases also. >> >> Fix covers all sources of cleanup records via common code. >> >> Includes additional fix compared to commit on HEAD >> > >> > ISTM HeapTupleHeaderAdvanceLatestRemovedXid is still pretty broken, >> > in that it's examining xmax without having checked that xmax is (a) >> > valid or (b) a lock rather than a deletion xmax. >> >> In current use, it's only called for tuples that are known to be dead, >> so either xmax is a valid deletion, or xmin didn't commit in which case >> the function doesn't use xmax for anything. So I think it actually works >> as it is. > > Well, I think you're both right. > > The function shouldn't be called in places where xmax is the wrong > flavour, but there should be specific safeguards in case of mistake. Should we do something about this? An Assert(), maybe? >> Also, I'm not totally convinced it's correct when xmin > xmax, despite >> Simon's follow-up commit to fix that. Shouldn't it advance >> latestRemovedXid to xmin in that case? Or maybe it's ok as it is because >> we know that xmax committed after xmin. The impression I get from the >> comment above the function now is that it advances latestRemovedXid to >> the highest XID present in the tuple, but that's not what it does in the >> xmin > xmax case. That comment needs clarification. > > Hmmm, my earlier code took xmax only if xmax > xmin. That was wrong; > what I have now is better, but your point is there may be an even better > truth. I'll think on that a little more. I guess the problem case here is something like: 1. T1 begins. T1 writes a tuple A (so it gets an XID). 2. T2 begins. T2 writes a tuple B (so it gets a later XID). 3. T1 takes a new snapshot that can see B and deletes B. 4. T2 commits. 5. T1 commits. At this point we have a tuple B with XMAX (T1's XID) < XMIN (T2's XID). Now, on the standby, there can be a transaction TS which takes a snapshot that can see T2's XID but not T1's XID. While that transaction is still running, VACUUM (or a HOT prune) comes along and zaps B, and this record is replayed on the standby, advancing latestRemovedXID to T1's XID, when in fact we also removed T2's later XID. This means we MUST kill TS (at least if it tries to read that block) because otherwise it'll fail to see B and return the wrong answer. Will we actually kill TS? GetConflictingVirtualXIDs() looks for transactions where proc->xmin precedes limitXmin, and if TS has a snapshot that can't see T1's XID then its proc->xmin might be exactly equal to limitXmin = latestRemovedXID = T1s XID, causing it to not get killed. I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: