Re: Multixact truncation for FK locks patch - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | Re: Multixact truncation for FK locks patch |
Date | |
Msg-id | 1317840445-sup-7142@alvh.no-ip.org |
In response to | Multixact truncation for FK locks patch (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
List | pgsql-hackers |
Excerpts from Alvaro Herrera's message of Mon Sep 26 13:16:24 -0300 2011:

> So I had to look for something else -- and I think I have it: have
> multixact itself track its truncation position relative to Xid. Each
> pg_multixact/offset segment would store ReadNewTransactionId at the
> time it is created. Whenever vacuum attempts to run pg_clog
> truncation, it would also pass that Xid to multixact truncation; this
> would scan existing segments and delete those that come before the one
> marked with the maximum Xid previous to the pg_clog truncation point.

So here's how it would really work in detail.

The first MultiXactId in each segment, i.e. one pg_multixact/offset
entry every 32 pages, stores RecentGlobalXmin at the time the segment is
created, that is, when we get to ZeroOffsetPage the first page of the
segment. Such positions are skipped when handing out MultiXactIds.
(We abuse the system a bit by storing a TransactionId in a place that
normally holds an offset. It's not like we don't do exactly this in
MultiXactId themselves.)

The semantics of this value are "segments prior to this one can be
removed as soon as we freeze Xids behind this one". (However, note that
the value is actually capped to RecentGlobalXmin.)

The reason that this is correct is that any MultiXactId that we might be
interested in knowing must be after the freeze point: any tuple before
the freeze point must have already been visited by vacuum; so either the
mxact contained only locks (in which case the interesting lifetime is
RecentGlobalXmin, which we already covered above), or it contained at
least one update; and since we're freezing, that update must be either
committed (so the tuple is invisible to everyone and would have been
removed by vacuum), or aborted (in which case the tuple would have been
relabeled HEAP_XMAX_INVALID).

/* FIXME what if the update is gone but was locked by a later transaction? */

pg_control contains a new field, TransactionId mxactFrozenXid.
This value is bootstrapped to InvalidTransactionId.

On CHECKPOINT, we call TruncateMultiXact(oldestXid). (This is the most
recent value we know from VACUUM.) In TruncateMultiXact, if we see that
pg_control.mxactFrozenXid is older than oldestXid, we know we have
segments to remove. We scan the whole directory to remove them, and
then update mxactFrozenXid to the value from the oldest remaining
segment. (Both things can be done in a single scan.)

I considered the idea that Xids might advance faster than we consume
multixact segments, making the offset's freezeXid wrap around. After
thinking about this, my conclusion is that there isn't really a problem
here (but I'm very open to being mistaken).

One thing of note is that the first page of the first segment is zeroed
twice: first when it is created by bootstrap, and second when the first
multixactid is created. This is a bit annoying, so I'm going to have
bootstrap set mxact 2 as the first one to be created, not 1. This means
we no longer zero that page twice; it also means we don't set a
freezeXid value uselessly for that page, which causes all sorts of
issues. (After wraparound, mxact 1 would be assigned normally and
segment zero would behave just like any other segment.)

In a directory scan to remove segments, we need to open the first page
of each segment to fetch its freezeXid. Therefore it would be nice if
we could skip doing this if possible. I think the way to do this is to
have the callback keep track of "earliest segment that we have to keep"
and "oldest segment that we can remove". That way, any segment in
between needn't be opened. In reality, I doubt this is going to save
much, because removing segments is not all that frequent anyway (unless
you're eating tons of MultiXactIds), so maybe we shouldn't implement
this bit.

Note that this is all about truncating pg_multixact/offset only.
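
To make the checkpoint-time truncation of pg_multixact/offset described
above a bit more concrete, here is a rough, self-contained sketch in C.
It is not code from the patch. The OffsetSegment struct, the
truncate_offsets_sketch and xid_precedes names, and the in-memory array
standing in for the directory are all hypothetical. It also assumes the
segments are listed oldest first, that their freezeXid values do not
decrease, and that an InvalidTransactionId in mxactFrozenXid simply
means "scan anyway"; the proposal does not spell out any of these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical in-memory stand-in for one pg_multixact/offset segment. */
typedef struct OffsetSegment
{
    int           segno;      /* segment number, for illustration only */
    TransactionId freezeXid;  /* Xid stored in the segment's first slot */
    bool          removed;    /* set here instead of unlinking a file */
} OffsetSegment;

/*
 * Wraparound-aware "a is older than b" for normal Xids, using modulo-2^32
 * arithmetic. Special Xids are not handled in this sketch.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Mark as removed every segment that comes before the newest segment whose
 * freezeXid is not newer than oldestXid, then store the freezeXid of the
 * oldest remaining segment in *mxactFrozenXid. Returns the number of
 * segments marked. Assumes segs[] is ordered oldest first.
 */
size_t
truncate_offsets_sketch(OffsetSegment *segs, size_t nsegs,
                        TransactionId oldestXid,
                        TransactionId *mxactFrozenXid)
{
    size_t keep_from = 0;
    size_t i;

    assert(mxactFrozenXid != NULL);

    if (nsegs == 0)
        return 0;

    /* Nothing to do unless the recorded value is older than oldestXid. */
    if (*mxactFrozenXid != InvalidTransactionId &&
        !xid_precedes(*mxactFrozenXid, oldestXid))
        return 0;

    /* Find the newest segment with freezeXid <= oldestXid. */
    for (i = 0; i < nsegs; i++)
    {
        if (!xid_precedes(oldestXid, segs[i].freezeXid))
            keep_from = i;
    }

    /* Everything before that segment can go. */
    for (i = 0; i < keep_from; i++)
        segs[i].removed = true;

    *mxactFrozenXid = segs[keep_from].freezeXid;
    return keep_from;
}
```

The sketch uses two passes over the array for readability; as noted
above, a real directory scan could do the removal and the
mxactFrozenXid update in one pass.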
To truncate pg_multixact/members, we need to check the earliest kept
offsets segment, to know what's the earliest members segment we need to
keep. This is the same as what we do now, IIRC.

Thoughts? Does anybody see any serious flaw?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support