Re: pg_subtrans and WAL - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: pg_subtrans and WAL |
Date | |
Msg-id | 751.1093023399@sss.pgh.pa.us |
In response to | Re: pg_subtrans and WAL (Alvaro Herrera <alvherre@dcc.uchile.cl>) |
Responses | Re: pg_subtrans and WAL |
List | pgsql-hackers |
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Tue, Aug 10, 2004 at 12:24:06PM -0400, Tom Lane wrote:
>> It may be that we do not care because pg_subtrans doesn't have to be
>> valid after a crash, but I haven't seen any proof of that theory.

> The whole point of the subtrans info is to be available _while_ the
> transaction tree is running.  If there is a crash, then by definition no
> backend can be running when we return, so pg_subtrans info is useless at
> that point.  We only need pg_clog to be correct.

But we also have to be sure that we don't try to access the useless info
anyway.  For instance some pre-crash subxacts might remain marked
SUBCOMMITTED in clog indefinitely.  I think this could be worked around:
for example, TransactionIdDidCommit could assume that any SUBCOMMITTED
xact older than RecentGlobalXmin must represent a child of a crashed
parent.  It shouldn't be too hard to guarantee that we never touch
pg_subtrans for XIDs older than RecentGlobalXmin.  We don't have that
guarantee in place at the moment though.

>> And if that theory is correct, then it is a seriously bad design to be
>> using the same code infrastructure for both pg_clog and pg_subtrans.
>> Every fsync on pg_subtrans is wasted effort if that is going to be our
>> approach.

> Right, but AFAICS both pg_clog and pg_subtrans are only fsync'ed during
> checkpoint and shutdown, so it doesn't seem that costly.  We could
> certainly skip calling CheckPointSUBTRANS() or making it a noop ...

The point is that the behaviors are fundamentally different.  We have no
need for any WAL log entries for pg_subtrans; we should never fsync it;
and the rules for deciding when and where to truncate it are a lot
different (or at least should be different).  I thought from the
beginning that the slru layer underneath pg_clog was bad from the point
of view of obfuscating the code, because it forced an awkward division
of labor between clog.c and slru.c.
Now that I realize that there's not that much behavior that we really
want to share, I wonder whether we shouldn't revert that change and make
subtrans.c stand on its own.

> On a related note: if we mark a Xid with SUBTRANS COMMIT and later crash
> without updating it, the main Xid will remain in in-progress status.  At
> what point is it marked aborted?

I do not think there's any guarantee that it ever will be so marked.
Certainly it could be a very long time until someone exhibits any
interest in that particular Xid's status...

			regards, tom lane
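
The workaround suggested in the message for TransactionIdDidCommit can be
sketched roughly as follows.  This is only an illustration under stated
assumptions, not the actual PostgreSQL code: the names Xid, XidStatus,
xid_precedes, sketch_did_commit and the demo_* lookup functions are all
hypothetical stand-ins, and the status and parent lookups are passed in as
callbacks so the sketch does not depend on clog.c or subtrans.c.  The idea
shown is that a SUBCOMMITTED XID older than RecentGlobalXmin is treated as
the child of a crashed parent, so pg_subtrans is never consulted for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real PostgreSQL types (assumptions). */
typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)

typedef enum {
    XS_IN_PROGRESS,
    XS_COMMITTED,
    XS_ABORTED,
    XS_SUBCOMMITTED
} XidStatus;

typedef XidStatus (*StatusLookup)(Xid);  /* stands in for a pg_clog read */
typedef Xid (*ParentLookup)(Xid);        /* stands in for a pg_subtrans read */

/* Circular (modulo 2^32) comparison: true if a is logically older than b. */
static bool xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Sketch of the proposed rule: walk up the parent chain while the status
 * is SUBCOMMITTED, but give up (report "not committed") as soon as the XID
 * is older than recent_global_xmin, without touching the parent lookup.
 */
static bool sketch_did_commit(Xid xid, Xid recent_global_xmin,
                              StatusLookup clog_status,
                              ParentLookup subtrans_parent)
{
    assert(clog_status != NULL && subtrans_parent != NULL);

    for (;;) {
        XidStatus st = clog_status(xid);

        if (st == XS_COMMITTED)
            return true;
        if (st != XS_SUBCOMMITTED)
            return false;           /* in progress or aborted */

        /* Pre-crash leftover: assume the parent crashed, skip pg_subtrans. */
        if (xid_precedes(xid, recent_global_xmin))
            return false;

        Xid parent = subtrans_parent(xid);
        /* A parent must exist and be older than its child; else stop. */
        if (parent == INVALID_XID || !xid_precedes(parent, xid))
            return false;
        xid = parent;
    }
}

/* Made-up demo data: 50 is a pre-crash orphan, 101 and 201 are children. */
static XidStatus demo_clog_status(Xid xid)
{
    switch (xid) {
        case 100: return XS_COMMITTED;
        case 200: return XS_IN_PROGRESS;
        case 50:
        case 101:
        case 201: return XS_SUBCOMMITTED;
        default:  return XS_ABORTED;
    }
}

static Xid demo_subtrans_parent(Xid xid)
{
    switch (xid) {
        case 101: return 100;
        case 201: return 200;
        default:  return INVALID_XID;
    }
}
```

With the made-up data above and RecentGlobalXmin taken as 90, XID 101
resolves through its committed parent 100, while XID 50 is reported as not
committed without any parent lookup at all.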