Re: pg_subtrans and WAL - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: pg_subtrans and WAL
Date	August 20, 2004 14:38:17
Msg-id	751.1093023399@sss.pgh.pa.us
In response to	Re: pg_subtrans and WAL (Alvaro Herrera <alvherre@dcc.uchile.cl>)
Responses	Re: pg_subtrans and WAL
List	pgsql-hackers

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Tue, Aug 10, 2004 at 12:24:06PM -0400, Tom Lane wrote:
>> It may be that we do not care because pg_subtrans doesn't have to be
>> valid after a crash, but I haven't seen any proof of that theory.

> The whole point of the subtrans info is to be available _while_ the
> transaction tree is running.  If there is a crash, then by definition no
> backend can be running when we return, so pg_subtrans info is useless at
> that point.  We only need pg_clog to be correct.

But we also have to be sure that we don't try to access the useless info
anyway.  For instance some pre-crash subxacts might remain marked
SUBCOMMITTED in clog indefinitely.  I think this could be worked around:
for example, TransactionIdDidCommit could assume that any SUBCOMMITTED
xact older than RecentGlobalXmin must represent a child of a crashed
parent.  It shouldn't be too hard to guarantee that we never touch
pg_subtrans for XIDs older than RecentGlobalXmin.  We don't have that
guarantee in place at the moment though.
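
[Illustration, not part of the original message.] The workaround described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the actual PostgreSQL code: the arrays clog_status and subtrans_parent, the variable recent_global_xmin, and the functions xid_precedes and xid_did_commit are hypothetical stand-ins for pg_clog, pg_subtrans, RecentGlobalXmin, and TransactionIdDidCommit. It only shows the proposed rule: a SUBCOMMITTED XID older than the global xmin horizon is treated as the child of a crashed parent, and pg_subtrans is never consulted for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;

/* Hypothetical status values, loosely modeled on the clog states. */
typedef enum {
    XS_IN_PROGRESS = 0,
    XS_COMMITTED,
    XS_ABORTED,
    XS_SUB_COMMITTED
} XidStatus;

#define MAX_XID 64

/* Toy stand-ins (assumptions): clog_status plays the role of pg_clog,
 * subtrans_parent the role of pg_subtrans (0 means "no parent recorded"),
 * and recent_global_xmin the role of RecentGlobalXmin. */
static XidStatus clog_status[MAX_XID];
static Xid       subtrans_parent[MAX_XID];
static Xid       recent_global_xmin;

/* Wraparound-aware "a is older than b" using a signed 32-bit difference. */
static bool xid_precedes(Xid a, Xid b)
{
    return (int32_t)(a - b) < 0;
}

/* Sketch of the proposed rule for deciding whether an XID committed. */
static bool xid_did_commit(Xid xid)
{
    for (;;) {
        assert(xid < MAX_XID);

        XidStatus st = clog_status[xid];
        if (st == XS_COMMITTED)
            return true;
        if (st != XS_SUB_COMMITTED)
            return false;   /* aborted or still in progress */

        /* SUBCOMMITTED and older than the horizon: no live parent can
         * exist, so assume a crashed parent. The subtrans data is not
         * read at all on this path. */
        if (xid_precedes(xid, recent_global_xmin))
            return false;

        /* Recent enough that the subtrans data is trustworthy:
         * follow the parent link and re-check. */
        xid = subtrans_parent[xid];
        if (xid == 0)
            return false;
    }
}
```

Note that in this sketch the horizon check sits in front of every parent lookup, which is one way to get the "never touch pg_subtrans for XIDs older than RecentGlobalXmin" guarantee mentioned above.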

>> And if that theory is correct, then it is a seriously bad design to be
>> using the same code infrastructure for both pg_clog and pg_subtrans.
>> Every fsync on pg_subtrans is wasted effort if that is going to be our
>> approach.

> Right, but AFAICS both pg_clog and pg_subtrans are only fsync'ed during
> checkpoint and shutdown, so it doesn't seem that costly.  We could
> certainly skip calling CheckPointSUBTRANS() or making it a noop ...

The point is that the behaviors are fundamentally different.  We have no
need for any WAL log entries for pg_subtrans; we should never fsync it;
and the rules for deciding when and where to truncate it are a lot
different (or at least should be different).  I thought from the
beginning that the slru layer underneath pg_clog was bad from the point
of view of obfuscating the code, because it forced an awkward division
of labor between clog.c and slru.c.  Now that I realize that there's not
that much behavior that we really want to share, I wonder whether we
shouldn't revert that change and make subtrans.c stand on its own.

> On a related note: if we mark a Xid with SUBTRANS COMMIT and later crash
> without updating it, the main Xid will remain in in-progress status.  At
> what point is it marked aborted?

I do not think there's any guarantee that it ever will be so marked.
Certainly it could be a very long time until someone exhibits any
interest in that particular Xid's status...
        regards, tom lane
