Re: Two-phase commit issues - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: Two-phase commit issues
Date	May 18, 2005 20:06:48
Msg-id	20050518230635.GB10521@surnet.cl
In response to	Two-phase commit issues (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Two-phase commit issues
List	pgsql-hackers

On Wed, May 18, 2005 at 05:15:09PM -0400, Tom Lane wrote:
> I've started to look seriously at Heikki's patch for two-phase commit.

Hum.  I started a few days ago doing some reviewing, with the intention
of correcting some things here and there in order to present it all to
you later, with a pre-filter to get some bugs out.

> There are a few issues that probably deserve discussion:
> 
> * The major missing issue that I've come across so far is that
> subtransaction and multixact state isn't preserved across a crash.
[...]
> (AFAICS it's sufficient to make each subxact link directly to the top
> XID, even if there was a more complex hierarchy originally.)

Right, we don't care about the hierarchy; we know all those subXids were
committed.
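
To illustrate the flattening Tom describes, here is a minimal sketch in Python. It is not code from the patch: the `parent_of` mapping, the function name, and the XID values are hypothetical, chosen only to show that an arbitrary subtransaction tree collapses to "every subxact points at the top XID".

```python
def flatten_subxacts(parent_of, top_xid):
    """Return a mapping that links every subtransaction XID directly to top_xid.

    parent_of is a hypothetical dict: sub-XID -> parent XID, describing the
    original (possibly nested) hierarchy.
    """
    flat = {}
    for xid in parent_of:
        # Walk up the parent chain to confirm this subxact really belongs
        # to the given top-level transaction.
        cur = xid
        while cur != top_xid:
            cur = parent_of[cur]
        # The intermediate levels are discarded; only the top XID is kept.
        flat[xid] = top_xid
    return flat


# Hypothetical hierarchy: 100 is the top XID, 101 and 102 are its children,
# and 103 is a child of 102.
parent_of = {101: 100, 102: 100, 103: 102}
flat = flatten_subxacts(parent_of, 100)
```

After the call, `flat` maps 101, 102, and 103 straight to 100, which is all that is needed if every one of those subXids is known to have committed.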

> Similarly, we've got to reconstruct MultiXactIds that any prepared
> xacts are members of, else row-level locks taken out by prepared xacts
> won't be enforced correctly.  I think this can be handled if we add to
> the state files a list of all MultiXactIds that each prepared xact
> belongs to, and then during restart forcibly recreate those
> MultiXactIds.  (They would only be rebuilt with prepared XIDs, not any
> ordinary XIDs that might originally have been members.)  This seems to
> require some new code in multixact.c, but not anything fundamentally
> difficult --- Alvaro, do you see any likely problems in this stuff?

I'm not sure whether the following case causes any trouble: Xid=1
participates in a MultiXactId but is not yet prepared when Xid=2, which
participates in the same MultiXactId, prepares.  If Xid=1 is prepared
later, the MultiXactId needs to be restored with both Xids as participants.
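
The case above can be modelled with a toy sketch, assuming (as Tom proposes) that each prepared transaction's state file lists the MultiXactIds it belongs to. The `state_files` dict and `rebuild_multixacts` function are hypothetical stand-ins, not anything from multixact.c.

```python
def rebuild_multixacts(state_files):
    """Recreate MultiXactId membership from prepared-xact state files only.

    state_files is a hypothetical dict: prepared XID -> list of MultiXactIds
    recorded for that XID at PREPARE time. Ordinary (non-prepared) XIDs have
    no state file, so they never reappear as members.
    """
    members = {}
    for xid, multis in state_files.items():
        for multi in multis:
            members.setdefault(multi, set()).add(xid)
    return members


# Xid 2 prepares while Xid 1 is still an ordinary transaction.
state_files = {2: [500]}
after_first_prepare = rebuild_multixacts(state_files)

# Later, Xid 1 prepares as well and records the same MultiXactId.
state_files[1] = [500]
after_second_prepare = rebuild_multixacts(state_files)
```

In this model the rebuilt MultiXactId 500 holds only Xid 2 after the first prepare and both Xids after the second, because membership is derived from each transaction's own state file rather than from a snapshot taken when the first one prepared.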


> * The patch is designed to dump state files into WAL as well as onto
> disk.  Why?  Wouldn't it be better just to write and fsync the state
> file before reporting successful prepare?  That would get rid of the
> need for checkpoint-time fsyncs.

I made the same observation.
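
As a rough sketch of the "write and fsync the state file before reporting successful prepare" alternative, assuming a POSIX system and a hypothetical one-file-per-gid layout (neither the directory layout nor the function name comes from the patch):

```python
import os
import tempfile


def write_state_file(directory, gid, payload):
    """Durably write a prepared-transaction state file, then return its path.

    The caller would report a successful prepare only after this returns.
    """
    path = os.path.join(directory, gid)
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # force the file contents to stable storage
    # Also fsync the directory so the new directory entry survives a crash
    # (POSIX behaviour; opening a directory this way does not work on Windows).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path


# Usage with a throwaway directory and a made-up gid.
with tempfile.TemporaryDirectory() as d:
    path = write_state_file(d, "gid-example", b"prepared state")
    with open(path, "rb") as f:
        stored = f.read()
```

Since the file is already on disk by the time prepare is acknowledged, nothing is left to flush at checkpoint time, which is the saving Tom points out.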

> * I'm inclined to think that the "gid" identifiers for prepared
> transactions ought to be SQL identifiers (names), not string literals.
> Was there a particular reason for making them strings?

Ditto.

> * There are some fairly ugly cases associated with creation and deletion
> of temporary tables as well.  I think we might want to just decree that
> you can't PREPARE a transaction that included creating or dropping a
> temp table.  Does anyone have much of a problem with that?

Does this affect any of the other things that use the direct-fsync-no-WAL
path in the smgr?

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)
