Re: Two-phase commit issues - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | Re: Two-phase commit issues |
Date | |
Msg-id | 20050518230635.GB10521@surnet.cl |
In response to | Two-phase commit issues (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Two-phase commit issues |
List | pgsql-hackers |
On Wed, May 18, 2005 at 05:15:09PM -0400, Tom Lane wrote:
> I've started to look seriously at Heikki's patch for two-phase commit.

Hum. I started a few days ago doing some reviewing, with the intention
of correcting some things here and there in order to present it all to
you later, with a pre-filter to get some bugs out.

> There are a few issues that probably deserve discussion:
>
> * The major missing issue that I've come across so far is that
> subtransaction and multixact state isn't preserved across a crash.
[...]
> (AFAICS it's sufficient to make each subxact link directly to the top
> XID, even if there was a more complex hierarchy originally.)

Right, we don't care about the hierarchy; we know all those subXids were
committed.

> Similarly, we've got to reconstruct MultiXactIds that any prepared
> xacts are members of, else row-level locks taken out by prepared xacts
> won't be enforced correctly. I think this can be handled if we add to
> the state files a list of all MultiXactIds that each prepared xact
> belongs to, and then during restart forcibly recreate those
> MultiXactIds. (They would only be rebuilt with prepared XIDs, not any
> ordinary XIDs that might originally have been members.) This seems to
> require some new code in multixact.c, but not anything fundamentally
> difficult --- Alvaro, do you see any likely problems in this stuff?

I'm not sure if it affects in any way that a Xid=1, which participates
in a MultiXactId is seen as not prepared when Xid=2 prepares, which also
participates in the same MultiXactId; if Xid=1 is prepared later, the
MultiXactId needs to be restored with both Xids as participants.

> * The patch is designed to dump state files into WAL as well as onto
> disk. Why? Wouldn't it be better just to write and fsync the state
> file before reporting successful prepare? That would get rid of the
> need for checkpoint-time fsyncs.

I made the same observation.

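To make the MultiXactId scenario raised earlier in this message concrete, here is a minimal sketch in Python of one possible reading of it. It is a toy model, not PostgreSQL code: the function `rebuild_multixacts`, the `state_files` mapping, and the id values are all hypothetical, and nothing here reflects what multixact.c actually does. It assumes each prepared transaction's state file records the MultiXactIds that transaction belongs to, and that restart rebuilds each MultiXactId from the union of those per-transaction lists.

```python
from collections import defaultdict


def rebuild_multixacts(state_files):
    """Rebuild MultiXactId membership from prepared-xact state files.

    state_files maps a prepared XID to the MultiXactIds recorded in its
    (hypothetical) state file. Only prepared XIDs appear in the result;
    ordinary XIDs that were once members are not restored.
    """
    members = defaultdict(set)
    for xid, multis in state_files.items():
        for multi in multis:
            members[multi].add(xid)
    return dict(members)


# MultiXactId 100 originally has members XID 1 and XID 2.
# XID 2 prepares first, so only its state file exists.
state_files = {2: [100]}
after_first_prepare = rebuild_multixacts(state_files)

# XID 1 prepares later and records the same MultiXactId.
state_files[1] = [100]
after_second_prepare = rebuild_multixacts(state_files)

print(after_first_prepare)
print(after_second_prepare)
```

Under these assumptions the order in which the two transactions prepare does not matter at restart: a crash after both have prepared rebuilds MultiXactId 100 with both XIDs, and a crash after only XID 2 has prepared rebuilds it with XID 2 alone. Whether the same holds for the running system, before any restart, is the open question.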
> * I'm inclined to think that the "gid" identifiers for prepared
> transactions ought to be SQL identifiers (names), not string literals.
> Was there a particular reason for making them strings?

Ditto.

> * There are some fairly ugly cases associated with creation and deletion
> of temporary tables as well. I think we might want to just decree that
> you can't PREPARE a transaction that included creating or dropping a
> temp table. Does anyone have much of a problem with that?

Does this affect any of the other things that use the
direct-fsync-no-WAL path in the smgr?

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)