Re: mvcc catalo gsnapshots and TopTransactionContext - Mailing list pgsql-hackers
| From | Andres Freund |
|---|---|
| Subject | Re: mvcc catalo gsnapshots and TopTransactionContext |
| Date | |
| Msg-id | 20130810031707.GB7843@alap2.anarazel.de |
| In response to | Re: mvcc catalo gsnapshots and TopTransactionContext (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: mvcc catalo gsnapshots and TopTransactionContext |
| List | pgsql-hackers |
On 2013-08-09 14:11:46 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
> >> How can it be safe to try to read catalogs if the transaction is aborted?
>
> > Well. It isn't. At least not in general. The specific case triggered
> > here though is cache invalidations being processed, which can lead to
> > the catalog being read (pretty crummy, but not easy to get rid of).
> > That's actually safe since before we process the invalidations we
> > have done:
> > 1) CurrentTransactionState->state = TRANS_ABORT
> > 2) RecordTransactionAbort(), marking the transaction as aborted in the clog
> > 3) marked subxacts as aborted
> > 4) ProcArrayEndTransaction() (for toplevel ones)
>
> > Due to these any tqual stuff will treat the current (sub-)xact and its
> > children as aborted. So the catalog lookups will use the catalog in a
> > sensible state.
>
> I don't have any faith in this argument. You might be right that we'll
> correctly see our own output rows as aborted, but that's barely the tip
> of the iceberg of risk here. Is it safe to take new locks in an aborted
> transaction? (What if we're already past the lock-release point in
> the abort sequence?)

Don't get me wrong. I find the idea of doing catalog lookups during abort
horrid. But it's been that way for at least 10 years (I checked 7.4), so it
has at least some semblance of working. Today we do a good bit less than
back then; for one, we don't do a full cache reload during abort anymore,
just for the index support infrastructure. Also, you've reduced the amount
of lookups a bit with the relmapper introduction.

> For that matter, given that we don't know what
> exactly caused the transaction abort, how safe is it to do anything at
> all --- we might for instance be nearly out of memory. If the catalog
> reading attempt itself fails, won't we be in an infinite loop of
> transaction aborts?

Looks like that's possible, yes. There seem to be quite a few other
opportunities for this to happen if you look at the amount of work done in
AbortSubTransaction(). I guess it rarely happens because we release some
memory beforehand...

> I could probably think of ten more risks if I spent a few more minutes
> at it.

No need to convince me here. I could neither believe we were doing this nor
figure out why it even "works" for the first hour of looking at it.

> Cache invalidation during abort should *not* lead to any attempt to
> immediately revalidate the cache. No amount of excuses will make that
> okay. I have not looked to see just what the path of control is in this
> particular case, but we need to fix it, not paper over it.

I agree, although that's easier said than done in the case of
subtransactions. The problem we have there is that it's perfectly valid to
still have references to a relation from the outer, not aborted,
transaction. Those need to be valid for anybody looking at the relcache
entry after we've processed the ROLLBACK TO/...

I guess the fix is something like what we do in the commit case, where we
transfer invalidations to the parent transaction. If we then process local
invalidations *after* we've cleaned up the subtransaction completely, we
should be fine. We already do an implicit CommandCounterIncrement() in
CommitSubTransaction()... A rough sketch of that ordering follows below.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
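To make the proposed ordering concrete, here is a toy, self-contained C model of the idea. All names in it are hypothetical and purely illustrative, not actual backend code: on ROLLBACK TO, the subtransaction's pending invalidations are handed up to the still-live parent instead of being executed while the aborted subtransaction is being torn down, mirroring what the commit path already does, and they are only processed once cleanup has finished.

```c
/*
 * Toy model of the proposed ordering (hypothetical names, not backend
 * code): invalidations queued in an aborted subtransaction are
 * transferred to the parent, and only processed after the
 * subtransaction has been cleaned up.
 */
#include <stdio.h>

#define MAX_MSGS 16

typedef struct InvalQueue
{
    const char *msgs[MAX_MSGS];
    int         nmsgs;
} InvalQueue;

typedef struct Xact
{
    const char  *name;
    struct Xact *parent;    /* NULL for the top-level transaction */
    InvalQueue   pending;   /* invalidations queued in this (sub)xact */
} Xact;

static void
queue_inval(Xact *xact, const char *msg)
{
    if (xact->pending.nmsgs < MAX_MSGS)
        xact->pending.msgs[xact->pending.nmsgs++] = msg;
}

/* Hand all pending invalidations up to the parent, as on subxact commit. */
static void
transfer_invals_to_parent(Xact *child)
{
    for (int i = 0; i < child->pending.nmsgs; i++)
        queue_inval(child->parent, child->pending.msgs[i]);
    child->pending.nmsgs = 0;
}

/* Execute invalidations; in the real system this may re-read catalogs. */
static void
process_local_invals(Xact *xact)
{
    for (int i = 0; i < xact->pending.nmsgs; i++)
        printf("processing invalidation in %s: %s\n",
               xact->name, xact->pending.msgs[i]);
    xact->pending.nmsgs = 0;
}

int
main(void)
{
    Xact top = {.name = "top", .parent = NULL};
    Xact sub = {.name = "sub", .parent = &top};

    /* Work inside the subtransaction queues some invalidations. */
    queue_inval(&sub, "relcache: some relation");
    queue_inval(&sub, "catcache: some catalog");

    /*
     * ROLLBACK TO: instead of executing the invalidations while the
     * aborted subtransaction is still being torn down, transfer them
     * upward ...
     */
    transfer_invals_to_parent(&sub);

    /*
     * ... and only process them after cleanup is complete, from the
     * still-live parent, where catalog access is legitimate again.
     */
    process_local_invals(&top);

    return 0;
}
```

The point of the ordering is simply that by the time an invalidation can trigger a catalog read, the aborted subtransaction's state has already been torn down and we are back in a transaction state where catalog access is allowed.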