Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint) - Mailing list pgsql-hackers
From | Craig Ringer |
---|---|
Subject | Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint) |
Date | |
Msg-id | CAMsr+YE3wcUEPNs9tty0vVV7QZuB2RLF5J3N8VYLKWN-MCntfQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint) (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
|
List | pgsql-hackers |
On 23 March 2017 at 01:41, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 22, 2017 at 3:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Changes made per discussion. >> >> This looks good to me also. Does what we need it to do. >> >> I was a little worried by possible performance of new lock, but its >> not intended to be run frequently, so its OK. > > Agreed. > > Reviewing 0002: > > + if (!TransactionIdIsValid(xid)) > + { > + LWLockRelease(XidGenLock); > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("transaction ID " UINT64_FORMAT " is an invalid xid", > + xid_with_epoch))); > + } > > It's unnecessary to release LWLockRelease() before throwing an error, > and it's also wrong because we haven't acquired XidGenLock in this > code path. But maybe it would be better to just remove this entirely > and instead have TransactionIdInRecentPast return false for > InvalidTransactionId. Then we'd avoid adding a translatable message > for a case that is basically harmless to allow. Agreed, that's better. > + if (TransactionIdIsCurrentTransactionId(xid)) > + status = gettext_noop("in progress"); > + else if (TransactionIdDidCommit(xid)) > + status = gettext_noop("committed"); > + else if (TransactionIdDidAbort(xid)) > + status = gettext_noop("aborted"); > + else > + > + /* > + * can't test TransactionIdIsInProgress here or we race with > + * concurrent commit/abort. There's no point anyway, since it > + * might then commit/abort just after we check. > + */ > + status = gettext_noop("in progress"); > > I am not sure this is going to do the right thing for transactions > that are aborted by a crash without managing to write an abort record. You're right. It'll report in-progress, since the clog entry will still be 0. We don't appear to explicitly update the clog during recovery. Funny, I got so focused on the clog access safety stuff in the end that I missed the bigger picture. Given that part of the point of this is xact status after crash, that's kind of important. I've added a TAP test for this, as part of a new test set, "011_crash_recovery.pl". The recovery tests don't really have much on crash behaviour at the moment so might as well start it and add more as we go. It doesn't make sense to add a whole new top level TAP test just for txid_status. (I *love* the TAP framework now, it took 10 mins to write a test that's trivial to repeat in 5 seconds per run for this). > It seems that the first check will say the transaction isn't in > progress, and the second and third checks will say it isn't either > committed or aborted since, if I am reading this correctly, they just > read clog directly. Right. They're not concerned with subxacts or multixacts. > Compare HeapTupleSatisfiesMVCC, which assumes > that a not-in-progress, not-committed transaction must necessarily > have aborted. Right. We don't have a HeapTuple here, of course, so we can't test flags. Most importantly this means we can't handle multixacts. If we ever did decide to expose multixacts as bigint epoch-qualified xids for general users, we'd probably reserve the high bit of the epoch the bigint xid representation for the equivalent of HEAP_XMAX_IS_MULTI, and maybe it's worth reserving that bit now and ERROR'ing if we find it set. But right now we never produce a bigint xid with a multixact. I wonder if a note in the docs warning not to cast xid to bigint is worth it. Probably not. You have to cast xid::text::bigint which is kind of a hint, plus it doesn't make sense to test txid_status() for tuples' xmin/xmax . I guess you might use it with xids from pg_stat_activity, but there's not much point when you can just look at pg_stat_activity again to see if the proc of interest is still there and has the same xid. Anyway... > I think your comment is pointing to a real issue, > though. It seems like what might be needed is to add one more check. > Before where you have the "else" clause, check if the XID is old, e.g. > precedes our snapshot's xmin. If so, it must be committed or aborted > and, since it didn't commit, it aborted. If not, it must've changed > from in progress to not-in-progress just as we were in the midst > checking, so labeling it as in progress is fine. That seems clear and sensible. XidInMVCCSnapshot simply tests TransactionIdPrecedes(xid, snapshot->xmin), as does HeapTupleSatisfiesHistoricMVCC . It seems reasonable enough to just examine the snapshot xmin directly in txid_status too; it's already done in replication/logical/snapbuild.c and lmgr/predicate.c. We're running in an SQL-called function so we'll have a good snapshot and GetSnapshotData will have run, so GetActiveSnapshot()->xmin should be sufficient. Amended patch attached, with added TAP test included. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: