Thread: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Hi all

Today I ran into an issue where commit timestamp lookups were failing with

    ERROR: cannot retrieve commit timestamp for transaction 2

which is of course FrozenTransactionId.

TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(), which I think is wrong. Attached is a patch to make it return 0 for FrozenTransactionId and BootstrapTransactionId, like it does for xids that are too old.

Note that the prior behaviour was as designed and has tests to enforce it. I just think it's wrong, and it's also not documented.

IMO this should be back-patched to 9.6 and, without the TAP test part, to 9.5.

--
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment
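The attachment isn't reproduced inline. Going only by the description above (not the attached patch itself), the proposed change in TransactionIdGetCommitTsData() would look roughly like the following sketch: keep ERRORing only on invalid xids, and route the permanent xids through the same "no timestamp recorded" path already used for xids older than the commit_ts cutoff.

    /*
     * Sketch only, based on the description above, not the attached patch:
     * keep ERRORing on InvalidTransactionId, but treat the permanent xids
     * (Bootstrap and Frozen) as "committed long ago, no timestamp recorded",
     * the same way xids older than the commit_ts cutoff are treated.
     */
    if (!TransactionIdIsValid(xid))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("cannot retrieve commit timestamp for transaction %u",
                        xid)));
    else if (!TransactionIdIsNormal(xid))
    {
        /* frozen and bootstrap xids are always considered committed */
        *ts = 0;
        if (nodeid)
            *nodeid = InvalidRepOriginId;
        return false;
    }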
On 23 November 2016 at 20:58, Craig Ringer <craig@2ndquadrant.com> wrote:
> Hi all
>
> Today I ran into an issue where commit timestamp lookups were failing with
>
>    ERROR: cannot retrieve commit timestamp for transaction 2
>
> which is of course FrozenTransactionId.
>
> TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
> which I think is wrong. Attached is a patch to make it return 0 for
> FrozenTransactionId and BootstrapTransactionId, like it does for xids
> that are too old.
>
> Note that the prior behaviour was as designed and has tests to enforce
> it. I just think it's wrong, and it's also not documented.
>
> IMO this should be back-patched to 9.6 and, without the TAP test part, to 9.5.

Updated to correct the other expected file, since there's an alternate.

--
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hi,

On 2016-11-23 20:58:22 +0800, Craig Ringer wrote:
> Today I ran into an issue where commit timestamp lookups were failing with
>
>    ERROR: cannot retrieve commit timestamp for transaction 2
>
> which is of course FrozenTransactionId.
>
> TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
> which I think is wrong. Attached is a patch to make it return 0 for
> FrozenTransactionId and BootstrapTransactionId, like it does for xids
> that are too old.

Why? It seems quite correct to not allow lookups for special case values, as it seems sensible to give them special treatment at the call site.

> IMO this should be back-patched to 9.6 and, without the TAP test part,
> to 9.5.

Why would we want to backpatch a behaviour change, where arguments exist for both the current and the proposed behaviour?

Andres
On 24 November 2016 at 02:32, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2016-11-23 20:58:22 +0800, Craig Ringer wrote:
>> Today I ran into an issue where commit timestamp lookups were failing with
>>
>>    ERROR: cannot retrieve commit timestamp for transaction 2
>>
>> which is of course FrozenTransactionId.
>>
>> TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
>> which I think is wrong. Attached is a patch to make it return 0 for
>> FrozenTransactionId and BootstrapTransactionId, like it does for xids
>> that are too old.
>
> Why? It seems quite correct to not allow lookups for special case
> values, as it seems sensible to give them special treatment at the call
> site.

It's surprising behaviour that doesn't make sense. Look at it this way:

- We do some work, generating rows that have commit timestamps
- TransactionIdGetCommitTsData() on those rows returns their cts fine
- The commit timestamp data ages out
- TransactionIdGetCommitTsData() returns 0 on these rows
- vacuum comes along and freezes the rows, even though nothing's changed
- TransactionIdGetCommitTsData() suddenly ERRORs

Nothing has meaningfully changed on these rows. They have gone from "old, committed, past the commit timestamp threshold" to "old, committed, past the commit timestamp threshold, frozen". It makes no sense to ERROR when vacuum gets around to freezing the tuples, when we don't also ERROR when we pass the cts threshold.

ERRORing on BootstrapTransactionId is slightly more reasonable since those rows can never have had a cts in the first place, but it's also unnecessary since they're effectively "oldest always-committed xids".

Making it ERROR on FrozenTransactionId was a mistake and should be corrected.

>> IMO this should be back-patched to 9.6 and, without the TAP test part,
>> to 9.5.
>
> Why would we want to backpatch a behaviour change, where arguments exist
> for both the current and the proposed behaviour?

I don't think it's crucial since callers can just work around it, but IMO the current behaviour is a design oversight that should be corrected and can be safely and sensibly corrected. Nobody's going to rely on FrozenTransactionId ERRORing.

I don't think a backpatch is crucial though; as you note, C-level callers can work around the problem pretty simply, and that's just what I've done in pglogical for existing versions. I just think it's ugly, should be fixed, and is safe to fix.

It's slightly harder for SQL-level callers to work around, since they must hardcode a CASE that tests for xmin = XID '1' OR xmin = XID '2', and it's much less reasonable to expect SQL-level callers to deal with this sort of mess with low-level state.

--
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
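For C callers on servers without such a fix, the workaround alluded to above amounts to a small wrapper along these lines (a sketch only; the function name is made up for illustration and this is not the pglogical code):

    /*
     * Illustrative wrapper (hypothetical name, not pglogical's code):
     * treat frozen/bootstrap xids as "committed, but no commit timestamp
     * recorded", instead of letting TransactionIdGetCommitTsData() ERROR.
     */
    static bool
    get_commit_ts_workaround(TransactionId xid, TimestampTz *ts,
                             RepOriginId *nodeid)
    {
        if (!TransactionIdIsNormal(xid))
        {
            /* frozen/bootstrap (and invalid) xids: no timestamp available */
            *ts = 0;
            if (nodeid)
                *nodeid = InvalidRepOriginId;
            return false;
        }

        return TransactionIdGetCommitTsData(xid, ts, nodeid);
    }

SQL-level callers have no equivalent place to hang this, which is the CASE-on-xmin workaround mentioned above.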
I considered the argument here for a bit and I think Craig is right -- FrozenXid eventually makes it to a tuple's xmin, where it becomes a burden to the caller, making our interface bug-prone. Sure, you can special-case it, but you don't until it first happens ... and it may not until you're deep into production.

Even the code comment is confused: "error if the given Xid doesn't normally commit". But surely FrozenXid *does* commit, in the sense that it appears in committed tuples' Xmin.

We already have a good mechanism for replying to the query with "this value is too old for us to have its commit TS", which is a false return value. We should use that.

I think not backpatching is worse, because then users have to be aware that they need to handle the FrozenXid case specially, but only on 9.5/9.6 ...

I think the reason it took this long to pop up is that it has taken this long to get to replication systems on which this issue matters.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I considered the argument here for a bit and I think Craig is right --

FWIW, I agree. We shouldn't require every call site to special-case this, and we definitely don't want it to require special cases in SQL code.

(And I'm for back-patching, too.)

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I considered the argument here for a bit and I think Craig is right --
>
> FWIW, I agree. We shouldn't require every call site to special-case this,
> and we definitely don't want it to require special cases in SQL code.
>
> (And I'm for back-patching, too.)

Pushed.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Craig Ringer wrote:

> Updated to correct the other expected file, since there's an alternate.

FWIW I don't know what you did here, but you did not patch the alternate expected file.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 25 November 2016 at 02:44, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Craig Ringer wrote:
>
>> Updated to correct the other expected file, since there's an alternate.
>
> FWIW I don't know what you did here, but you did not patch the
> alternate expected file.

Damn. Attached the first patch a second time is what I did.

--
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services