Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers

From Tom Lane
Subject Re: recovering from "found xmin ... from before relfrozenxid ..."
Date
Msg-id 784080.1600546745@sss.pgh.pa.us
Whole thread Raw
In response to Re: recovering from "found xmin ... from before relfrozenxid ..."  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: recovering from "found xmin ... from before relfrozenxid ..."
List pgsql-hackers
I wrote:
> I was able to partially reproduce whelk's failure here.  I got a
> couple of cases of "cannot freeze committed xmax", which then leads
> to the second NOTICE diff; but I couldn't reproduce the first
> NOTICE diff.  That was out of about a thousand tries :-( so it's not
> looking like a promising thing to reproduce without modifying the test.

... however, it's trivial to reproduce via manual interference,
using the same strategy discussed recently for another case:
add a pg_sleep at the start of the heap_surgery.sql script,
run "make installcheck", and while that's running start another
session in which you begin a serializable transaction, execute
any old SELECT, and wait.  AFAICT this reproduces all of whelk's
symptoms with 100% reliability.

With a little more effort, this could be automated by putting
some long-running transaction (likely, it needn't be any more
complicated than "select pg_sleep(10)") in a second test script
launched in parallel with heap_surgery.sql.

So this confirms the suspicion that the cause of the buildfarm
failures is a concurrently-open transaction, presumably from
autovacuum.  I don't have time to poke further right now.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: factorial function/phase out postfix operators?
Next
From: Thomas Munro
Date:
Subject: Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)