This patch may be better, which only track catalog modified transactions.
--
Regards,
ChangAo Chen
------------------ Original ------------------
From: "cca5507" <cca5507@qq.com>;
Date: Sun, Jan 21, 2024 05:25 PM
To: "pgsql-hackers"<pgsql-hackers@lists.postgresql.org>;
Subject: Bug report and fix about building historic snapshot
Hello, I find a bug in building historic snapshot and the steps to reproduce are as follows:
Prepare:
(pub)create table t1 (id int primary key);
(pub)insert into t1 values (1);
(pub)create publication pub for table t1;
(sub)create table t1 (id int primary key);
Reproduce:
(pub)begin; insert into t1 values (2); (txn1 in session1)
(sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state soon)
(pub)begin; insert into t1 values (3); (txn2 in session2)
(pub)create table t2 (id int primary key); (session3)
(pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)
(pub)begin; insert into t2 values (1); (txn3 in session3)
(pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)
(pub)commit; (commit txn3, and replay txn3 will failed because its snapshot cannot see t2)
Reasons:
We currently don't track the transaction that begin after BUILDING_SNAPSHOT
and commit before FULL_SNAPSHOT when building historic snapshot in logical
decoding. This can cause a transaction which begin after FULL_SNAPSHOT to take
an incorrect historic snapshot because transactions committed in BUILDING_SNAPSHOT
state will not be processed by SnapBuildCommitTxn().
To fix it, we can track the transaction that begin after BUILDING_SNAPSHOT and
commit before FULL_SNAPSHOT forcely by using SnapBuildCommitTxn().