Re: Error "initial slot snapshot too large" in create replication slot - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Error "initial slot snapshot too large" in create replication slot |
Date | |
Msg-id | 20211012.135959.877910916353346602.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Error "initial slot snapshot too large" in create replication slot (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Error "initial slot snapshot too large" in create replication slot
|
List | pgsql-hackers |
At Mon, 11 Oct 2021 16:48:10 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in > On Mon, Oct 11, 2021 at 4:29 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > > At Mon, 11 Oct 2021 11:49:41 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in > > > While creating an "export snapshot" I don't see any protection why the > > > number of xids in the snapshot can not cross the > > > "GetMaxSnapshotXidCount()"?. > > > > > > Basically, while converting the HISTORIC snapshot to the MVCC snapshot > > > in "SnapBuildInitialSnapshot()", we add all the xids between > > > snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which > > > commit were not recorded). The problem is that we add both topxids as > > > well as the subxids into the same array and expect that the "xid" > > > count does not cross the "GetMaxSnapshotXidCount()". So it seems like > > > an issue but I am not sure what is the fix for this, some options are > > > > It seems to me that it is a compromise between the restriction of the > > legitimate snapshot and snapshots created by snapbuild. If the xids > > overflow, the resulting snapshot may lose a siginificant xid, i.e, a > > top-level xid. > > > > > a) Don't limit the xid count in the exported snapshot and dynamically > > > resize the array b) Increase the limit to GetMaxSnapshotXidCount() + > > > GetMaxSnapshotSubxidCount(). But in option b) there would still be a > > > problem that how do we handle the overflowed subtransaction? > > > > I'm afraid that we shouldn't expand the size limits. If I understand > > it correctly, we only need the top-level xids in the exported snapshot > > But since we are using this as an MVCC snapshot, if we don't have the > subxid and if we also don't mark the "suboverflowed" flag then IMHO > the sub-transaction visibility might be wrong, Am I missing something? Sorry I should have set suboverflowed in the generated snapshot. However, we can store subxid list as well when the snapshot (or running_xact) is not overflown. These (should) works the same way. On physical standby, snapshot is created just filling up only subxip with all top and sub xids (procrray.c:2400). It would be better we do the same thing here. > > and reorder buffer knows whether a xid is a top-level or not after > > establishing a consistent snapshot. > > > > The attached diff tries to make SnapBuildInitialSnapshot exclude > > subtransaction from generated snapshots. It seems working fine for > > you repro. (Of course, I'm not confident that it is the correct thing, > > though..) > > > > What do you think about this? > > If your statement that we only need top-xids in the exported snapshot, > is true then this fix looks fine to me. If not then we might need to > add the sub-xids in the snapshot->subxip array and if it crosses the > limit of GetMaxSnapshotSubxidCount(), then we can mark "suboverflowed" > flag. So I came up with the attached version. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 374a10aa6819224ca6af548100aa34e6c772a2c3 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Tue, 12 Oct 2021 13:53:27 +0900 Subject: [PATCH] Allow overflowed snapshot when creating logical replication slot Snapshot can hold top XIDs up to the number of server processes but SnapBuildInitialSnapshot tries to store all top-level and sub XIDs to there and fails for certain circumstances. Instead, create a "takenDuringRecovery" snapshot instead, which stores all XIDs in subxip array. Addition to that, allow to create an overflowed snapshot by adding to reorder buffer a function to tell whether an XID is a top-level or not. --- .../replication/logical/reorderbuffer.c | 20 ++++++++ src/backend/replication/logical/snapbuild.c | 50 ++++++++++++++----- src/include/replication/reorderbuffer.h | 1 + 3 files changed, 59 insertions(+), 12 deletions(-) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 46e66608cf..4e452cce7c 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -3326,6 +3326,26 @@ ReorderBufferXidHasBaseSnapshot(ReorderBuffer *rb, TransactionId xid) } +/* + * ReorderBufferXidIsKnownSubXact + * Returns true if the xid is a known subtransaction. + */ +bool +ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid) +{ + ReorderBufferTXN *txn; + + txn = ReorderBufferTXNByXid(rb, xid, false, + NULL, InvalidXLogRecPtr, false); + + /* a known subtxn? */ + if (txn && rbtxn_is_known_subxact(txn)) + return true; + + return false; +} + + /* * --------------------------------------- * Disk serialization support diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index a5333349a8..d422315717 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -531,8 +531,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder) { Snapshot snap; TransactionId xid; - TransactionId *newxip; - int newxcnt = 0; + TransactionId *newsubxip; + int newsubxcnt; + bool overflowed = false; Assert(!FirstSnapshotSet); Assert(XactIsoLevel == XACT_REPEATABLE_READ); @@ -568,9 +569,12 @@ SnapBuildInitialSnapshot(SnapBuild *builder) MyProc->xmin = snap->xmin; - /* allocate in transaction context */ - newxip = (TransactionId *) - palloc(sizeof(TransactionId) * GetMaxSnapshotXidCount()); + /* + * Allocate in transaction context. We use only subxip to store xids. See + * GetSnapshotData for details. + */ + newsubxip = (TransactionId *) + palloc(sizeof(TransactionId) * GetMaxSnapshotSubxidCount()); /* * snapbuild.c builds transactions in an "inverted" manner, which means it @@ -578,6 +582,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder) * classical snapshot by marking all non-committed transactions as * in-progress. This can be expensive. */ +retry: + newsubxcnt = 0; + for (xid = snap->xmin; NormalTransactionIdPrecedes(xid, snap->xmax);) { void *test; @@ -591,12 +598,29 @@ SnapBuildInitialSnapshot(SnapBuild *builder) if (test == NULL) { - if (newxcnt >= GetMaxSnapshotXidCount()) - ereport(ERROR, - (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), - errmsg("initial slot snapshot too large"))); + bool add = true; - newxip[newxcnt++] = xid; + /* exlude subxids when subxip is overflowed */ + if (overflowed && + ReorderBufferXidIsKnownSubXact(builder->reorder, xid)) + add = false; + + if (add) + { + if (newsubxcnt >= GetMaxSnapshotSubxidCount()) + { + if (overflowed) + ereport(ERROR, + (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), + errmsg("initial slot snapshot too large"))); + + /* retry excluding subxids */ + overflowed = true; + goto retry; + } + + newsubxip[newsubxcnt++] = xid; + } } TransactionIdAdvance(xid); @@ -604,8 +628,10 @@ SnapBuildInitialSnapshot(SnapBuild *builder) /* adjust remaining snapshot fields as needed */ snap->snapshot_type = SNAPSHOT_MVCC; - snap->xcnt = newxcnt; - snap->xip = newxip; + snap->xcnt = 0; + snap->subxcnt = newsubxcnt; + snap->subxip = newsubxip; + snap->suboverflowed = overflowed; return snap; } diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h index 5b40ff75f7..e5fa1051d7 100644 --- a/src/include/replication/reorderbuffer.h +++ b/src/include/replication/reorderbuffer.h @@ -669,6 +669,7 @@ void ReorderBufferXidSetCatalogChanges(ReorderBuffer *, TransactionId xid, XLog bool ReorderBufferXidHasCatalogChanges(ReorderBuffer *, TransactionId xid); bool ReorderBufferXidHasBaseSnapshot(ReorderBuffer *, TransactionId xid); +bool ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid); bool ReorderBufferRememberPrepareInfo(ReorderBuffer *rb, TransactionId xid, XLogRecPtr prepare_lsn, XLogRecPtr end_lsn, TimestampTz prepare_time, -- 2.27.0
pgsql-hackers by date: