Re: Error "initial slot snapshot too large" in create replication slot - Mailing list pgsql-hackers
| From | Kyotaro Horiguchi |
|---|---|
| Subject | Re: Error "initial slot snapshot too large" in create replication slot |
| Date | |
| Msg-id | 20211012.135959.877910916353346602.horikyota.ntt@gmail.com Whole thread Raw |
| In response to | Re: Error "initial slot snapshot too large" in create replication slot (Dilip Kumar <dilipbalaut@gmail.com>) |
| Responses |
Re: Error "initial slot snapshot too large" in create replication slot
|
| List | pgsql-hackers |
At Mon, 11 Oct 2021 16:48:10 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in
> On Mon, Oct 11, 2021 at 4:29 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Mon, 11 Oct 2021 11:49:41 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in
> > > While creating an "export snapshot" I don't see any protection why the
> > > number of xids in the snapshot can not cross the
> > > "GetMaxSnapshotXidCount()"?.
> > >
> > > Basically, while converting the HISTORIC snapshot to the MVCC snapshot
> > > in "SnapBuildInitialSnapshot()", we add all the xids between
> > > snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which
> > > commit were not recorded). The problem is that we add both topxids as
> > > well as the subxids into the same array and expect that the "xid"
> > > count does not cross the "GetMaxSnapshotXidCount()". So it seems like
> > > an issue but I am not sure what is the fix for this, some options are
> >
> > It seems to me that it is a compromise between the restriction of the
> > legitimate snapshot and snapshots created by snapbuild. If the xids
> > overflow, the resulting snapshot may lose a siginificant xid, i.e, a
> > top-level xid.
> >
> > > a) Don't limit the xid count in the exported snapshot and dynamically
> > > resize the array b) Increase the limit to GetMaxSnapshotXidCount() +
> > > GetMaxSnapshotSubxidCount(). But in option b) there would still be a
> > > problem that how do we handle the overflowed subtransaction?
> >
> > I'm afraid that we shouldn't expand the size limits. If I understand
> > it correctly, we only need the top-level xids in the exported snapshot
>
> But since we are using this as an MVCC snapshot, if we don't have the
> subxid and if we also don't mark the "suboverflowed" flag then IMHO
> the sub-transaction visibility might be wrong, Am I missing something?
Sorry I should have set suboverflowed in the generated snapshot.
However, we can store subxid list as well when the snapshot (or
running_xact) is not overflown. These (should) works the same way.
On physical standby, snapshot is created just filling up only subxip
with all top and sub xids (procrray.c:2400). It would be better we do
the same thing here.
> > and reorder buffer knows whether a xid is a top-level or not after
> > establishing a consistent snapshot.
> >
> > The attached diff tries to make SnapBuildInitialSnapshot exclude
> > subtransaction from generated snapshots. It seems working fine for
> > you repro. (Of course, I'm not confident that it is the correct thing,
> > though..)
> >
> > What do you think about this?
>
> If your statement that we only need top-xids in the exported snapshot,
> is true then this fix looks fine to me. If not then we might need to
> add the sub-xids in the snapshot->subxip array and if it crosses the
> limit of GetMaxSnapshotSubxidCount(), then we can mark "suboverflowed"
> flag.
So I came up with the attached version.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From 374a10aa6819224ca6af548100aa34e6c772a2c3 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Tue, 12 Oct 2021 13:53:27 +0900
Subject: [PATCH] Allow overflowed snapshot when creating logical replication
slot
Snapshot can hold top XIDs up to the number of server processes but
SnapBuildInitialSnapshot tries to store all top-level and sub XIDs to
there and fails for certain circumstances. Instead, create a
"takenDuringRecovery" snapshot instead, which stores all XIDs in
subxip array. Addition to that, allow to create an overflowed
snapshot by adding to reorder buffer a function to tell whether an XID
is a top-level or not.
---
.../replication/logical/reorderbuffer.c | 20 ++++++++
src/backend/replication/logical/snapbuild.c | 50 ++++++++++++++-----
src/include/replication/reorderbuffer.h | 1 +
3 files changed, 59 insertions(+), 12 deletions(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 46e66608cf..4e452cce7c 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3326,6 +3326,26 @@ ReorderBufferXidHasBaseSnapshot(ReorderBuffer *rb, TransactionId xid)
}
+/*
+ * ReorderBufferXidIsKnownSubXact
+ * Returns true if the xid is a known subtransaction.
+ */
+bool
+ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid)
+{
+ ReorderBufferTXN *txn;
+
+ txn = ReorderBufferTXNByXid(rb, xid, false,
+ NULL, InvalidXLogRecPtr, false);
+
+ /* a known subtxn? */
+ if (txn && rbtxn_is_known_subxact(txn))
+ return true;
+
+ return false;
+}
+
+
/*
* ---------------------------------------
* Disk serialization support
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index a5333349a8..d422315717 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -531,8 +531,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
{
Snapshot snap;
TransactionId xid;
- TransactionId *newxip;
- int newxcnt = 0;
+ TransactionId *newsubxip;
+ int newsubxcnt;
+ bool overflowed = false;
Assert(!FirstSnapshotSet);
Assert(XactIsoLevel == XACT_REPEATABLE_READ);
@@ -568,9 +569,12 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
MyProc->xmin = snap->xmin;
- /* allocate in transaction context */
- newxip = (TransactionId *)
- palloc(sizeof(TransactionId) * GetMaxSnapshotXidCount());
+ /*
+ * Allocate in transaction context. We use only subxip to store xids. See
+ * GetSnapshotData for details.
+ */
+ newsubxip = (TransactionId *)
+ palloc(sizeof(TransactionId) * GetMaxSnapshotSubxidCount());
/*
* snapbuild.c builds transactions in an "inverted" manner, which means it
@@ -578,6 +582,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
* classical snapshot by marking all non-committed transactions as
* in-progress. This can be expensive.
*/
+retry:
+ newsubxcnt = 0;
+
for (xid = snap->xmin; NormalTransactionIdPrecedes(xid, snap->xmax);)
{
void *test;
@@ -591,12 +598,29 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
if (test == NULL)
{
- if (newxcnt >= GetMaxSnapshotXidCount())
- ereport(ERROR,
- (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
- errmsg("initial slot snapshot too large")));
+ bool add = true;
- newxip[newxcnt++] = xid;
+ /* exlude subxids when subxip is overflowed */
+ if (overflowed &&
+ ReorderBufferXidIsKnownSubXact(builder->reorder, xid))
+ add = false;
+
+ if (add)
+ {
+ if (newsubxcnt >= GetMaxSnapshotSubxidCount())
+ {
+ if (overflowed)
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("initial slot snapshot too large")));
+
+ /* retry excluding subxids */
+ overflowed = true;
+ goto retry;
+ }
+
+ newsubxip[newsubxcnt++] = xid;
+ }
}
TransactionIdAdvance(xid);
@@ -604,8 +628,10 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
/* adjust remaining snapshot fields as needed */
snap->snapshot_type = SNAPSHOT_MVCC;
- snap->xcnt = newxcnt;
- snap->xip = newxip;
+ snap->xcnt = 0;
+ snap->subxcnt = newsubxcnt;
+ snap->subxip = newsubxip;
+ snap->suboverflowed = overflowed;
return snap;
}
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff75f7..e5fa1051d7 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -669,6 +669,7 @@ void ReorderBufferXidSetCatalogChanges(ReorderBuffer *, TransactionId xid, XLog
bool ReorderBufferXidHasCatalogChanges(ReorderBuffer *, TransactionId xid);
bool ReorderBufferXidHasBaseSnapshot(ReorderBuffer *, TransactionId xid);
+bool ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid);
bool ReorderBufferRememberPrepareInfo(ReorderBuffer *rb, TransactionId xid,
XLogRecPtr prepare_lsn, XLogRecPtr end_lsn,
TimestampTz prepare_time,
--
2.27.0
pgsql-hackers by date: