Home > mailing lists

Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: long-standing data loss bug in initial sync of logical replication
Date	February 24 12:49:49
Msg-id	CAA4eK1KDPryBr-ewsW_Rkh5x0oiqpcOHAF3GAocDUrK2CV+atA@mail.gmail.com Whole thread Raw
In response to	Re: long-standing data loss bug in initial sync of logical replication (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses	RE: long-standing data loss bug in initial sync of logical replication
List	pgsql-hackers

Tree view

On Wed, Dec 11, 2024 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I confirmed that the proposed patch fixes these issues. I have one
> question about the patch:
>
> In the main loop in SnapBuildDistributeSnapshotAndInval(), we have the
> following code:
>
>        /*
>         * If we don't have a base snapshot yet, there are no changes in this
>         * transaction which in turn implies we don't yet need a snapshot at
>         * all. We'll add a snapshot when the first change gets queued.
>         *
>         * NB: This works correctly even for subtransactions because
>         * ReorderBufferAssignChild() takes care to transfer the base snapshot
>         * to the top-level transaction, and while iterating the changequeue
>         * we'll get the change from the subtxn.
>         */
>        if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, txn->xid))
>            continue;
>
> Is there any case where we need to distribute inval messages to
> transactions that don't have the base snapshot yet but eventually need
> the inval messages?
>

Good point. It is mentioned that for snapshots: "We'll add a snapshot
when the first change gets queued.". I think we achieve this via
builder->committed.xip array such that when we set a base snapshot for
a transaction, we use that array to form a snapshot. However, I don't
see any such consideration for invalidations. Now, we could either
always add invalidations to xacts that don't have base_snapshot yet or
have a mechanism similar committed.xid array. But it is better to
first reproduce the problem.

> Overall, with this idea, we distribute invalidation messages to all
> concurrent decoded transactions. It could introduce performance
> regressions by several causes. For example, we could end up
> invalidating RelationSyncCache entries in more cases. While this is
> addressed by your selectively cache invalidation patch, there is still
> 5% regression. We might need to accept a certain amount of regressions
> for making it correct but it would be better to figure out where these
> regressions come from. Other than that, I think the performance
> regression could happen due to the costs of distributing invalidation
> messages. You've already observed there is 1~3% performance regression
> in cases where we distribute a large amount of invalidation messages
> to one concurrently decoded transaction[1]. I guess that the
> selectively cache invalidation idea would not help this case. Also, I
> think we might want to test other cases like where we distribute a
> small amount of invalidation messages to many concurrently decoded
> transactions.
>

+1.

--
With Regards,
Amit Kapila.

pgsql-hackers by date:

From: Andres Freund
Date: 24 February, 12:41:36
Subject: Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing

From: "Jelte Fennema-Nio"
Date: 24 February, 12:56:23
Subject: Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup

Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

Previous

Next