RE: Fix slot synchronization with two_phase decoding enabled - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Fix slot synchronization with two_phase decoding enabled |
Date | |
Msg-id | OS0PR01MB571616DB432287BA89A79A1694812@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Fix slot synchronization with two_phase decoding enabled (shveta malik <shveta.malik@gmail.com>) |
Responses |
Re: Fix slot synchronization with two_phase decoding enabled
RE: Fix slot synchronization with two_phase decoding enabled |
List | pgsql-hackers |
On Mon, Apr 28, 2025 at 5:33 PM shveta malik wrote: > > On Mon, Apr 28, 2025 at 2:27 PM Zhijie Hou (Fujitsu) > <houzj.fnst@fujitsu.com> wrote: > > > > On Fri, Apr 18, 2025 at 12:29 PM Amit Kapila wrote: > > > > > > On Thu, Apr 17, 2025 at 6:14 PM Zhijie Hou (Fujitsu) > > > <houzj.fnst@fujitsu.com> > > > > > > > > ----- > > > > Fix > > > > ----- > > > > > > > > I think we should keep the confirmed_flush even if the previous > > > > synced restart_lsn/catalog_xmin is newer. Attachments include a > > > > patch for the > > > same. > > > > > > > > > > This will fix the case we are facing but adds a new rule for slot > synchronization. > > > Can we think of a simpler way to fix this by avoiding updating other > > > slot fields (like two_phase, two_phase_at) if restart_lsn or > > > catalog_xmin of the local slot is ahead of the remote slot? > > > > Since the fix for xmin advancement during fast-forward decoding has > > been pushed (commit aaf9e95), I'm attaching the V2 patch to address > > the assert failure by avoiding updating other slot fields (like > > two_phase, two_phase_at) if restart_lsn or catalog_xmin of the local slot is > ahead. > > > > The patch looks good to me. We can have minor comment tweak: > > + * Syncing only two_phase_at, without also syncing the latest > + * confirmed_lsn, might lead to transactions between the old > + * confirmed_lsn and two_phase_at being unexpectedly decoded and sent > + * to the subscriber. > > We can append "following a promotion". Thanks for reviewing. Here is V3 patch that addressed it. BTW, I also did some tests to confirm the catalog_xmin could still be ahead in some case, and here is an example: 1. Create a failover replication slot named 'logicalslot' on primary and acquire it in the walsender. 2. Log two standby snapshots on primary. Before logging, call txid_current() To assign a xid, so that each standby snapshot will hold a new xid in its oldestrunningXid field: - txid_current(); - `0/3000420` - `running_xacts` (no running transactions, oldestrunningXid = 755) - txid_current(); - `0/3000488` - `running_xacts` (no running transactions, oldestrunningXid = 756) 3. The walsender sets `0/3000420` as the `candidate_restart_lsn`, 755 as `candidate_catalog_xmin`, skipping the second `running_xacts` because `candidate_restart_lsn`/`candidate_catalog_xmin` is already valid. 4. After receiving a reply from the apply worker, the walsender assigns `0/3000420` to `restart_lsn`, `755` to `catalog_xmin`. At this point, the replication slot 'logicalslot' has `restart_lsn` set to `0/3000420`, `catalog_xmin` set to `755`. 5. On the standby, execute `pg_sync_replication_slots()` to synchronize 'logicalslot'. 6. During synchronization, with the initial `restart_lsn` at `0/3000420`, the sync slot reaches a consistent point at this position. As a result, it does not update `candidate_restart_lsn` and `candidate_catalog_xmin` at consistent point (refer to `SnapBuildProcessRunningXacts()`). 7. The sync process identifies the second standby snapshot at `0/3000488` and uses its LSN as `candidate_restart_lsn`, and use the oldestrunningXid `756` as `candidate_catalog_xmin`, eventually updating it to `restart_lsn` and `catalog_xmin`. 8. Now, the synced slot holds `restart_lsn` at `0/3000488`, `catalog_xmin` at `756`, which are all ahead of the remote slot on the primary server. Attaching a script to reproduce the same. Note that, to reproduce this stably, we'd better modify the value of LOG_SNAPSHOT_INTERVAL_MS in bgwriter.c to a bigger number to avoid unexpected xl_running_xacts logging. Best Regards, Hou zj
Attachment
pgsql-hackers by date: