RE: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Conflict detection for update_deleted in logical replication
Msg-id TY4PR01MB1690751D1CA8C128B0770EC6F9409A@TY4PR01MB16907.jpnprd01.prod.outlook.com
In response to RE: Conflict detection for update_deleted in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses Re: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Monday, September 8, 2025 7:21 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
> 
> On Monday, September 8, 2025 3:13 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Sep 5, 2025 at 5:03 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > Here are v2 patches which addressed above comments.
> > >
> >
> > I have pushed the first patch. I find that the test can't reliably fail without a fix.
> > Can you please investigate it?
> 
> Thank you for catching this issue. I confirmed that the test may have run
> VACUUM before slot.xmin was advanced. Therefore, to improve the test, I
> modified it to wait for the publisher status request message to appear twice,
> because after the fix, the apply worker should keep waiting for publisher
> status until the prepared txn is committed.
> 
> Also, to reduce test time, I moved the test into the existing 035 test.
> 
> Here is the updated test.

I noticed a BF failure[1] on this test. The log shows that the apply worker
advances the non-removable xid to the latest state before waiting for the
prepared transaction to commit. Upon reviewing the log, I found no sign of a
bug in the code. One potential explanation is that the prepared transaction
had not yet reached the injection point when the apply worker requested the
publisher status.

The log does not record when the injection point is triggered; it only
includes:

pub: 2025-09-11 03:40:05.667 CEST [396867][client backend][8/3:0] LOG:  statement: COMMIT PREPARED 'txn_with_later_commit_ts';
..
sub: 2025-09-11 03:40:05.684 CEST [396798][logical replication apply worker][16/0:0] DEBUG:  sending publisher status request message

Although the statement on the publisher is logged before the publisher status
request, the statement log entry is written before the command executes. Thus,
it's possible that the injection point was triggered only after the publisher
had already responded to the status request.

After checking some other TAP tests that use injection points, most of them
ensure the injection point has been reached before proceeding with the test (by
waiting for the injection point's wait event). We could also add this in the
test:

$node_B->wait_for_event('client backend', 'commit-after-delay-checkpoint');
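
To make the suspected ordering concrete, here is a toy model in Python. It is only a sketch and not part of the patch: the function `simulate`, the event names, and the forced "unlucky" schedule are all made up for illustration, and the subscriber side is collapsed into one thread that stands for the test plus the apply worker. Without the wait, the status request can slip in between the statement being logged and the backend reaching the injection point. With the wait, that ordering cannot happen.

```python
import threading


def simulate(wait_for_injection_point):
    """Toy model (hypothetical): return the order of three events."""
    order = []
    order_lock = threading.Lock()
    statement_logged = threading.Event()
    point_reached = threading.Event()
    status_requested = threading.Event()

    def record(event):
        with order_lock:
            order.append(event)

    def backend():
        # The statement is logged before the command starts executing.
        record("statement logged")
        statement_logged.set()
        if not wait_for_injection_point:
            # Force the unlucky schedule: the backend is slow to reach the
            # injection point, so the status request gets in first.
            status_requested.wait()
        record("injection point reached")
        point_reached.set()

    def subscriber_side():
        # Stands for the test script plus the apply worker (a simplification).
        statement_logged.wait()
        if wait_for_injection_point:
            # Models waiting for the injection point's wait event.
            point_reached.wait()
        record("publisher status requested")
        status_requested.set()

    threads = [
        threading.Thread(target=backend),
        threading.Thread(target=subscriber_side),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order
```

Calling `simulate(False)` reproduces the ordering suspected in the BF log (status requested before the injection point is reached), while `simulate(True)` always places the injection point first.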

Here is a small patch.

[1]
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=scorpion&dt=2025-09-11%2001%3A17%3A25&stg=subscription-check

Best Regards,
Hou zj

