Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers

From Japin Li
Subject Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date
Msg-id ME0P300MB0445602461FB0FDED4BC0E58B6C3A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Whole thread Raw
List pgsql-hackers
On Thu, 06 Nov 2025 at 18:53, Ajin Cherian <itsajin@gmail.com> wrote:
> On Tue, Nov 4, 2025 at 5:23 PM shveta malik <shveta.malik@gmail.com> wrote:
>>
>> >
>> > I have addressed the above comments in patch v21.
>> >
>>
>> Thank You. Please find a few comments:
>>
>> 1)
>> + fparams.slot_names = slot_names = NIL;
>>
>> I think it is not needed to set slot_names to NIL.
>>
>> 2)
>> -    WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN);
>> +    WAIT_EVENT_REPLICATION_SLOTSYNC_PRIMARY_CATCHUP);
>>
>> The new name does not seem appropriate. For the slotsync-worker case,
>> even when the primary is not behind, the worker still waits but it is
>> not waiting for primary to catch-up. I could not find a better name
>> except the original one 'WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN'. We can
>> change the explanation to :
>>
>> "Waiting in main loop of slot sync worker and slot sync API."
>> Or
>> "Waiting in main loop of slot synchronization."
>>
>> If anyone has any better name suggestions, we can consider changing.
>
> Changed as suggested above.
>
>>
>> 3)
>>
>> +# Attempt to synchronize slots using API. The API will continue retrying
>> +# synchronization until the remote slot catches up with the locally reserved
>> +# position. The API will not return until this happens, to be able to make
>> +# further calls, call the API in a background process.
>>
>> Shall we remove 'with the locally reserved position', it’s already
>> explained in the test header and the comment is good enough even
>> without it.
>>
>
> Changed.
>
>> 4)
>> +# Confirm log that the slot has been synced after becoming sync-ready.
>>
>> Shall we just say:
>> Confirm from the log that the slot is sync-ready now.
>>
>> 5)
>>  # Synchronize the primary server slots to the standby.
>>  $standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
>> @@ -945,6 +1007,7 @@ $subscriber1->safe_psql('postgres',
>>  is( $standby1->safe_psql(
>>   'postgres',
>>   q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
>> ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
>> +
>>   ),
>>
>> Redundant change.
>
> Removed.
>
> Attaching patch v22 addressing the above comments.

@@ -62,8 +62,8 @@ LOGICAL_APPLY_MAIN    "Waiting in main loop of logical replication apply process."
 LOGICAL_LAUNCHER_MAIN    "Waiting in main loop of logical replication launcher process."
 LOGICAL_PARALLEL_APPLY_MAIN    "Waiting in main loop of logical replication parallel apply process."
 RECOVERY_WAL_STREAM    "Waiting in main loop of startup process for WAL to arrive, during streaming recovery."
-REPLICATION_SLOTSYNC_MAIN    "Waiting in main loop of slot sync worker."
 REPLICATION_SLOTSYNC_SHUTDOWN    "Waiting for slot sync worker to shut down."
+REPLICATION_SLOTSYNC_MAIN    "Waiting in main loop of slot synchronization."
 SYSLOGGER_MAIN    "Waiting in main loop of syslogger process."
 WAL_RECEIVER_MAIN    "Waiting in main loop of WAL receiver process."
 WAL_SENDER_MAIN    "Waiting in main loop of WAL sender process."

I've noticed that all events are sorted alphabetically.  I think we should keep
the order of REPLICATION_SLOTSYNC_MAIN unchanged.

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



pgsql-hackers by date:

Previous
From: Xuneng Zhou
Date:
Subject: Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
Next
From: Nishant Sharma
Date:
Subject: Re: on_error table, saving error info to a table