Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers
| From | Japin Li |
|---|---|
| Subject | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
| Date | |
| Msg-id | SYAPR01MB30383D32734FB370337D5FE6B6DCA@SYAPR01MB3038.ausprd01.prod.outlook.com Whole thread Raw |
| List | pgsql-hackers |
On Fri, 28 Nov 2025 at 15:46, Ajin Cherian <itsajin@gmail.com> wrote:
> On Wed, Nov 26, 2025 at 7:42 PM shveta malik <shveta.malik@gmail.com>
> wrote:
>>
>> A few comments:
>>
>> 1)
>> +/*
>> + * Structure holding parameters that need to be freed on error in
>> + * pg_sync_replication_slots()
>> + */
>> +typedef struct SlotSyncApiFailureParams
>> +{
>> + WalReceiverConn *wrconn;
>> + List *slot_names;
>> +} SlotSyncApiFailureParams;
>> +
>>
>> We can get rid of it now as we do not use it.
>>
>
> Removed.
>
>> 2)
>> ProcessSlotSyncInterrupts():
>>
>> + if (!AmLogicalSlotSyncWorkerProcess())
>> + ereport(ERROR,
>> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>> + errmsg("cannot continue replication slots synchronization"
>> + " as standby promotion is triggered"));
>> + else
>> + {
>>
>> Can we please reverse the if-else i.e. first worker and then API.
>> Negated if-condition can be avoided in this case.
>>
>
> Changed.
>
>> 3)
>>
>> slotsync_reread_config():
>> + /* Worker-specific check for sync_replication_slots change */
>>
>> Now since we check for both API and worker, this comment is not
>> needed.
>>
>
> Removed.
>
>> 4)
>> - ereport(LOG,
>> - errmsg("replication slot synchronization worker will restart
>> because
>> of a parameter change"));
>>
>> - /*
>> - * Reset the last-start time for this worker so that the postmaster
>> - * can restart it without waiting for
>> SLOTSYNC_RESTART_INTERVAL_SEC.
>> - */
>> - SlotSyncCtx->last_start_time = 0;
>> + ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
>> + errmsg("replication slot synchronization will stop because of a
>> parameter change"));
>>
>> Here, we should retain the same old message for worker i.e. 'worker
>> will restart..'. instead of 'synchronization will stop'. I find the
>> old message better in this case.
>>
>> 5)
>> slotsync_reread_config() is slightly difficult to follow.
>> I think in the case of API, we can display a common error message
>> instead of 2 different messages for 'sync_replication_slot change'
>> and
>> the rest of the parameters. We can mark if any of the parameters
>> changed in both 'if' blocks and if the current process has not
>> exited,
>> then at the end based on 'parameter-changed', we can deal with API
>> by
>> giving a common message. Something like:
>>
>> /*
>> * If we have reached here with a parameter change, we must be
>> running
>> * in SQL function, emit error in such a case.
>> */
>> if (parameter_changed (new variable))
>> {
>> Assert (!AmLogicalSlotSyncWorkerProcess);
>> ereport(ERROR,
>> errmsg("replication slot synchronization will stop because of a
>> parameter change"));
>> }
>>
>
> Fixed as above.
>
> I've addressed the above comments as well as rebased the patch based
> on changes in commit 76b7872 in patch v26
>
1.
Initialize slot_persistence_pending to false (to avoid uninitialized values, or
initialize to true by mistaken) in update_and_persist_local_synced_slot(). This
aligns with the handling of found_consistent_snapshot and remote_slot_precedes
in update_local_synced_slot().
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 20eada3393..c55ba11f17 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -617,6 +617,9 @@ update_and_persist_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid,
bool found_consistent_snapshot = false;
bool remote_slot_precedes = false;
+ if (slot_persistence_pending)
+ *slot_persistence_pending = false;
+
/* Slotsync skip stats are handled in function update_local_synced_slot() */
(void) update_local_synced_slot(remote_slot, remote_dbid,
&found_consistent_snapshot,
2.
This change seems unnecessary。
static void
-slotsync_reread_config(void)
+slotsync_reread_config()
static void
-ProcessSlotSyncInterrupts(void)
+ProcessSlotSyncInterrupts()
3.
Since we are already caching the result of AmLogicalSlotSyncWorkerProcess() in
a local worker variable, how about applying this replacement:
s/if (AmLogicalSlotSyncWorkerProcess())/if (worker)/g?
+ bool worker = AmLogicalSlotSyncWorkerProcess();
+ bool parameter_changed = false;
- Assert(sync_replication_slots);
+ if (AmLogicalSlotSyncWorkerProcess())
+ Assert(sync_replication_slots);
--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.
pgsql-hackers by date: