Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Msg-id: CAA4eK1JXYcPrJ+pvzsDWHw6shyzWinL0yQNwythaqh75E4QkXA@mail.gmail.com
In response to: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List: pgsql-hackers

On Mon, Sep 8, 2025 at 11:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Sep 5, 2025 at 9:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > >
> > > >
> > > > I tested the behaviour with HEAD and with Patch. And I confirmed the
> > > > change in behaviour between HEAD and Patch
> > > >
> > > > Suppose we have a primary and a standby with wal_level = logical and
> > > > guc parameters to enable slot sync worker are set accordingly. A slot
> > > > sync worker will be running.
> > > > Now we change the value of wal_level for primary to replica. And
> > > > restart the primary server
> > > >
> > > > With HEAD, during restart the existing sync_slot_worker will exit with:
> > > > 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> > > > "" could not connect to the primary server: connection to server at
> > > > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > > 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> > > > receiver "walreceiver" could not connect to the primary server:
> > > > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > > > Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > >
> > > > and after the restart of the primary server, slot sync worker will
> > > > restart and it is able to connect to the primary.
> > > >
> > > > With Patch, during restart the existing sync_slot_worker will exit.
> > > > But after the restart of the primary server, slot sync worker cannot
> > > > start and we can see following log:
> > > > 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> > > > synchronization worker is shutting down on receiving SIGINT
> > > > 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > > 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > >
> > > > So, with HEAD, after we restart the primary server with 'wal_level =
> > > > replica', the slot sync worker can restart and connect to the primary
> > > > but with patch it cannot start after restart due to the check in
> > > > ValidateSlotSyncParams.
> > >
> > > But the slotsync worker is launched again once logical decoding is
> > > enabled, no? I'm not sure that we want to launch the slotsync worker
> > > also when we know logical decoding is not enabled.
> > >
> >
> > Why in the first place the logical_decoding enabled check has failed
> > because IIUC, the wal_level on standby is still 'logical'?
>
> This is because logical decoding on standbys can be used only when the
> standby's effective_wal_level is 'logical', which also means the
> primary's effective_wal_level is 'logical' too. This behavior is
> mostly the same as today; logical decoding on standbys can be used
> only when both the primary and the standbys set wal_level to
> 'logical'. Even if standby's wal_level is set to logical, it doesn't
> mean that incoming WAL records are generated on the primary with the
> information required by logical decoding.
>

This is true, but IIUC Shlok's report says that we are able to restart
the server before the patch and not after it. Am I missing something? If
not, then shouldn't this be fixed separately first?
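
To make the rule under discussion concrete, here is a rough model of it,
based only on the HINT text in Shlok's log and the explanation quoted
above. It is an illustrative sketch, not code from the patch: the
function names effective_wal_level() and slotsync_can_start() are
hypothetical, and it assumes the slot sync check depends solely on
whether logical decoding is enabled on the primary.

```python
def effective_wal_level(wal_level: str, logical_slot_count: int) -> str:
    """Model of the level the primary effectively writes WAL at.

    Assumption taken from the HINT message: logical decoding is enabled
    when wal_level is 'logical', or when wal_level is 'replica' and at
    least one logical slot exists.
    """
    if wal_level == "logical":
        return "logical"
    if wal_level == "replica" and logical_slot_count > 0:
        return "logical"
    # Otherwise the configured level is used as-is.
    return wal_level


def slotsync_can_start(primary_wal_level: str, primary_logical_slots: int) -> bool:
    """Hypothetical stand-in for the logical decoding check reported in
    ValidateSlotSyncParams: the worker starts only if the primary's
    effective level is 'logical', whatever the standby's own setting is."""
    return effective_wal_level(primary_wal_level, primary_logical_slots) == "logical"


# Shlok's scenario: primary restarted with wal_level = replica and,
# presumably, no logical slots, so the worker is refused.
print(slotsync_can_start("replica", 0))
# Creating one logical slot on the primary would re-enable it.
print(slotsync_can_start("replica", 1))
```

Under this model the standby's own wal_level = logical does not help,
which matches the behaviour in the report.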

--
With Regards,
Amit Kapila.


