Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date
Msg-id CAD21AoA77mhD5j2bGR2gJ2TzvBQ+=6ZuepWuzZPKfZJRTpEArg@mail.gmail.com
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 15, 2025 at 10:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sun, Sep 14, 2025 at 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Sep 12, 2025 at 11:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > >
> > > > > For the shutdown sequence, can't we think of resetting effective_wal
> > > > > after a restart?
> > > >
> > > > Does it mean that effective_wal_level keeps 'logical' until the next
> > > > server starts?
> > > >
> > >
> > > Yes, IIUC, effective_wal_level is anyway a derived value based on
> > > current wal_level and presence of logical slots. So, what will be the
> > > impact if it is not accurate at shutdown?
> >
> > I think there won't be an impact at shutdown time. I would rather be
> > concerned that such behavior could confuse users. I think it would not
> > be a rare situation where users enable and disable logical decoding by
> > creating and dropping a temporary slot. If we keep effective_wal_level
> > 'logical' in this case, users would want to somehow disable logical
> > decoding as it could have a negative performance impact.
> >
>
> When user is dropping a temporary slot, we should disable the
> decoding. The lazy behaviour should be for ERROR or session_exit
> cases.

I think it might be worth discussing whether to use lazy behavior in
all cases. There are several advantages:

- It mitigates the risk of connection timeouts during a logical slot
drop or a subscription drop.
- In scenarios involving frequent creation and deletion of logical
slots (such as during initial data synchronization), it could
avoid repeatedly switching logical decoding on and off.

On the other hand, the drawbacks are:

- Users would have to wait for effective_wal_level to drop back to
'replica'.
- It gives the checkpointer more work on top of its checkpointing job.
- It could take longer to disable logical decoding if the
checkpointer is busy with a checkpoint.
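
To make the trade-off concrete, here is a minimal sketch of the lazy
approach as a small state machine. It is only an illustration: the
struct, the field names, and the function names are all hypothetical
and are not taken from the patch, and it ignores locking, WAL records,
and error handling. Dropping the last slot only records a pending
request; a separate step standing in for the checkpointer lowers the
level later, unless a new slot has appeared in the meantime.

```c
#include <stdbool.h>

/* All names below are hypothetical; none of them come from the patch. */
typedef enum { WAL_LEVEL_REPLICA, WAL_LEVEL_LOGICAL } WalLevel;

typedef struct
{
    WalLevel effective_wal_level; /* derived level, as discussed above */
    int      n_logical_slots;     /* number of existing logical slots */
    bool     disable_pending;     /* lazy request left for the checkpointer */
} DecodingState;

/* Creating a slot enables decoding and cancels any pending disable. */
static void
create_logical_slot(DecodingState *st)
{
    st->n_logical_slots++;
    st->disable_pending = false;
    st->effective_wal_level = WAL_LEVEL_LOGICAL;
}

/* Dropping the last slot does not lower the level; it only asks for it. */
static void
drop_logical_slot(DecodingState *st)
{
    if (st->n_logical_slots > 0)
        st->n_logical_slots--;
    if (st->n_logical_slots == 0 &&
        st->effective_wal_level == WAL_LEVEL_LOGICAL)
        st->disable_pending = true;
}

/*
 * Stand-in for the checkpointer: act on a pending request, but only if
 * no logical slot exists by now. Returns true if the level was lowered.
 */
static bool
checkpointer_process_pending(DecodingState *st)
{
    if (!st->disable_pending)
        return false;
    st->disable_pending = false;
    if (st->n_logical_slots > 0)
        return false;
    st->effective_wal_level = WAL_LEVEL_REPLICA;
    return true;
}
```

In this sketch a create/drop/create/drop sequence never lowers the
level in between, which corresponds to the second advantage above, but
effective_wal_level stays 'logical' until the checkpointer step runs,
which corresponds to the first drawback.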

What do you think?

>
> > There would
> > be two ways for users to change it to 'replica': restart the server or
> > create and drop a logical slot again.
> >
>
> If we do the lazy work during the checkpoint then they can perform the
> checkpoint command.

Right.

>
> > On the other hand, for users who
> > dropped a non-temporary logical slot without an error or dropped the
> > non-last temporary slot, logical decoding is disabled without other
> > manual interventions. It could be pretty hard to assess the situation,
> > resulting in having users always checking effective_wal_level after
> > dropping a logical slot and doing extra steps to make the
> > effective_wal_level 'replica'.
> >
>
> When the last slot is dropped, anyway, users won't be able to perform
> any decoding. Do you mean that they want to know whether logical_wal
> is still being recorded? If so, then checking effective_wal_level
> would be the way.

I think the situation users would want to avoid is logical decoding
staying enabled (and therefore writing logical_wal) even when they
don't want to use it, because the system is then paying an unnecessary
cost for writing logical_wal. It would not be a problem if the lazy
behavior ensures that logical decoding is eventually disabled within a
reasonably short time in every case. On the other hand, I think it
would not be a good user experience if users had to restart the server
or perform other manual interventions in some specific scenarios in
order to disable logical decoding.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


