Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAD21AoALaRUZkec7+XL_vFn0=wW8UbObS=FhymUK=zOeHxTMow@mail.gmail.com |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
List | pgsql-hackers |
On Wed, Sep 17, 2025 at 4:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 16, 2025 at 11:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > When user is dropping a temporary slot, we should disable the
> > > decoding. The lazy behaviour should be for ERROR or session_exit
> > > cases.
> >
> > I think it might be worth discussing whether to use lazy behavior in
> > all cases.
> >
>
> Agreed.
>
> > There are several advantages:
> >
> > - It mitigates the risk of connection timeouts during a logical slot
> > drop or a subscription drop.
> > - In scenarios involving frequent creation and deletion of logical
> > slots (such as during initial data synchronization), it could
> > potentially avoid the issue of a frequent switch on and off.
> >
> > On the other hand, drawbacks are:
> >
> > - users would have to wait for effective_wal_level to get decreased to
> > 'replica' somehow.
> > - makes the checkpointer more busy in addition to its checkpointing job.
> > - it could take a longer time to disable logical decoding if the
> > checkpoint is busy with a checkpointing job.
> >
>
> This last point in drawback could hurt performance of systems for a
> longer time when that was really not required. It should be okay to
> use lazy behavior in all cases when we can do that in a predictable
> time.

Agreed. If we use the lazy behavior in ERROR or session_exit cases, we
would have these drawbacks anyway. But assuming that won't happen
frequently in practice, we can live with it.

> The other background process to consider doing lazy processing
> is the launcher whose role is to launch apply workers for subscription
> and maintain a conflict_slot (if required). Now, because disabling
> logical_info could also take longer time in worst cases, the
> launcher's own tasks can become unpredictable. Also, if tomorrow, we
> decide to support dynamically changing wal_level from minimal to some
> upper level, the launcher won't be the appropriate process.

Right. Also, we don't launch the launcher process when
max_logical_replication_workers == 0. It should be >0 on the
subscriber but might not be on the publisher.

>
> The other idea could be to have a new auxiliary process to disable
> logical_info lazily. It is arguable if we just have a separate process
> for this purpose but we have previously discussed some other tasks for
> such a process like removal of old_serialized_snapshots and
> old_logical_ rewrite_map files. See [1]. If we agree to have a
> separate process for this purpose then disabling logical_info in all
> cases sounds okay to me.

Yeah, the custodian worker would be one solution. But please refer to
the subsequent discussions[1][2]; there might not be any tasks to
delegate to the custodian worker other than this logical decoding
deactivation, and it might not be optimal to have a single worker that
is responsible for all custodian work. Actually, we've discussed a
similar idea on this thread, and I drafted a patch[3] that utilizes
bgworkers to do internal tasks in the background in a
one-task-per-one-worker manner. It requires more discussion anyway if
we want to go in this direction.

I think we can start with using lazy behavior in ERROR or session_exit
cases (assuming they won't happen frequently in practice), and consider
using lazy behavior in other cases if it's really preferable.

Regards,

[1] https://www.postgresql.org/message-id/1058306.1680467858%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/20230402184226.kkjplqvqu6utvzbt%40awork3.anarazel.de
[3] https://www.postgresql.org/message-id/CAD21AoCPc%2BpEgb0pJeiS2CU39ad8VW-10Ze7Uii%3D1RRjfgQ0uw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com