Re: Replication slot stats misgivings - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Replication slot stats misgivings |
Date | |
Msg-id | CAD21AoDOnGFaqyMn+q-7riRm167zc1in65=uk9g3yrWQ1KzgGg@mail.gmail.com Whole thread Raw |
In response to | Re: Replication slot stats misgivings (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Replication slot stats misgivings
|
List | pgsql-hackers |
On Mon, May 3, 2021 at 10:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. What do > > > > > > you think? Do you have any other ideas for this? > > > > > > > > > > I've been considering some ideas but don't come up with a good one > > > > > yet. It’s just an idea and not tested but how about having > > > > > CreateDecodingContext() register before_shmem_exit() callback with the > > > > > decoding context to ensure that we send slot stats even on > > > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > > > > > > > Is it a good idea to send stats while exiting and rely on the same? I > > > > think before_shmem_exit is mostly used for the cleanup purpose so not > > > > sure if we can rely on it for this purpose. I think we can't be sure > > > > that in all cases we will send all stats, so maybe Vignesh's patch is > > > > sufficient to cover the cases where we avoid losing it in cases where > > > > we would have sent a large amount of data. > > > > > > > > > > Sawada-San, any thoughts on this point? > > > > before_shmem_exit is mostly used to the cleanup purpose but how about > > on_shmem_exit()? pgstats relies on that to send stats at the > > interruption. See pgstat_shutdown_hook(). > > > > Yeah, that is worth trying. Would you like to give it a try? Yes. In this approach, I think we will need to have a static pointer in logical.c pointing to LogicalDecodingContext that we’re using. At StartupDecodingContext(), we set the pointer to the just created LogicalDecodingContext and register the callback so that we can refer to the LogicalDecodingContext on that callback. And at FreeDecodingContext(), we reset the pointer to NULL (however, since FreeDecodingContext() is not called when an error happens we would need to ensure resetting it somehow). But, after more thought, if we have the static pointer in logical.c it would rather be better to have a global function that sends slot stats based on the LogicalDecodingContext pointed by the static pointer and can be called by ReplicationSlotRelease(). That way, we don’t need to worry about erroring out cases as well as interruption cases, although we need to have a new static pointer. I've attached a quick-hacked patch. I also incorporated the change that calls UpdateDecodingStats() at FreeDecodingContext() so that we can send slot stats also in the case where we spilled/streamed changes but finished without commit/abort/prepare record. > I think > it still might not cover the cases where we error out in the backend > while decoding via APIs because at that time we won't exit, maybe for > that we can consider Vignesh's patch. Agreed. It seems to me that the approach of the attached patch is better than the approach using on_shmem_exit(). So if we want to avoid having the new static pointer and function for this purpose we can consider Vignesh’s patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
pgsql-hackers by date: