Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAA4eK1+kCww8VAAmDuUkJVtHCK6-n3Z2JDYMd6Mkwp6mC2ts9w@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Wed, Oct 25, 2023 at 8:49 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 10/9/23 12:30 PM, shveta malik wrote:
> > PFA v22 patch-set. It has below changes:
> >
> > patch 001:
> > 1) Now physical walsender wakes up logical walsender(s) by using a new
> > CV as suggested in [1]
>
> Thanks!
>
> I think that works fine as long as the standby is up and running and catching up.
>
> The problem I see with the current WalSndWaitForStandbyConfirmation() implementation
> is that if the standby is not running then:
>
> +   for (;;)
> +   {
> +       ListCell   *l;
> +       long        sleeptime = -1;
>
> will loop until we reach the "terminating walsender process due to replication timeout" if we
> explicitly want to end with SIGINT or friends.
>
> For example a scenario like:
>
> - standby down
> - pg_recvlogical running
>
> then CTRL-C on pg_recvlogical would not "respond" immediately but when we reach the replication timeout.
>
> So it seems that we should use something like WalSndWait() instead of ConditionVariableTimedSleep() here:
>
> +               /*
> +                * Sleep until other physical walsenders awaken us or until a timeout
> +                * occurs.
> +                */
> +               sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
> +
> +               ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, sleeptime,
> +
WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION);
>
> In that case I think that WalSndWait() should take care of the new CV WalSndCtl->wal_confirm_rcv_cv too.
> The wait on the socket should allow us to stop waiting when, for example, CTRL-C on pg_recvlogical is triggered.
>
> Then we would need to deal with this scenario: Standby down or not catching up and exited WalSndWait() due to the
socket
> to break the loop or shutdown the walsender.
>
> Thoughts?
>

Good point, I think we should enhance the WalSndWait() logic to
address this case. Additionally, I think we should ensure that
WalSndWaitForWal() shouldn't wait twice once for wal_flush and a
second time for wal to be replayed by physical standby. It should be
okay to just wait for Wal to be replayed by physical standby when
applicable, otherwise, just wait for Wal to flush as we are doing now.
Does that make sense?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Open a streamed block for transactional messages during decoding
Next
From: Andrei Lepikhov
Date:
Subject: Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements