Thread: Taking into account syncrep position in flush_lsn reported by apply worker

Taking into account syncrep position in flush_lsn reported by apply worker

From
Arseny Sher
Date:
Hey. Currently synchronous_commit is by default disabled for logical 
apply worker on the
ground that reported flush_lsn includes only locally flushed data so slot
(publisher) preserves everything higher than this, and so in case of 
subscriber
restart no data is lost. However, imagine that subscriber is made highly
available by standby to which synchronous replication is enabled. Then 
reported
flush_lsn is ignorant of this synchronous replication progress, and in 
case of
failover data loss may occur if subscriber managed to ack flush_lsn ahead of
syncrep. Moreover, it is almost silent due to this

     else if (start_lsn < slot->data.confirmed_flush)
     {
         /*
          * It might seem like we should error out in this case, but it's
          * pretty common for a client to acknowledge a LSN it doesn't 
have to
          * do anything for, and thus didn't store persistently, because the
          * xlog records didn't result in anything relevant for logical
          * decoding. Clients have to be able to do that to support 
synchronous
          * replication.
          *
          * Starting at a different LSN than requested might not catch 
certain
          * kinds of client errors; so the client may wish to check that
          * confirmed_flush_lsn matches its expectations.
          */
         elog(LOG, "%X/%X has been already streamed, forwarding to %X/%X",
              LSN_FORMAT_ARGS(start_lsn),
              LSN_FORMAT_ARGS(slot->data.confirmed_flush));

         start_lsn = slot->data.confirmed_flush;
     }


in logical.c

Attached draft patch fixes this by taking into account syncrep progress in

worker reporting.


-- cheers, arseny



Attachment
Sorry for the poor formatting of the message above, this should be better:

Hey. Currently synchronous_commit is disabled for logical apply worker
on the ground that reported flush_lsn includes only locally flushed data
so slot (publisher) preserves everything higher than this, and so in
case of subscriber restart no data is lost. However, imagine that
subscriber is made highly available by standby to which synchronous
replication is enabled. Then reported flush_lsn is ignorant of this
synchronous replication progress, and in case of failover data loss may
occur if subscriber managed to ack flush_lsn ahead of syncrep. Moreover,
it is silent due to this

    else if (start_lsn < slot->data.confirmed_flush)
    {
        /*
         * It might seem like we should error out in this case, but it's
         * pretty common for a client to acknowledge a LSN it doesn't have to
         * do anything for, and thus didn't store persistently, because the
         * xlog records didn't result in anything relevant for logical
         * decoding. Clients have to be able to do that to support synchronous
         * replication.
         *
         * Starting at a different LSN than requested might not catch certain
         * kinds of client errors; so the client may wish to check that
         * confirmed_flush_lsn matches its expectations.
         */
        elog(LOG, "%X/%X has been already streamed, forwarding to %X/%X",
             LSN_FORMAT_ARGS(start_lsn),
             LSN_FORMAT_ARGS(slot->data.confirmed_flush));

        start_lsn = slot->data.confirmed_flush;
    }


in logical.c

Attached draft patch fixes this by taking into account syncrep progress
in worker reporting.
Attachment
On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher <ars@neon.tech> wrote:
>
> Sorry for the poor formatting of the message above, this should be better:
>
> Hey. Currently synchronous_commit is disabled for logical apply worker
> on the ground that reported flush_lsn includes only locally flushed data
> so slot (publisher) preserves everything higher than this, and so in
> case of subscriber restart no data is lost. However, imagine that
> subscriber is made highly available by standby to which synchronous
> replication is enabled. Then reported flush_lsn is ignorant of this
> synchronous replication progress, and in case of failover data loss may
> occur if subscriber managed to ack flush_lsn ahead of syncrep.
>

Won't the same can be achieved by enabling the synchronous_commit
parameter for a subscription?

--
With Regards,
Amit Kapila.




On 8/13/24 06:35, Amit Kapila wrote:
> On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher <ars@neon.tech> wrote:
>>
>> Sorry for the poor formatting of the message above, this should be better:
>>
>> Hey. Currently synchronous_commit is disabled for logical apply worker
>> on the ground that reported flush_lsn includes only locally flushed data
>> so slot (publisher) preserves everything higher than this, and so in
>> case of subscriber restart no data is lost. However, imagine that
>> subscriber is made highly available by standby to which synchronous
>> replication is enabled. Then reported flush_lsn is ignorant of this
>> synchronous replication progress, and in case of failover data loss may
>> occur if subscriber managed to ack flush_lsn ahead of syncrep.
>>
> 
> Won't the same can be achieved by enabling the synchronous_commit
> parameter for a subscription?

Nope, because it would force WAL flush and wait for replication to the
standby in the apply worker, slowing down it. The logic missing
currently is not to wait for the synchronous commit, but still mind its
progress in the flush_lsn reporting.

-- cheers, arseny



Re: Taking into account syncrep position in flush_lsn reported by apply worker

From
Heikki Linnakangas
Date:
On 14/08/2024 16:54, Arseny Sher wrote:
> On 8/13/24 06:35, Amit Kapila wrote:
>> On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher <ars@neon.tech> wrote:
>>>
>>> Sorry for the poor formatting of the message above, this should be 
>>> better:
>>>
>>> Hey. Currently synchronous_commit is disabled for logical apply worker
>>> on the ground that reported flush_lsn includes only locally flushed data
>>> so slot (publisher) preserves everything higher than this, and so in
>>> case of subscriber restart no data is lost. However, imagine that
>>> subscriber is made highly available by standby to which synchronous
>>> replication is enabled. Then reported flush_lsn is ignorant of this
>>> synchronous replication progress, and in case of failover data loss may
>>> occur if subscriber managed to ack flush_lsn ahead of syncrep.
>>
>> Won't the same can be achieved by enabling the synchronous_commit
>> parameter for a subscription?
> 
> Nope, because it would force WAL flush and wait for replication to the
> standby in the apply worker, slowing down it. The logic missing
> currently is not to wait for the synchronous commit, but still mind its
> progress in the flush_lsn reporting.

I think this patch makes sense. I'm not sure we've actually made any 
promises on it, but it feels wrong that the slot's LSN might be advanced 
past the LSN that's been has been acknowledged by the replica, if 
synchronous replication is configured. I see little downside in making 
that promise.

> +    /*
> +     * If synchronous replication is configured, take into account its position.
> +     */
> +    if (SyncRepStandbyNames != NULL && SyncRepStandbyNames[0] != '\0')
> +    {
> +        LWLockAcquire(SyncRepLock, LW_SHARED);
> +        local_flush = Min(local_flush, WalSndCtl->lsn[SYNC_REP_WAIT_FLUSH]);
> +        LWLockRelease(SyncRepLock);
> +    }
> +

Should probably use the SyncStandbysDefined() macro here. Or check 
WalSndCtl->sync_standbys_defined like SyncRepWaitForLSN() does; not sure 
which would be more appropriate here.

Should the synchronous_commit setting also affect this?

Please also check if the docs need to be updated, or if a paragraph 
should be added somewhere on this behavior.

A TAP test case would be nice. Not sure how complicated it will be, but 
if not too complicated, it'd be nice to include it in check-world.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




On Wed, Aug 21, 2024 at 2:25 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 14/08/2024 16:54, Arseny Sher wrote:
> > On 8/13/24 06:35, Amit Kapila wrote:
> >> On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher <ars@neon.tech> wrote:
> >>>
> >>> Sorry for the poor formatting of the message above, this should be
> >>> better:
> >>>
> >>> Hey. Currently synchronous_commit is disabled for logical apply worker
> >>> on the ground that reported flush_lsn includes only locally flushed data
> >>> so slot (publisher) preserves everything higher than this, and so in
> >>> case of subscriber restart no data is lost. However, imagine that
> >>> subscriber is made highly available by standby to which synchronous
> >>> replication is enabled. Then reported flush_lsn is ignorant of this
> >>> synchronous replication progress, and in case of failover data loss may
> >>> occur if subscriber managed to ack flush_lsn ahead of syncrep.
> >>
> >> Won't the same can be achieved by enabling the synchronous_commit
> >> parameter for a subscription?
> >
> > Nope, because it would force WAL flush and wait for replication to the
> > standby in the apply worker, slowing down it. The logic missing
> > currently is not to wait for the synchronous commit, but still mind its
> > progress in the flush_lsn reporting.
>
> I think this patch makes sense. I'm not sure we've actually made any
> promises on it, but it feels wrong that the slot's LSN might be advanced
> past the LSN that's been has been acknowledged by the replica, if
> synchronous replication is configured. I see little downside in making
> that promise.
>

One possible downside of such a promise could be that the publisher
may slow down for sync replication because it has to wait for all the
configured sync_standbys of subscribers to acknowledge the LSN. I
don't know how many applications can be impacted due to this if we do
it by default but if we feel there won't be any such cases or they
will be in the minority then it is okay to proceed with this.

--
With Regards,
Amit Kapila.



Re: Taking into account syncrep position in flush_lsn reported by apply worker

From
Heikki Linnakangas
Date:
On 21/08/2024 09:25, Amit Kapila wrote:
> On Wed, Aug 21, 2024 at 2:25 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> On 14/08/2024 16:54, Arseny Sher wrote:
>>> On 8/13/24 06:35, Amit Kapila wrote:
>>>> On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher <ars@neon.tech> wrote:
>>>>>
>>>>> Sorry for the poor formatting of the message above, this should be
>>>>> better:
>>>>>
>>>>> Hey. Currently synchronous_commit is disabled for logical apply worker
>>>>> on the ground that reported flush_lsn includes only locally flushed data
>>>>> so slot (publisher) preserves everything higher than this, and so in
>>>>> case of subscriber restart no data is lost. However, imagine that
>>>>> subscriber is made highly available by standby to which synchronous
>>>>> replication is enabled. Then reported flush_lsn is ignorant of this
>>>>> synchronous replication progress, and in case of failover data loss may
>>>>> occur if subscriber managed to ack flush_lsn ahead of syncrep.
>>>>
>>>> Won't the same can be achieved by enabling the synchronous_commit
>>>> parameter for a subscription?
>>>
>>> Nope, because it would force WAL flush and wait for replication to the
>>> standby in the apply worker, slowing down it. The logic missing
>>> currently is not to wait for the synchronous commit, but still mind its
>>> progress in the flush_lsn reporting.
>>
>> I think this patch makes sense. I'm not sure we've actually made any
>> promises on it, but it feels wrong that the slot's LSN might be advanced
>> past the LSN that's been has been acknowledged by the replica, if
>> synchronous replication is configured. I see little downside in making
>> that promise.
> 
> One possible downside of such a promise could be that the publisher
> may slow down for sync replication because it has to wait for all the
> configured sync_standbys of subscribers to acknowledge the LSN. I
> don't know how many applications can be impacted due to this if we do
> it by default but if we feel there won't be any such cases or they
> will be in the minority then it is okay to proceed with this.

It only slows down updating the flush LSN on the publisher, which is 
updated quite lazily anyway.

A more serious scenario is if the sync replica crashes or is not 
responding at all. In that case, the flush LSN on the publisher cannot 
advance, and WAL starts to accumulate. However, if a sync replica is not 
responding, that's very painful for the (subscribing) server anyway: all 
commits will hang waiting for the replica. Holding back the flush LSN on 
the publisher seems like a minor problem compared to that.

It would be good to have some kind of an escape hatch though. If you get 
into that situation, is there a way to advance the publisher's flush LSN 
even though the synchronous replica has crashed? You can temporarily 
turn off synchronous replication on the subscriber. That will release 
any COMMITs on the server too. In theory you might not want that, but in 
practice stuck COMMITs are so painful that if you are taking manual 
action, you probably do want to release them as well.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




On Wed, Aug 21, 2024 at 12:40 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 21/08/2024 09:25, Amit Kapila wrote:
> >>
> >> I think this patch makes sense. I'm not sure we've actually made any
> >> promises on it, but it feels wrong that the slot's LSN might be advanced
> >> past the LSN that's been has been acknowledged by the replica, if
> >> synchronous replication is configured. I see little downside in making
> >> that promise.
> >
> > One possible downside of such a promise could be that the publisher
> > may slow down for sync replication because it has to wait for all the
> > configured sync_standbys of subscribers to acknowledge the LSN. I
> > don't know how many applications can be impacted due to this if we do
> > it by default but if we feel there won't be any such cases or they
> > will be in the minority then it is okay to proceed with this.
>
> It only slows down updating the flush LSN on the publisher, which is
> updated quite lazily anyway.
>

But doesn't that also mean that if the logical subscriber is
configured in synchronous_standby_names, then the publisher's
transactions also need to wait for such an update? We do update it
lazily but as soon as the operation is applied to the subscriber the
transaction on publisher will be released, however, IIUC the same
won't be true after the patch.

> A more serious scenario is if the sync replica crashes or is not
> responding at all. In that case, the flush LSN on the publisher cannot
> advance, and WAL starts to accumulate. However, if a sync replica is not
> responding, that's very painful for the (subscribing) server anyway: all
> commits will hang waiting for the replica. Holding back the flush LSN on
> the publisher seems like a minor problem compared to that.
>

Yeah, but as per my understanding that also means holding all the
active work/transactions on the publisher which doesn't sound to be a
minor problem.

> It would be good to have some kind of an escape hatch though. If you get
> into that situation, is there a way to advance the publisher's flush LSN
> even though the synchronous replica has crashed? You can temporarily
> turn off synchronous replication on the subscriber. That will release
> any COMMITs on the server too. In theory you might not want that, but in
> practice stuck COMMITs are so painful that if you are taking manual
> action, you probably do want to release them as well.
>

This will work in the scenario you mentioned.

If the above understanding is correct and you agree that it is not a
good idea to hold back transactions on the publisher then we can think
of a new subscription that allows the apply worker to wait for
synchronous replicas.

--
With Regards,
Amit Kapila.