Thread: Add progressive backoff to XactLockTableWait functions

Add progressive backoff to XactLockTableWait functions

From: Xuneng Zhou

Hi hackers,

This patch implements progressive backoff in XactLockTableWait() and
ConditionalXactLockTableWait().

As Kevin reported in this thread [1], XactLockTableWait() can enter a
tight polling loop during logical replication slot creation on standby
servers, sleeping for fixed 1ms intervals that can continue for a long
time. This creates significant CPU overhead.

The patch implements a time-based threshold approach based on Fujii’s
idea [1]: keep sleeping for 1ms until the total sleep time reaches 10
seconds, then start exponential backoff (doubling the sleep duration
each cycle) up to a maximum of 10 seconds per sleep. This balances
responsiveness for normal operations (which typically complete within
seconds) against CPU efficiency for the long waits in some logical
replication scenarios.
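
For readers who want the schedule spelled out, here is a minimal sketch
of the sleep selection described above. It is not the attached patch:
the function name next_sleep_us() and the three constants are
hypothetical, chosen only to mirror the values quoted in this message.

```c
#include <assert.h>

/* Hypothetical constants mirroring the numbers quoted in the message. */
#define BASE_SLEEP_US     1000L      /* fixed 1 ms sleep before the threshold */
#define BACKOFF_START_US  10000000L  /* start backing off after 10 s of sleep */
#define MAX_SLEEP_US      10000000L  /* cap each sleep at 10 s */

/*
 * Pick the next sleep length in microseconds, given the total time slept
 * so far and the length of the previous sleep.
 */
static long
next_sleep_us(long total_slept_us, long prev_sleep_us)
{
    assert(total_slept_us >= 0 && prev_sleep_us >= 0);

    /* Below the threshold: keep the short fixed sleep. */
    if (total_slept_us < BACKOFF_START_US)
        return BASE_SLEEP_US;

    /* Guard against a caller that never slept before the threshold. */
    if (prev_sleep_us < BASE_SLEEP_US)
        prev_sleep_us = BASE_SLEEP_US;

    /* Doubling would exceed the cap, so clamp to the cap. */
    if (prev_sleep_us >= MAX_SLEEP_US / 2)
        return MAX_SLEEP_US;

    return prev_sleep_us * 2;
}
```

With these values the sleeps after the threshold run 2 ms, 4 ms, and so
on up to 8192 ms, and every sleep after that is 10 s.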

[1] https://www.postgresql.org/message-id/flat/CAM45KeELdjhS-rGuvN%3DZLJ_asvZACucZ9LZWVzH7bGcD12DDwg%40mail.gmail.com

Best regards,
Xuneng

Attachment

Re: Add progressive backoff to XactLockTableWait functions

From: Fujii Masao

On 2025/06/08 23:33, Xuneng Zhou wrote:
> Hi hackers,
>
> This patch implements progressive backoff in XactLockTableWait() and
> ConditionalXactLockTableWait().
>
> As Kevin reported in this thread [1], XactLockTableWait() can enter a
> tight polling loop during logical replication slot creation on standby
> servers, sleeping for fixed 1ms intervals that can continue for a long
> time. This creates significant CPU overhead.
>
> The patch implements a time-based threshold approach based on Fujii’s
> idea [1]: keep sleeping for 1ms until the total sleep time reaches 10
> seconds, then start exponential backoff (doubling the sleep duration
> each cycle) up to a maximum of 10 seconds per sleep. This balances
> responsiveness for normal operations (which typically complete within
> seconds) against CPU efficiency for the long waits in some logical
> replication scenarios.

Thanks for the patch!

When I first suggested this idea, I used 10s as an example for
the maximum sleep time. But thinking more about it now, 10s might
be too long. Even if the target transaction has already finished,
XactLockTableWait() could still wait up to 10 seconds,
which seems excessive.

What about using 1s instead? That value is already used as a max
sleep time in other places, like WaitExceedsMaxStandbyDelay().

If we agree on 1s as the max, then using exponential backoff from
1ms to 1s after the threshold might not be necessary. It might
be simpler and sufficient to just sleep for 1s once we hit
the threshold.

Based on that, I think a change like the following could work well.
Thoughts?

----------------------------------------
         XactLockTableWaitInfo info;
         ErrorContextCallback callback;
         bool            first = true;
+       int             left_till_hibernate = 5000;

<snip>

                 if (!first)
                 {
                         CHECK_FOR_INTERRUPTS();
-                       pg_usleep(1000L);
+
+                       if (left_till_hibernate > 0)
+                       {
+                               pg_usleep(1000L);
+                               left_till_hibernate--;
+                       }
+                       else
+                               pg_usleep(1000000L);
----------------------------------------
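
The counter logic in the snippet above can be exercised outside the
server with a small standalone sketch. The helper names pick_sleep_us()
and nominal_sleep_total_us() are hypothetical, and the sketch only adds
up nominal sleep lengths instead of calling pg_usleep(). It shows that
5000 iterations of 1 ms give about 5 s of nominal sleep before the 1 s
sleeps begin; real elapsed time would be somewhat longer, since each
sleep can overshoot and the loop does other work.

```c
#include <assert.h>
#include <stddef.h>

#define SHORT_SLEEP_US   1000L     /* 1 ms, as in pg_usleep(1000L) */
#define LONG_SLEEP_US    1000000L  /* 1 s, as in pg_usleep(1000000L) */
#define HIBERNATE_AFTER  5000      /* initial value of left_till_hibernate */

/*
 * Return the sleep length for one loop iteration and decrement the
 * counter while short sleeps remain, mirroring the if/else in the diff.
 */
static long
pick_sleep_us(int *left_till_hibernate)
{
    assert(left_till_hibernate != NULL);

    if (*left_till_hibernate > 0)
    {
        (*left_till_hibernate)--;
        return SHORT_SLEEP_US;
    }
    return LONG_SLEEP_US;
}

/* Sum the nominal sleep time over a given number of loop iterations. */
static long
nominal_sleep_total_us(int iterations)
{
    int  left = HIBERNATE_AFTER;
    long total = 0;

    for (int i = 0; i < iterations; i++)
        total += pick_sleep_us(&left);
    return total;
}
```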

Regards,

--
Fujii Masao
NTT DATA Japan Corporation




Re: Add progressive backoff to XactLockTableWait functions

From: Xuneng Zhou

Hi,

Thanks for the feedback!

On Thu, Jun 12, 2025 at 10:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> When I first suggested this idea, I used 10s as an example for
> the maximum sleep time. But thinking more about it now, 10s might
> be too long. Even if the target transaction has already finished,
> XactLockTableWait() could still wait up to 10 seconds,
> which seems excessive.

+1, this could be a problem.

> What about using 1s instead? That value is already used as a max
> sleep time in other places, like WaitExceedsMaxStandbyDelay().

1s should generally be good.

> If we agree on 1s as the max, then using exponential backoff from
> 1ms to 1s after the threshold might not be necessary. It might
> be simpler and sufficient to just sleep for 1s once we hit
> the threshold.

That makes sense to me.

> Based on that, I think a change like the following could work well.
> Thoughts?

I'll update the patch accordingly.

Best regards,
Xuneng

Re: Add progressive backoff to XactLockTableWait functions

From: Xuneng Zhou

Hi,

Although it's clear that replacing tight 1 ms polling loops will reduce
CPU usage, I was curious about the hard numbers. To that end, I ran a
60 s logical replication slot creation workload on a standby with three
different XactLockTableWait() variants, on an 8-core, 16 GB AMD system,
and collected both profiling traces and hardware-counter metrics.


1. Hardware‐counter results


[Figure: image.png, hardware-counter results]

  • CPU cycles drop by 58% when moving from fixed 1 ms sleeps to exponential backoff, and by another 25% with the 1 s threshold variant.
  • Cache‐misses and context‐switches see similarly large reductions.
  • IPC remains around 0.45, dipping slightly under longer sleeps.
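
As a rough cross-check on why the counters fall, the sketch below counts
how many sleeps each strategy would issue during a single wait. It is an
idealized model, not the measured workload: the function count_sleeps()
and the enum are hypothetical, every sleep is assumed to last exactly
its nominal length, the exponential variant is assumed to use the 10 s
threshold and 10 s cap from the first message, and the threshold variant
is assumed to switch to 1 s sleeps after 5000 short ones.

```c
#include <assert.h>

enum wait_variant
{
    FIXED_1MS,               /* always sleep 1 ms */
    EXP_BACKOFF,             /* 1 ms until 10 s slept, then double up to 10 s */
    ONE_SEC_AFTER_THRESHOLD  /* 1 ms for 5000 sleeps, then 1 s each */
};

/* Count the sleeps issued before the nominal elapsed time reaches wait_us. */
static long
count_sleeps(enum wait_variant v, long wait_us)
{
    long elapsed = 0;
    long prev = 0;
    long count = 0;

    assert(wait_us >= 0);

    while (elapsed < wait_us)
    {
        long s = 1000L;     /* default: short 1 ms sleep */

        if (v == EXP_BACKOFF && elapsed >= 10000000L)
            s = (prev >= 5000000L) ? 10000000L : prev * 2;
        else if (v == ONE_SEC_AFTER_THRESHOLD && count >= 5000)
            s = 1000000L;

        elapsed += s;
        prev = s;
        count++;
    }
    return count;
}
```

Under these assumptions a 60 s wait issues 60000 sleeps with the fixed
1 ms loop, 10017 with exponential backoff, and 5055 with the 1 s
threshold variant. These are wakeup counts from the model only and are
not expected to match the measured percentages above.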

2. Flame graphs

See the attached files.

Best regards, 
Xuneng

Attachment

Re: Add progressive backoff to XactLockTableWait functions

From: Andres Freund

Hi,

On 2025-06-08 22:33:39 +0800, Xuneng Zhou wrote:
> This patch implements progressive backoff in XactLockTableWait() and
> ConditionalXactLockTableWait().
> 
> As Kevin reported in this thread [1], XactLockTableWait() can enter a
> tight polling loop during logical replication slot creation on standby
> servers, sleeping for fixed 1ms intervals that can continue for a long
> time. This creates significant CPU overhead.
> 
> The patch implements a time-based threshold approach based on Fujii’s
> idea [1]: keep sleeping for 1ms until the total sleep time reaches 10
> seconds, then start exponential backoff (doubling the sleep duration
> each cycle) up to a maximum of 10 seconds per sleep. This balances
> responsiveness for normal operations (which typically complete within
> seconds) against CPU efficiency for the long waits in some logical
> replication scenarios.

ISTM that this is going the wrong way - the real problem is that we seem to
have extended periods where XactLockTableWait() doesn't actually work, not
that the sleep time is too short.  The sleep in XactLockTableWait() was
intended to address a very short race, not something that's essentially
unbounded.

Greetings,

Andres Freund