Hi,
On 2025-01-31 03:30:35 -0800, Dmitry Koterov wrote:
> Debugging some replication lag on a replica when the master node
> experiences heavy writes.
>
> PG "startup recovering" eats up a lot of CPU (like 65 %user and 30 %sys),
> which is a little surprising (what is it doing with all those CPU cycles?
> it looked like WAL replay should be more IO bound than CPU bound?).
>
> Running "perf top -p <pid>", it shows this:
>
> Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.):
> 18178814660 lost: 0/0 drop: 0/0
> Overhead Shared Object Symbol
> 16.63% postgres [.] hash_search_with_hash_value

It'd be interesting to see which call paths lead to
hash_search_with_hash_value.

You said it's a COPY workload, which surprises me a bit, because that should
normally be a bit less sensitive to buffer lookup overhead. Perhaps you have
triggers or such that prevent use of the multi-insert path?

> 5.38% postgres [.] __aarch64_ldset4_sync
> 4.42% postgres [.] __aarch64_cas4_acq_rel
These two suggest that it might be worth compiling with an -march setting
that provides native atomics (armv8.1-a and above, I think).
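
To make that concrete, here is a minimal sketch (not PostgreSQL's actual
atomics code; the demo_* names are made up for illustration). With GCC 10 or
newer on aarch64, building operations like these without an -march that
includes LSE typically produces calls to out-of-line helpers from the
__aarch64_ldset4_* and __aarch64_cas4_* families, which is what the profile
above seems to show. With -march=armv8.1-a the compiler can emit the LSE
instructions inline instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically OR "bits" into *flags and return the
 * previous value. On aarch64 without LSE enabled at compile time, GCC may
 * emit a call to an outline helper (__aarch64_ldset4_* family) here.
 */
static inline uint32_t
demo_fetch_or(volatile uint32_t *flags, uint32_t bits)
{
    return __sync_fetch_and_or(flags, bits);
}

/*
 * Hypothetical helper: compare-and-swap. Returns true if *ptr equalled
 * *expected and was replaced by newval; otherwise stores the current value
 * of *ptr into *expected and returns false. Without LSE this may become a
 * call into the __aarch64_cas4_* helper family.
 */
static inline bool
demo_compare_exchange(volatile uint32_t *ptr, uint32_t *expected,
                      uint32_t newval)
{
    return __atomic_compare_exchange_n(ptr, expected, newval, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Compiling something like this with "gcc -O2 -S", once with and once without
-march=armv8.1-a, and comparing the assembly should show whether the helper
calls go away on your toolchain.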
> Maybe it's a red herring though, but it looks pretty suspicious.
It's unfortunately not too surprising - our buffer mapping table is a pretty
big bottleneck, both because a hash table is just not a good fit for the
buffer mapping table due to the lack of locality, and because dynahash is
a really poor hash table implementation.
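
A rough sketch of the locality point, under loud assumptions: demo_mix is a
stand-in integer mixer (not the hash function PostgreSQL uses), the table
size is arbitrary, and the array layout is there purely as a contrast, not
as a proposed design. It counts how often the entry for block N+1 sits
directly after the entry for block N.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NBUCKETS 1024u     /* hypothetical table size, power of two */

/* Stand-in integer mixer; NOT the hash function PostgreSQL uses. */
static inline uint32_t
demo_mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Bucket for a (relation, block) pair in a hash-based mapping. */
static inline uint32_t
demo_hash_bucket(uint32_t rel, uint32_t block)
{
    return demo_mix(rel * 0x9e3779b9u + block) & (DEMO_NBUCKETS - 1);
}

/* Slot for the same block in an ordered array: neighbours stay adjacent. */
static inline uint32_t
demo_array_slot(uint32_t block)
{
    return block;
}

/*
 * Of n consecutive blocks, count how many land in the slot immediately
 * after their predecessor's slot.
 */
static inline unsigned
demo_adjacent_hits(uint32_t rel, uint32_t first_block, unsigned n,
                   int use_hash)
{
    unsigned hits = 0;

    for (unsigned i = 1; i < n; i++)
    {
        uint32_t prev = use_hash ? demo_hash_bucket(rel, first_block + i - 1)
                                 : demo_array_slot(first_block + i - 1);
        uint32_t cur = use_hash ? demo_hash_bucket(rel, first_block + i)
                                : demo_array_slot(first_block + i);

        if (cur == prev + 1)
            hits++;
    }
    return hits;
}
```

With the ordered layout every one of the n - 1 neighbouring pairs is
adjacent; with a well-mixing hash, sequential block numbers end up scattered
across the table, so replaying a run of neighbouring blocks touches
unrelated cache lines.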
Greetings,
Andres Freund