Re: Slow standby snapshot - Mailing list pgsql-hackers

From	Michail Nikolaev
Subject	Re: Slow standby snapshot
Date	August 2, 2021 21:07:23
Msg-id	CANtu0ohwA-9QXGA+g_vUTBz9Jcf8UgrPN6LnCvbki9_GY4Cz=g@mail.gmail.com
In response to	Re: Slow standby snapshot (Michail Nikolaev <michail.nikolaev@gmail.com>)
Responses	Re: Slow standby snapshot
List	pgsql-hackers

Hello.

> I have tried such an approach but looks like it is not effective,
> probably because of CPU caching issues.

That was my mistake. I have repeated the approach and got good
results with a small, non-invasive patch.

The main idea is a simple optimistic optimization: store the offset to
the next valid entry. In most cases this lets us skip all the gaps.
Of course, it adds some overhead for workloads without long
(a few seconds) transactions, but it is almost undetectable (because of
CPU caches).
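
To illustrate the idea, here is a minimal sketch, not the attached patch.
It assumes a simplified model: a fixed-size array where entries are only
appended at the head and finished ones are marked invalid in place. All
names (XidArray, next_offset, xid_array_scan and so on) are hypothetical,
and locking and array compression are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 16

/* Hypothetical stand-in for the known-assigned-xids array; not PostgreSQL code. */
typedef struct
{
    uint32_t xids[SLOTS];        /* xids stored in slots [tail, head) */
    bool     valid[SLOTS];       /* false marks a gap left by a finished xid */
    int      next_offset[SLOTS]; /* hint: distance to the next valid slot, >= 1 */
    int      tail;
    int      head;
} XidArray;

static void
xid_array_init(XidArray *a)
{
    a->tail = 0;
    a->head = 0;
}

static void
xid_array_add(XidArray *a, uint32_t xid)
{
    assert(a->head < SLOTS);
    a->xids[a->head] = xid;
    a->valid[a->head] = true;
    a->next_offset[a->head] = 1;    /* no known gap after this slot yet */
    a->head++;
}

static void
xid_array_remove_at(XidArray *a, int slot)
{
    a->valid[slot] = false;         /* leaves a gap; hints are fixed up lazily */
}

/*
 * Copy all valid xids to out[] and return their count. *visited receives
 * the number of slots actually examined. While scanning, the hint of the
 * previous valid slot is refreshed so that the next scan can jump over
 * the gap. Invalid slots never become valid again in this model, so a
 * stored hint can never skip a valid entry.
 */
static int
xid_array_scan(XidArray *a, uint32_t *out, int *visited)
{
    int count = 0;
    int prev = -1;                  /* last valid slot seen so far */
    int i = a->tail;

    *visited = 0;
    while (i < a->head)
    {
        (*visited)++;
        if (a->valid[i])
        {
            if (prev >= 0)
                a->next_offset[prev] = i - prev;
            out[count++] = a->xids[i];
            prev = i;
            i += a->next_offset[i]; /* optimistic jump over known gaps */
        }
        else
            i++;                    /* stale or missing hint: step slot by slot */
    }
    return count;
}
```

In this model the first scan after a burst of removals still walks every
gap slot, but it records the gap lengths, so later scans touch only the
valid slots until new gaps appear.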

* TEST

The following testing setup was used:

max_connections=5000 (taken from real RDS installation)
pgbench -i -s 10 -U postgres -d postgres

# emulate typical OLTP load
pgbench -b simple-update -j 1 -c 50 -n -P 1 -T 18000 -U postgres postgres

# emulate typical cache load on replica
pgbench -b select-only -p 6543 -j 1 -c 50 -n -P 1 -T 18000 -U postgres postgres

# emulate some typical long transactions up to 60 seconds on primary
echo "\set t random(0, 60)
    BEGIN;
    select txid_current();
    select pg_sleep(:t);
    COMMIT;" > slow_transactions.bench
pgbench -f /home/nkey/pg/slow_transactions.bench -p 5432 -j 1 -c 10 -n -P 1 -T 18000 -U postgres postgres

* RESULTS

*REL_13_STABLE* - 23.02% vs 0.76%

non-patched:
  23.02%  postgres            [.] KnownAssignedXidsGetAndSetXmin
   2.56%  postgres            [.] base_yyparse
   2.15%  postgres            [.] AllocSetAlloc
   1.68%  postgres            [.] MemoryContextAllocZeroAligned
   1.51%  postgres            [.] hash_search_with_hash_value
   1.26%  postgres            [.] SearchCatCacheInternal
   1.03%  postgres            [.] hash_bytes
   0.92%  postgres            [.] pg_checksum_block
   0.89%  postgres            [.] expression_tree_walker
   0.81%  postgres            [.] core_yylex
   0.69%  postgres            [.] palloc
   0.68%  [kernel]              [k] do_syscall_64
   0.59%  postgres            [.] _bt_compare
   0.54%  postgres            [.] new_list

patched:
   3.09%  postgres            [.] base_yyparse
   3.00%  postgres            [.] AllocSetAlloc
   2.01%  postgres            [.] MemoryContextAllocZeroAligned
   1.89%  postgres            [.] SearchCatCacheInternal
   1.80%  postgres            [.] hash_search_with_hash_value
   1.27%  postgres            [.] expression_tree_walker
   1.27%  postgres            [.] pg_checksum_block
   1.18%  postgres            [.] hash_bytes
   1.10%  postgres            [.] core_yylex
   0.96%  [kernel]              [k] do_syscall_64
   0.86%  postgres            [.] palloc
   0.84%  postgres            [.] _bt_compare
   0.79%  postgres            [.] new_list
   0.76%  postgres            [.] KnownAssignedXidsGetAndSetXmin

*MASTER* - 6.16% vs ~0%
(includes snapshot scalability optimization by Andres Freund (1))

non-patched:
   6.16%  postgres            [.] KnownAssignedXidsGetAndSetXmin
   3.05%  postgres            [.] AllocSetAlloc
   2.59%  postgres            [.] base_yyparse
   1.95%  postgres            [.] hash_search_with_hash_value
   1.87%  postgres            [.] MemoryContextAllocZeroAligned
   1.85%  postgres            [.] SearchCatCacheInternal
   1.27%  postgres            [.] hash_bytes
   1.16%  postgres            [.] expression_tree_walker
   1.06%  postgres            [.] core_yylex
   0.94%  [kernel]              [k] do_syscall_64

patched:
   3.35%  postgres            [.] base_yyparse
   2.84%  postgres            [.] AllocSetAlloc
   1.89%  postgres            [.] hash_search_with_hash_value
   1.82%  postgres            [.] MemoryContextAllocZeroAligned
   1.79%  postgres            [.] SearchCatCacheInternal
   1.49%  postgres            [.] pg_checksum_block
   1.26%  postgres            [.] hash_bytes
   1.26%  postgres            [.] expression_tree_walker
   1.08%  postgres            [.] core_yylex
   1.04%  [kernel]              [k] do_syscall_64
   0.81%  postgres            [.] palloc

It looks like a significant TPS increase is possible on a very
typical standby workload.
I currently have no environment to measure TPS accurately. Could you
please try it on yours?

I have attached two versions of the patch, one for master and one for
REL_13_STABLE. I am also going to add the patch to the commitfest (2).

Thanks,
Michail.

(1): https://commitfest.postgresql.org/29/2500/
(2): https://commitfest.postgresql.org/34/3271/
