RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From: shiy.fnst@fujitsu.com
Subject: RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date:
Msg-id: OSZPR01MB6310F013D78D0C67EF7544DAFDB29@OSZPR01MB6310.jpnprd01.prod.outlook.com
In response to: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Önder Kalacı <onderkalaci@gmail.com>)
Responses: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List: pgsql-hackers
On Wed, Mar 1, 2023 9:22 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>
> Hi Andres, Amit, Shi Yu, all
>
> Andres Freund <andres@anarazel.de> wrote on Tue, 28 Feb 2023, 21:39:
> Hi,
>
> On 2023-02-25 16:00:05 +0530, Amit Kapila wrote:
> > On Tue, Feb 21, 2023 at 7:55 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
> > >> I think this overhead seems to be mostly due to the need to perform
> > >> tuples_equal multiple times for duplicate values.
>
> I think more work needs to be done to determine the source of the
> overhead. It's not clear to me why there'd be an increase in tuples_equal()
> calls in the tests upthread.
>
> You are right; looking closely, in fact, most of the time we do far fewer
> tuples_equal() calls with the index scan.
>
> I've done some profiling with perf and created flame graphs for the apply
> worker, with the test described above (case 1, all values are duplicated).
> I used the following commands:
> - perf record -F 99 -p 122555 -g -- sleep 60
> - perf script | ./stackcollapse-perf.pl > out.perf-folded
> - ./flamegraph.pl out.perf-folded > perf_[index|seq]_scan.svg
>
> I attached both flame graphs. I do not see anything specific regarding what
> the patch does; instead, the difference mostly seems to come down to index
> scan vs. sequential scan related functions. As I continue to investigate, I
> thought it might be useful to share the flame graphs so that more
> experienced hackers could comment on the difference.
>
> Regarding my own end-to-end tests: in some runs, the sequential scan is
> indeed faster for case-1. But when I execute "update tbl set a=a+1;" 50
> consecutive times and measure end-to-end performance, I see much better
> results for index scan; only case-1 is on par, which is mostly what I'd
> expect.
>
> Case-1, running the update 50 times and waiting for all changes to be applied
> • index scan: 2 minutes 36 seconds
> • sequential scan: 2 minutes 30 seconds
> Case-2, running the update 50 times and waiting for all changes to be applied
> • index scan: 1 minute 2 seconds
> • sequential scan: 2 minutes 30 seconds
> Case-7, running the update 50 times and waiting for all changes to be applied
> • index scan: 6 seconds
> • sequential scan: 2 minutes 26 seconds
>
> # Result
> The time executing update (the average of 3 runs is taken, the unit is
> milliseconds):
>
> Shi Yu, could it be possible for you to re-run the tests with some more
> runs and share the average? I suspect your test results have a very small
> sample size, and some runs are making the average slightly problematic.
>
> In my tests, I shared the total time, which is probably also fine.

Thanks for your reply. I re-tested (based on
v25_0001_use_index_on_subs_when_pub_rep_ident_full.patch) and took the
average of 100 runs. The results are as follows; the unit is milliseconds.

case1
sequential scan: 1348.57
index scan: 3785.15

case2
sequential scan: 1350.26
index scan: 1754.01

case3
sequential scan: 1350.13
index scan: 1340.97

There was still some degradation in the first two cases, and there are some
differences between our test results. Some information about my test is as
follows.

a. Some parameters specified in postgresql.conf:
shared_buffers = 8GB
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off

b. Executed SQL.
I executed TRUNCATE and INSERT before each UPDATE. I am not sure whether you
did the same or just executed 50 consecutive UPDATEs. If it is the latter,
there would be lots of old tuples, and this might have a bigger impact on
the sequential scan.
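To be concrete, here is a minimal sketch of one round of my "TRUNCATE and
INSERT before each UPDATE" loop; the table name, columns, row count, and the
i % 10 value distribution below are only illustrative assumptions, not the
exact ones from my benchmark script:

-- Illustrative setup: the publisher table uses REPLICA IDENTITY FULL, so the
-- subscriber matches rows by value (sequential scan, or an index scan with
-- this patch).
CREATE TABLE IF NOT EXISTS tbl (a int, b int);

-- Reset the table before each timed UPDATE so old tuples do not accumulate.
TRUNCATE tbl;
-- case1-like data: column "a" holds heavily duplicated values
-- (the distribution and row count are assumptions for illustration).
INSERT INTO tbl SELECT i % 10, i FROM generate_series(1, 1000000) AS i;

-- The UPDATE whose replication time is measured.
UPDATE tbl SET a = a + 1;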
I tried this case (executing 50 consecutive UPDATEs without the TRUNCATE and
INSERT) and also saw that the overhead is smaller than before.

Besides, I looked into the regression of this patch with `gprof`; some
results are below. I think that with a single buffer lock, a sequential scan
can read multiple tuples (see heapgettup()), while an index scan fetches only
one tuple at a time. So in case1, which has lots of duplicate values and
therefore many more tuples to scan, the index scan takes longer.

- results of `gprof`

case1:

master
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 1.37      0.66      0.01   654312     0.00     0.00  LWLockAttemptLock
 0.00      0.73      0.00   573358     0.00     0.00  LockBuffer
 0.00      0.73      0.00    10014     0.00     0.06  heap_getnextslot

patched
  %   cumulative   self              self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 9.70      1.27      0.36  50531459     0.00     0.00  LWLockAttemptLock
 3.23      2.42      0.12 100259200     0.00     0.00  LockBuffer
 6.20      1.50      0.23  50015101     0.00     0.00  heapam_index_fetch_tuple
 4.04      2.02      0.15  50015101     0.00     0.00  index_fetch_heap
 1.35      3.21      0.05     10119     0.00     0.00  index_getnext_slot

case7:

master
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 2.67      0.60      0.02   654582     0.00     0.00  LWLockAttemptLock
 0.00      0.75      0.00   573488     0.00     0.00  LockBuffer
 0.00      0.75      0.00    10014     0.00     0.06  heap_getnextslot

patched
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 0.00      0.12      0.00   241979     0.00     0.00  LWLockAttemptLock
 0.00      0.12      0.00   180884     0.00     0.00  LockBuffer
 0.00      0.12      0.00    10101     0.00     0.00  heapam_index_fetch_tuple
 0.00      0.12      0.00    10101     0.00     0.00  index_fetch_heap
 0.00      0.12      0.00    10119     0.00     0.00  index_getnext_slot

Regards,
Shi Yu