Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers
From | Alexey Kondratov |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date | |
Msg-id | f4cf3883-5ce2-6829-bb42-3038bfc7b4d9@postgrespro.ru |
In response to | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
List | pgsql-hackers |
Hi Tomas,

> I'm a bit confused by the changes to TAP tests. Per the patch summary,
> some .pl files get renamed (nor sure why), a new one is added, etc.

I added a new TAP test case, set the streaming=true option inside the old stream_* tests, and shifted the streaming test numbers by +2 because of the collision between 009_matviews.pl / 009_stream_simple.pl and 010_truncate.pl / 010_stream_subxact.pl. At least in the previous version of the patch they were under the same numbers. Nothing special, but for simplicity please find my new TAP test attached separately.

> So
> I've instead enabled streaming subscriptions in all tests, which with
> this patch produces two failures:
>
> Test Summary Report
> -------------------
> t/004_sync.pl (Wstat: 7424 Tests: 1 Failed: 0)
> Non-zero exit status: 29
> Parse errors: Bad plan. You planned 7 tests but ran 1.
> t/011_stream_ddl.pl (Wstat: 256 Tests: 2 Failed: 1)
> Failed test: 2
> Non-zero exit status: 1
>
> So yeah, there's more stuff to fix. But I can't directly apply your
> fixes because the updated patches are somewhat different.

The fixes should apply cleanly to the previous version of your patch. Also, I am not sure it is a good idea to simply enable streaming subscriptions in all tests (e.g. the pre-streaming t/004_sync.pl), since then they no longer exercise the non-streaming code path.

>>> Interesting. Any idea where does the extra overhead in this particular
>>> case come from? It's hard to deduce that from the single flame graph,
>>> when I don't have anything to compare it with (i.e. the flame graph for
>>> the "normal" case).
>>
>> I guess that bottleneck is in disk operations. You can check
>> logical_repl_worker_new_perf.svg flame graph: disk reads (~9%) and
>> writes (~26%) take around 35% of CPU time in summary. To compare,
>> please, see attached flame graph for the following transaction:
>>
>> INSERT INTO large_text
>>     SELECT (SELECT string_agg('x', ',')
>>             FROM generate_series(1, 2000)) FROM generate_series(1, 1000000);
>>
>> Execution Time: 44519.816 ms
>> Time: 98333,642 ms (01:38,334)
>>
>> where disk IO is only ~7-8% in total. So we get very roughly the same
>> ~x4-5 performance drop here. JFYI, I am using a machine with SSD for tests.
>>
>> Therefore, probably you may write changes on receiver in bigger chunks,
>> not each change separately.
>>
> Possibly, I/O is certainly a possible culprit, although we should be
> using buffered I/O and there certainly are not any fsyncs here. So I'm
> not sure why would it be cheaper to do the writes in batches.
>
> BTW does this mean you see the overhead on the apply side? Or are you
> running this on a single machine, and it's difficult to decide?

I run this on a single machine, but the walsender and the apply worker each utilize almost 100% of a CPU the whole time, and on the apply side I/O syscalls take about one third of the CPU time. I am still not sure, but to me this result links the performance drop with a problem on the receiver side.

Writing in batches was just a hypothesis. To validate it, I ran a test with a large transaction consisting of a smaller number of wide rows; that test does not exhibit any significant performance drop even though it was streamed too, so the hypothesis seems plausible. Anyway, I do not have other reasonable ideas beyond that right now.

Regards

--
Alexey Kondratov

Postgres Professional
https://www.postgrespro.com
Russian Postgres Company
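For context, the streaming=true option discussed above is a subscription-level setting. The following is only an illustrative sketch of how such a subscription is created (the publication, subscription, and connection-string names here are hypothetical and are not taken from the patch or the attached test):

    -- Publisher: publish the table used in the test transaction above.
    CREATE PUBLICATION pub_stream FOR TABLE large_text;

    -- Subscriber: request streaming of large in-progress transactions,
    -- so changes are sent before commit rather than only at commit time.
    CREATE SUBSCRIPTION sub_stream
        CONNECTION 'host=publisher dbname=postgres'   -- hypothetical connection string
        PUBLICATION pub_stream
        WITH (streaming = on);

The TAP tests referred to in the discussion enable roughly this option on their test subscriptions, which is why enabling it everywhere leaves the non-streaming path untested.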
Attachment