Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager - Mailing list pgsql-hackers
From: Mahendra Singh Thalor
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date:
Msg-id: CAKYtNAre3w8DNo7=Kcg7Pt3hZ6_YJ-Fb7_kBFk9BxOks9vSvNQ@mail.gmail.com
In response to: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager (Mahendra Singh Thalor <mahi6run@gmail.com>)
Responses: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
List: pgsql-hackers
On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>
> On Thu, 6 Feb 2020 at 09:44, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> > >
> > > On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > > > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> > > > > > >>> I think the real question is whether the scenario is common enough to
> > > > > > >>> worry about. In practice, you'd have to be extremely unlucky to be
> > > > > > >>> doing many bulk loads at the same time that all happened to hash to
> > > > > > >>> the same bucket.
> > > > > > >>
> > > > > > >> With a bunch of parallel bulkloads into partitioned tables that really
> > > > > > >> doesn't seem that unlikely?
> > > > > > >
> > > > > > > It increases the likelihood of collisions, but probably decreases the
> > > > > > > number of cases where the contention gets really bad.
> > > > > > >
> > > > > > > For example, suppose each table has 100 partitions and you are
> > > > > > > bulk-loading 10 of them at a time. It's virtually certain that you
> > > > > > > will have some collisions, but the amount of contention within each
> > > > > > > bucket will remain fairly low because each backend spends only 1% of
> > > > > > > its time in the bucket corresponding to any given partition.
> > > > > > >
> > > > > >
> > > > > > I share another result of performance evaluation between current HEAD
> > > > > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > > > > >
> > > > > > Type of table: normal table, unlogged table
> > > > > > Number of child tables : 16, 64 (all tables are located on the same tablespace)
> > > > > > Number of clients : 32
> > > > > > Number of trials : 100
> > > > > > Duration: 180 seconds for each trial
> > > > > >
> > > > > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> > > > > > RAM, NVMe SSD 1.5TB.
> > > > > > Each client loads 10kB of random data across all partitioned tables.
> > > > > >
> > > > > > Here is the result.
> > > > > >
> > > > > > childs | type | target | avg_tps | diff with HEAD
> > > > > > --------+----------+---------+------------+------------------
> > > > > > 16 | normal | HEAD | 1643.833 |
> > > > > > 16 | normal | Patched | 1619.5404 | 0.985222
> > > > > > 16 | unlogged | HEAD | 9069.3543 |
> > > > > > 16 | unlogged | Patched | 9368.0263 | 1.032932
> > > > > > 64 | normal | HEAD | 1598.698 |
> > > > > > 64 | normal | Patched | 1587.5906 | 0.993052
> > > > > > 64 | unlogged | HEAD | 9629.7315 |
> > > > > > 64 | unlogged | Patched | 10208.2196 | 1.060073
> > > > > > (8 rows)
> > > > > >
> > > > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > > > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > > > > > in the 64 child tables case but it didn't seem to affect the tps.
> > > > > >
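As a back-of-the-envelope check on the collision numbers above (a sketch only, assuming the 64 child tables hash uniformly across N_RELEXTLOCK_ENTS = 1024 slots; the patch's real hash of the lock tag may distribute relations differently), the expected number of colliding pairs works out to about 2, which lines up with the 0 ~ 5 slots reported:

-- Expected number of relation pairs sharing a slot, and probability of at
-- least one shared slot, for 64 relations hashed uniformly into 1024 slots.
SELECT
    (64 * 63 / 2.0) / 1024           AS expected_colliding_pairs,
    1 - exp(sum(ln(1 - i / 1024.0))) AS prob_any_collision
FROM generate_series(0, 63) AS i;
-- expected_colliding_pairs ~ 1.97, prob_any_collision ~ 0.87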
> > > > >
> > > > > AFAIU, this resembles the workload that Andres was worried about. I
> > > > > think we should once run this test in a different environment, but
> > > > > considering this to be correct and repeatable, where do we go with
> > > > > this patch especially when we know it improves many workloads [1] as
> > > > > well. We know that on a pathological case constructed by Mithun [2],
> > > > > this causes regression as well. I am not sure if the test done by
> > > > > Mithun really mimics any real-world workload as he has tested by
> > > > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > > > >
> > > > > Sawada-San, if you have a script or data for the test done by you,
> > > > > then please share it so that others can also try to reproduce it.
> > > >
> > > > Unfortunately the environment I used for performance verification is
> > > > no longer available.
> > > >
> > > > I agree to run this test in a different environment. I've attached the
> > > > rebased version patch. I'm measuring the performance with/without
> > > > patch, so will share the results.
> > > >
> > >
> > > Thanks, Sawada-san, for the patch.
> > >
> > > For the last few days I have been reading this thread and reviewing the v13 patch. To debug and test, I rebased the v13 patch and compared my rebased patch with the v14 patch. I think the ordering of header files is not alphabetical in the v14 patch. (I haven't reviewed the v14 patch fully, because before reviewing I wanted to test false sharing.) While debugging, I didn't notice any hang or lock-related issue.
> > >
> > > I did some testing to check false sharing (bulk insert, COPY data, bulk insert into partitioned tables). Below is the testing summary.
> > >
> > > Test setup(Bulk insert into partition tables):
> > > autovacuum=off
> > > shared_buffers=512MB -c max_wal_size=20GB -c checkpoint_timeout=12min
> > >
> > > Basically, I created a table with 13 partitions and inserted bulk data using pgbench. I used the pgbench command below:
> > > ./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1 -f insert3.sql@1 -f insert4.sql@1 postgres
> > >
> > > I took scripts from previous mails and modified them. For reference, I am attaching the test scripts. I tested with the default of 1024 slots (N_RELEXTLOCK_ENTS = 1024).
> > >
> > > Clients HEAD (tps) With v14 patch (tps) %change (time: 180s)
> > > 1 92.979796 100.877446 +8.49 %
> > > 32 392.881863 388.470622 -1.12 %
> > > 56 551.753235 528.018852 -4.30 %
> > > 60 648.273767 653.251507 +0.76 %
> > > 64 645.975124 671.322140 +3.92 %
> > > 66 662.728010 673.399762 +1.61 %
> > > 70 647.103183 660.694914 +2.10 %
> > > 74 648.824027 676.487622 +4.26 %
> > >
> > > From the above results, we can see that in most cases TPS increased slightly with the v14 patch. I am still testing and will post my results.
> > >
> >
> > The numbers at the 56 and 74 client counts seem slightly suspicious. Can
> > you please repeat those tests? Basically, I am not able to come up
> > with a theory why at 56 clients the performance with the patch is a
> > bit lower and then at 74 it is higher.
>
> Okay. I will repeat the test.
I re-tested on a different machine because the results on the previous machine were inconsistent.
My testing machine:
$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 24
NUMA node(s): 4
Model: IBM,8286-42A
L1d cache: 64K
L1i cache: 32K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-47
NUMA node1 CPU(s): 48-95
NUMA node2 CPU(s): 96-143
NUMA node3 CPU(s): 144-191
./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1 -f insert3.sql@1 -f insert4.sql@1 postgres

Clients    HEAD (tps)     With v14 patch (tps)    %change (time: 180s)
1          41.491486      41.375532               -0.27%
32         335.138568     330.028739              -1.52%
56         353.783930     360.883710              +2.00%
60         341.741925     359.028041              +5.05%
64         338.521730     356.511423              +5.13%
66         339.838921     352.761766              +3.80%
70         339.305454     353.658425              +4.23%
74         332.016217     348.809042              +5.05%
From the above results, it seems that there is very little difference with the patch (within +/-5%), which could be run-to-run variation.
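For reference, the attached scripts are not included inline; a minimal sketch of the kind of setup and per-transaction insert used here could look like the following (table and column names are hypothetical, and hash partitioning is only an assumption; the actual insert1.sql .. insert4.sql may differ):

-- Hypothetical 13-partition setup for the bulk-insert test.
CREATE TABLE test_tbl (key int, val text) PARTITION BY HASH (key);

DO $$
BEGIN
    FOR i IN 0..12 LOOP
        EXECUTE format(
            'CREATE TABLE test_tbl_p%s PARTITION OF test_tbl
                 FOR VALUES WITH (MODULUS 13, REMAINDER %s)', i, i);
    END LOOP;
END $$;

-- Body of one pgbench script (e.g. insert1.sql): each transaction inserts a
-- batch of rows with random keys so the load spreads across the partitions
-- and keeps extending them.
INSERT INTO test_tbl
SELECT (random() * 1000000)::int, repeat(md5(random()::text), 10)
FROM generate_series(1, 100);

With four such scripts passed to pgbench via -f (each with weight @1), pgbench picks one of them at random for every transaction.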
> >
> > > I want to test the extension lock by blocking use of the FSM (use_fsm=false in the code). I think that if we block use of the FSM, the load on the extension lock will increase. Is this the correct way to test?
> > >
> >
> > Hmm, I think instead of directly hacking the code, you might want to
> > use the operation (probably cluster or vacuum full) where we set
> > HEAP_INSERT_SKIP_FSM. I think along with this you can try with
> > unlogged tables because that might stress the extension lock.
>
> Okay. I will test.
I also tested with unlogged tables. There, too, I got a 3-6% gain in TPS.
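A rough sketch of that unlogged-table test, with illustrative names (per the suggestion above, CLUSTER and VACUUM FULL rewrite the heap with HEAP_INSERT_SKIP_FSM, so the rewrite extends the new relation without consulting the FSM):

-- Unlogged table: no WAL, so relation extension is a larger share of the work.
CREATE UNLOGGED TABLE test_unlogged (key int, val text);

INSERT INTO test_unlogged
SELECT g, repeat(md5(random()::text), 10)
FROM generate_series(1, 1000000) AS g;

-- Rewrites that skip the FSM while extending the new heap:
VACUUM FULL test_unlogged;

CREATE INDEX test_unlogged_key_idx ON test_unlogged (key);
CLUSTER test_unlogged USING test_unlogged_key_idx;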
> >
> > In the above test, you might want to test with a higher number of
> > partitions (say up to 100) as well. Also, see if you want to use the
> > Copy command.
>
> Okay. I will test.
I tested with 500, 1000 and 2000 partitions. I observed at most about a 5% change in TPS, and there was no performance degradation.
For example, I created a table with 2000 partitions and then checked for false sharing.
Slot Number | Slot Freq. | Slot Number | Slot Freq. | Slot Number | Slot Freq.
------------+------------+-------------+------------+-------------+-----------
        156 |         13 |         973 |         11 |         446 |         10
        627 |         13 |          52 |         10 |         488 |         10
        782 |         12 |         103 |         10 |         501 |         10
        812 |         12 |         113 |         10 |         701 |         10
        192 |         11 |         175 |         10 |         737 |         10
        221 |         11 |         235 |         10 |         754 |         10
        367 |         11 |         254 |         10 |         781 |         10
        546 |         11 |         314 |         10 |         790 |         10
        814 |         11 |         419 |         10 |         833 |         10
        917 |         11 |         424 |         10 |         888 |         10
From the above table, we can see that a total of 13 child tables fall into the same bucket (slot 156), so I did bulk-loading only into those 13 child tables to check TPS under false sharing, but I noticed no performance degradation.
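For anyone who wants to repeat this kind of check, here is a rough sketch. The slot column below uses hashoid(oid) & 1023 purely as a stand-in; the patch's actual hash over the extension lock tag is not reproduced here, so the real slot assignments can differ:

-- Hypothetical 2000-partition setup.
CREATE TABLE big_tbl (key int, val text) PARTITION BY HASH (key);
DO $$
BEGIN
    FOR i IN 0..1999 LOOP
        EXECUTE format(
            'CREATE TABLE big_tbl_p%s PARTITION OF big_tbl
                 FOR VALUES WITH (MODULUS 2000, REMAINDER %s)', i, i);
    END LOOP;
END $$;

-- Group partitions by an assumed slot (hashoid(oid) & 1023, i.e. 1024 slots)
-- to find the most crowded slots and the child tables that share them.
SELECT hashoid(c.oid) & 1023                   AS slot,
       count(*)                                AS num_relations,
       array_agg(c.relname ORDER BY c.relname) AS members
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'big_tbl'::regclass
GROUP BY slot
ORDER BY num_relations DESC
LIMIT 10;

Bulk-loading only into the members of the busiest slot then exercises the worst-case false sharing, as in the slot 156 case above.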
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com