Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
Date | |
Msg-id | CAA4eK1+C6JQvf=_oW7=GfVec0KKa5GL8uzMqQ9FtUZ3e2gKBdQ@mail.gmail.com |
In response to | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
List | pgsql-hackers |
On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> >>> I think the real question is whether the scenario is common enough to
> >>> worry about. In practice, you'd have to be extremely unlucky to be
> >>> doing many bulk loads at the same time that all happened to hash to
> >>> the same bucket.
> >>
> >> With a bunch of parallel bulkloads into partitioned tables that really
> >> doesn't seem that unlikely?
> >
> > It increases the likelihood of collisions, but probably decreases the
> > number of cases where the contention gets really bad.
> >
> > For example, suppose each table has 100 partitions and you are
> > bulk-loading 10 of them at a time. It's virtually certain that you
> > will have some collisions, but the amount of contention within each
> > bucket will remain fairly low because each backend spends only 1% of
> > its time in the bucket corresponding to any given partition.
> >
>
> I share another result of performance evaluation between current HEAD
> and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
>
> Type of table: normal table, unlogged table
> Number of child tables : 16, 64 (all tables are located on the same tablespace)
> Number of clients : 32
> Number of trials : 100
> Duration: 180 seconds for each trials
>
> The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> RAM, NVMe SSD 1.5TB.
> Each clients load 10kB random data across all partitioned tables.
>
> Here is the result.
>
>  childs |   type   | target  |  avg_tps   | diff with HEAD
> --------+----------+---------+------------+------------------
>      16 | normal   | HEAD    |   1643.833 |
>      16 | normal   | Patched |  1619.5404 |         0.985222
>      16 | unlogged | HEAD    |  9069.3543 |
>      16 | unlogged | Patched |  9368.0263 |         1.032932
>      64 | normal   | HEAD    |   1598.698 |
>      64 | normal   | Patched |  1587.5906 |         0.993052
>      64 | unlogged | HEAD    |  9629.7315 |
>      64 | unlogged | Patched | 10208.2196 |         1.060073
> (8 rows)
>
> For normal tables, loading tps decreased 1% ~ 2% with this patch
> whereas it increased 3% ~ 6% for unlogged tables. There were
> collisions at 0 ~ 5 relation extension lock slots between 2 relations
> in the 64 child tables case but it didn't seem to affect the tps.
>

AFAIU, this resembles the workload that Andres was worried about. I
think we should once run this test in a different environment, but
considering this to be correct and repeatable, where do we go with
this patch especially when we know it improves many workloads [1] as
well. We know that on a pathological case constructed by Mithun [2],
this causes regression as well. I am not sure if the test done by
Mithun really mimics any real-world workload as he has tested by
making N_RELEXTLOCK_ENTS = 1 to hit the worst case.

Sawada-San, if you have a script or data for the test done by you,
then please share it so that others can also try to reproduce it.

[1] - https://www.postgresql.org/message-id/4c171ffe-e3ee-acc5-9066-a40d52bc5ae9%40postgrespro.ru
[2] - https://www.postgresql.org/message-id/CAD__Oug52j%3DDQMoP2b%3DVY7wZb0S9wMNu4irXOH3-ZjFkzWZPGg%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
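
The quoted results report collisions at 0 ~ 5 lock slots in the 64 child tables case with N_RELEXTLOCK_ENTS = 1024. As a rough sanity check on that figure, the sketch below estimates how many relation pairs would be expected to share a slot. It is only a back-of-the-envelope model, not the patch's code: it assumes relations hash uniformly and independently into the slots, and that only the child tables themselves take relation extension locks. The function names are hypothetical.

```python
from math import comb


def expected_colliding_pairs(n_relations: int, n_slots: int) -> float:
    """Expected number of relation pairs that land in the same slot.

    Assumption: each relation hashes uniformly and independently,
    so any given pair collides with probability 1 / n_slots.
    """
    return comb(n_relations, 2) / n_slots


def prob_any_collision(n_relations: int, n_slots: int) -> float:
    """Probability that at least two relations share a slot (birthday problem)."""
    p_none = 1.0
    for i in range(n_relations):
        # The i-th relation must avoid the i slots already occupied.
        p_none *= (n_slots - i) / n_slots
    return 1.0 - p_none


if __name__ == "__main__":
    n_slots = 1024  # N_RELEXTLOCK_ENTS used in the quoted test
    for n_children in (16, 64):
        pairs = expected_colliding_pairs(n_children, n_slots)
        p_any = prob_any_collision(n_children, n_slots)
        print(f"{n_children} relations: expected colliding pairs = {pairs:.3f}, "
              f"P(any collision) = {p_any:.3f}")
```

Under these assumptions the 64-relation case gives about two expected colliding pairs, which is in line with the reported 0 ~ 5 range, while the 16-relation case would rarely collide at all.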