Home > mailing lists

Re: performance issue in remove_from_unowned_list() - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: performance issue in remove_from_unowned_list()
Date	February 8, 2019 13:27:59
Msg-id	251f20b9-087d-4e1a-0428-f93c55a1bc73@2ndquadrant.com Whole thread Raw
In response to	RE: performance issue in remove_from_unowned_list() ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Responses	Re: performance issue in remove_from_unowned_list()
List	pgsql-hackers

Tree view



On 2/8/19 12:32 PM, Ideriha, Takeshi wrote:
>> From: Tomas Vondra [mailto:tomas.vondra@2ndquadrant.com]
>> But it's a bit funnier, because there's also DropRelationFiles() which does smgrclose on
>> a batch of relations too, and it says this
>>
>>    /*
>>     * Call smgrclose() in reverse order as when smgropen() is called.
>>     * This trick enables remove_from_unowned_list() in smgrclose()
>>     * to search the SMgrRelation from the unowned list,
>>     * with O(1) performance.
>>     */
>>    for (i = ndelrels - 1; i >= 0; i--)
>>    ...
>>
>> but it's called from two places in xact.c only. And if you trigger the issue with 100k x
>> CREATE TABLE + ROLLBACK, and then force a recovery by killing postmaster, you
>> actually get the very same behavior with always traversing the unowned list for some
>> reason. I'm not quite sure why, but it seems the optimization is exactly the wrong thing
>> to do here.
> 
> So when DropRelationFiles() is called, order of calling smgr_close()
and order of unowed list is always same?
> 

Well, let's say you create 10k tables in a single transaction, and then
kill the server with "kill -9" right after the commit. Then on restart,
during recovery this happens

2019-02-08 14:16:21.781 CET [12817] LOG:  remove_from_unowned_list 10015
2019-02-08 14:16:21.871 CET [12817] LOG:  remove_from_unowned_list 10005
2019-02-08 14:16:21.967 CET [12817] LOG:  remove_from_unowned_list 10004
2019-02-08 14:16:22.057 CET [12817] LOG:  remove_from_unowned_list 10001
2019-02-08 14:16:22.147 CET [12817] LOG:  remove_from_unowned_list 10000
2019-02-08 14:16:22.238 CET [12817] LOG:  remove_from_unowned_list 9999
2019-02-08 14:16:22.327 CET [12817] LOG:  remove_from_unowned_list 9998
2019-02-08 14:16:22.421 CET [12817] LOG:  remove_from_unowned_list 9996
2019-02-08 14:16:22.513 CET [12817] LOG:  remove_from_unowned_list 9995
2019-02-08 14:16:22.605 CET [12817] LOG:  remove_from_unowned_list 9994
2019-02-08 14:16:22.696 CET [12817] LOG:  remove_from_unowned_list 9993
...
2019-02-08 14:19:13.921 CET [12817] LOG:  remove_from_unowned_list 8396
2019-02-08 14:19:14.025 CET [12817] LOG:  remove_from_unowned_list 8395
2019-02-08 14:19:14.174 CET [12817] LOG:  remove_from_unowned_list 8394
2019-02-08 14:19:14.277 CET [12817] LOG:  remove_from_unowned_list 8393
2019-02-08 14:19:14.387 CET [12817] LOG:  remove_from_unowned_list 8392
2019-02-08 14:19:14.508 CET [12817] LOG:  remove_from_unowned_list 8391
2019-02-08 14:19:14.631 CET [12817] LOG:  remove_from_unowned_list 8390
2019-02-08 14:19:14.770 CET [12817] LOG:  remove_from_unowned_list 8389
...

This is with the attached debug patch, which simply prints index of the
unowned list index entry. And yes, it took ~3 minutes to get from 10k to
8k (at which point I interrupted the recovery and killed the cluster).

I see similar issue after creating a lot of tables in the same xact and
rolling it back, although in this case it's faster for some reason.

> This one was inroduced at b4166911 and I'd like to hear author and reviewer's opinion. 
> https://www.postgresql.org/message-id/CAHGQGwH0hwXwrCDnmUU2Twj5YgHcrmMCVD7O%3D1NrRTpHcbtCBw%40mail.gmail.com
> 

That probably explains why we haven't seen the issue before.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

smgr-debug.patch

pgsql-hackers by date:

From: "MauMau"
Date: 08 February 2019, 13:27:57
Subject: Re: Protect syscache from bloating with negative cache entries

From: Raúl Marín
Date: 08 February 2019, 13:30:51
Subject: Re: [PATCH] pgbench tap tests fail if the path contains a perl special character

Re: performance issue in remove_from_unowned_list() - Mailing list pgsql-hackers

Attachment

Previous

Next