Re: dynamic shared memory and locks - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: dynamic shared memory and locks |
Date | |
Msg-id | CA+TgmoZUdX2=7eou10gig5b=DTPm0foXCx4n9VGPw70pBe9GTQ@mail.gmail.com |
In response to | Re: dynamic shared memory and locks (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: dynamic shared memory and locks |
List | pgsql-hackers |
On Mon, Jan 6, 2014 at 9:48 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> None of these ideas are a complete solution for LWLOCK_STATS. In the
>> other three cases noted above, we only need an identifier for the lock
>> "instantaneously", so that we can pass it off to the logger or dtrace
>> or whatever. But LWLOCK_STATS wants to hold on to data about the
>> locks that were visited until the end of the session, and it does that
>> using an array that is *indexed* by lwlockid. I guess we could
>> replace that with a hash table. Ugh. Any suggestions?
>
> Yeah, that's not fun. No good suggestions here offhand.

Replacing it with a hash table turns out not to be too bad, either in terms of code complexity or performance, so I think that's the way to go. I did some test runs with pgbench -S, scale factor 300, 32 clients, shared_buffers=8GB, five-minute runs, and got these results:

resultsr.lwlock-stats.32.300.300:tps = 195493.037962 (including connections establishing)
resultsr.lwlock-stats.32.300.300:tps = 189985.964658 (including connections establishing)
resultsr.lwlock-stats.32.300.300:tps = 197641.293892 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 193286.066063 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 191832.100877 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 191899.834211 (including connections establishing)
resultsr.master.32.300.300:tps = 197939.111998 (including connections establishing)
resultsr.master.32.300.300:tps = 198641.337720 (including connections establishing)
resultsr.master.32.300.300:tps = 198675.404349 (including connections establishing)

"master" is the master branch, commit 10a82cda67731941c18256e009edad4a784a2994. "lwlock-stats" is the same, but with LWLOCK_STATS defined. "lwlock-stats-htab" is the same, with the attached patch and LWLOCK_STATS defined. The runs were interleaved, but the results are shown here grouped by configuration.

If we assume that the 189k result is an outlier, then there's probably some regression associated with the lwlock-stats-htab patch, but not a lot. Considering that this code isn't even compiled unless you have LWLOCK_STATS defined, I think that's OK.

This is only part of the solution, of course: a complete solution will involve making the hash table key something other than the lock ID. What I'm thinking we can do is make the lock ID consist of two unsigned 32-bit integers. One of these will be stored in the lwlock itself, which, if my calculations are correct, won't increase the size of LWLockPadded on any common platforms (a 64-bit integer would). Let's call this the "tranche ID". The other will be derived from the LWLock's address. Let's call this the "instance ID". Tranche IDs will be assigned consecutively starting with 0, and we'll keep an array of metadata for tranches, indexed by tranche ID; each entry will have three associated pieces of information: an array base, a stride length, and a printable name. When we need to identify an lwlock in the log or to dtrace, we'll fetch the tranche ID from the lwlock itself and use it to index into the tranche metadata array. We'll then take the address of the lwlock, subtract the array base address for the tranche, and divide by the stride length; the result is the instance ID. When reporting to the user, we can report either the tranche ID directly or the associated name for that tranche; in either case, we'll also report the instance ID.
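To make that concrete, here is a minimal sketch of the scheme; all the names in it (lwlock, lwlock_tranche, lwlock_identify, MAX_TRANCHES) are hypothetical stand-ins for illustration, not the actual patch:

/*
 * Minimal sketch of the tranche scheme described above; every name here
 * is a hypothetical stand-in, not the actual patch.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct lwlock
{
    uint32_t tranche_id;        /* 32 bits stored in the lock itself */
    uint32_t state;             /* stand-in for the real lock state */
} lwlock;

typedef struct lwlock_tranche
{
    const char *name;           /* printable name for log/dtrace output */
    const void *array_base;     /* address of the tranche's first lock */
    size_t      stride;         /* byte distance between consecutive locks */
} lwlock_tranche;

/* Tranche metadata, indexed by tranche IDs assigned consecutively from 0. */
#define MAX_TRANCHES 16
static lwlock_tranche tranches[MAX_TRANCHES];

/*
 * Name a lock: fetch the tranche ID stored in the lock, then derive the
 * instance ID from the lock's address.  Just two memory accesses plus a
 * little pointer arithmetic.
 */
static void
lwlock_identify(const lwlock *lock, const char **name, uint32_t *instance_id)
{
    const lwlock_tranche *t = &tranches[lock->tranche_id];

    *name = t->name;
    *instance_id = (uint32_t) (((const char *) lock -
                                (const char *) t->array_base) / t->stride);
}

int
main(void)
{
    static lwlock main_locks[8];        /* stand-in for the main LWLock
                                         * array; tranche_id is already 0 */
    const char *name;
    uint32_t    instance;

    tranches[0] = (lwlock_tranche) { "main LWLock array",
                                     main_locks, sizeof(lwlock) };

    lwlock_identify(&main_locks[3], &name, &instance);
    printf("%s %u\n", name, instance);  /* prints "main LWLock array 3" */
    return 0;
}

Note that only the tranche ID has to live in the lock; the instance ID falls out of pointer arithmetic at reporting time, which is why the lock itself doesn't grow beyond one extra 32-bit field.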
So initially we'll probably just have tranche 0: the main LWLock array. If we move buffer content and I/O locks to the buffer headers, we'll define tranche 1 and tranche 2 with the same base address (the start of the buffer descriptor array) and the same stride length (the size of a buffer descriptor). One will have the associated name "buffer content lock" and the other "buffer I/O lock" (see the sketch below). If we want, we can split the main LWLock array into several tranches so that we can more easily identify lock manager locks, predicate lock manager locks, and buffer mapping locks.

I like this system because it's very cheap - we only need a small array of metadata and a couple of memory accesses to name a lock - but it still lets us report data in a way that's actually *more* human-readable than what we have now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
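A hypothetical continuation of the earlier sketch shows why two tranches can share a base address and stride: the truncating division in lwlock_identify discards a lock's offset within the descriptor, so both locks of buffer N report instance N. The buf_desc layout and register_buffer_tranches helper are illustrative assumptions, not the actual buffer header:

/*
 * Continuation of the sketch above (lwlock, lwlock_tranche, and tranches[]
 * as before).  buf_desc is an illustrative stand-in for a buffer
 * descriptor that holds both of its locks inline.
 */
typedef struct buf_desc
{
    uint32_t tag;               /* stand-in for the other header fields */
    lwlock   content_lock;      /* tranche 1: "buffer content lock" */
    lwlock   io_lock;           /* tranche 2: "buffer I/O lock" */
} buf_desc;

static buf_desc buffers[8];     /* stand-in for the descriptor array */

static void
register_buffer_tranches(void)
{
    /* Same base and stride for both tranches; only the name differs. */
    tranches[1] = (lwlock_tranche) { "buffer content lock",
                                     buffers, sizeof(buf_desc) };
    tranches[2] = (lwlock_tranche) { "buffer I/O lock",
                                     buffers, sizeof(buf_desc) };

    /* Each lock carries its own tranche ID. */
    for (int i = 0; i < 8; i++)
    {
        buffers[i].content_lock.tranche_id = 1;
        buffers[i].io_lock.tranche_id = 2;
    }
}

With this registration, identifying buffers[3].io_lock would yield "buffer I/O lock 3" and buffers[3].content_lock would yield "buffer content lock 3", since the intra-descriptor offsets vanish in the division by sizeof(buf_desc).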