Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager - Mailing list pgsql-hackers
| From | Robert Haas |
| --- | --- |
| Subject | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
| Date | |
| Msg-id | CA+TgmoYCffUw+iT0DKtooNN+vjPSgLHVEWWrTVePSR4hR5OKEw@mail.gmail.com |
| In response to | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager (Masahiko Sawada <sawada.mshk@gmail.com>) |
| Responses | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager; Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
| List | pgsql-hackers |
On Sun, Nov 26, 2017 at 9:33 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Attached latest patch incorporated all comments so far. Please review it.

I think you only need RelExtLockReleaseAll() where we currently have LockReleaseAll(DEFAULT_LOCKMETHOD, ...), not where we have LockReleaseAll(USER_LOCKMETHOD, ...). That's because relation extension locks use the default lock method, not USER_LOCKMETHOD.

You need to update the table of wait events in the documentation. Please be sure to actually build the documentation afterwards and make sure it looks OK. Maybe the wait event name should be RelationExtensionLock rather than just RelationExtension; we are not waiting for the extension itself.

You have a typo/thinko in lmgr/README: "confliction" is not a word. Maybe you mean "When conflicts occur, lock waits are implemented using condition variables."

Instead of having shared and exclusive locks, how about just having exclusive locks and introducing a new primitive operation that waits for the lock to be free and returns without acquiring it? That is essentially what brin_pageops.c is doing by taking and releasing the shared lock, and it's the only caller that takes anything but an exclusive lock. This seems like it would permit a considerable simplification of the locking mechanism, since there would then be only two possible states: 1 (locked) and 0 (not locked). (A sketch of the reduced API follows at the end of this mail.)

In RelExtLockAcquire, I can't endorse this sort of coding:

+        if (relid == held_relextlock.lock->relid &&
+            lockmode == held_relextlock.mode)
+        {
+            held_relextlock.nLocks++;
+            return true;
+        }
+        else
+            Assert(false);    /* cannot happen */

Either convert the Assert() to an elog(), or change the if-statement to an Assert() of the same condition. I'd probably vote for the first one (a sketch follows at the end of this mail). As it is, if that Assert(false) is ever hit, chaos will (maybe) ensue. Let's make sure we nip any such problems in the bud.

"successed" is not a good variable name; that's not an English word.

+    /* Could not got the lock, return iff in conditional locking */
+    if (mustwait && conditional)

Comment contradicts code. The comment is right; the code need not test mustwait, as that's already been done.

The way this is hooked into the shared-memory initialization stuff looks strange in a number of ways:

- Apparently, you're initializing enough space for as many relation extension locks as the size of the main heavyweight lock table, but that seems like overkill. I'm not sure how much space we actually need for relation extension locks, but I bet it's a lot less than we need for regular heavyweight locks.

- The error message emitted when you run out of space also claims that you can fix the issue by raising max_pred_locks_per_transaction, but that setting has no effect on the size of the main lock table or this table.

- The changes to LockShmemSize() suppose that the hash table elements have a size equal to the size of an LWLock, but the actual size is sizeof(RELEXTLOCK). (A one-line sketch follows at the end of this mail.)

- I don't really know why the code for this should be daisy-chained off of the lock.c code instead of being called from CreateSharedMemoryAndSemaphores() just like (almost) all of the other subsystems.

This code ignores the existence of multiple databases; RELEXTLOCK contains a relid, but no database OID. That's easy enough to fix, but it actually causes no problem unless, by bad luck, you have two relations with the same OID in different databases that are both being rapidly extended at the same time -- and even then, it's only a performance problem, not a correctness problem.
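To make that fix concrete, the lock tag could carry the database OID alongside the relation OID; a minimal sketch, with names that are purely illustrative and not taken from the patch:

    typedef struct RelExtLockTag
    {
        Oid        dbid;        /* database OID, so same-OID relations in
                                 * different databases don't collide */
        Oid        relid;       /* relation OID */
    } RelExtLockTag;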
In fact, I wonder if we shouldn't go further: instead of creating these RELEXTLOCK structures dynamically, let's just have a fixed number of them, say 1024. When we get a request to take a lock, hash <dboid, reloid> and take the result modulo 1024; lock the RELEXTLOCK at that offset in the array.
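In code form, the mapping I have in mind is roughly this (the array and function names are made up for illustration, and the tag struct is the one sketched above):

    #define N_RELEXTLOCK_ENTS 1024

    /* Points into shared memory; set up once at startup. */
    static RELEXTLOCK *RelExtLockArray;

    static inline RELEXTLOCK *
    RelExtLockFor(Oid dbid, Oid relid)
    {
        RelExtLockTag tag;
        uint32        hashval;

        tag.dbid = dbid;
        tag.relid = relid;
        hashval = tag_hash(&tag, sizeof(tag));
        return &RelExtLockArray[hashval % N_RELEXTLOCK_ENTS];
    }

Two different <dboid, reloid> pairs can map to the same slot, but as noted above, that's a performance issue at worst, never a correctness issue.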
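Going back to the shared-vs-exclusive point above, the reduced API I'm imagining would be something like this (signatures illustrative only, not what the patch has):

    /* Exclusive acquire/release, as today, minus the lockmode argument. */
    extern void LockRelationForExtension(Relation relation);
    extern bool ConditionalLockRelationForExtension(Relation relation);
    extern void UnlockRelationForExtension(Relation relation);

    /*
     * Wait until no backend holds the extension lock on this relation,
     * then return without acquiring it.  This covers the one thing
     * brin_pageops.c currently uses the shared mode for.
     */
    extern void WaitForRelationExtensionLockToBeFree(Relation relation);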
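And on the RelExtLockAcquire point, the first option would look roughly like this (sketch only; the exact error message is just an example):

    if (relid != held_relextlock.lock->relid ||
        lockmode != held_relextlock.mode)
        elog(ERROR, "relation extension lock held by this backend does not match relation %u",
             relid);
    held_relextlock.nLocks++;
    return true;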
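Likewise, the LockShmemSize() estimate should be based on the real element size, along the lines of the following, where max_relext_locks is a placeholder for whatever entry count we settle on:

    size = add_size(size, hash_estimate_size(max_relext_locks,
                                             sizeof(RELEXTLOCK)));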
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company