Re: [RFC] Lock-free XLog Reservation from WAL - Mailing list pgsql-hackers

From:           Yura Sokolov
Subject:        Re: [RFC] Lock-free XLog Reservation from WAL
Date:
Msg-id:         aaa0d579-b990-4464-aa6b-eade52fe1cac@postgrespro.ru
In response to: Re: [RFC] Lock-free XLog Reservation from WAL (Japin Li <japinli@hotmail.com>)
Responses:      Re: [RFC] Lock-free XLog Reservation from WAL
List:           pgsql-hackers
22.01.2025 09:09, Japin Li wrote:
> On Sun, 19 Jan 2025 at 17:56, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>> 17.01.2025 17:00, Zhou, Zhiguo wrote:
>>> On 1/16/2025 10:00 PM, Yura Sokolov wrote:
>>>>
>>>> Good day, Zhiguo.
>>>>
>>>> Excuse me, I feel a bit sneaky, but I've started another thread just
>>>> about the increase of NUM_XLOGINSERT_LOCKS, because I can measure its
>>>> effect even on my working notebook (it is another one: a Ryzen 5825U
>>>> limited to 2GHz).
>>>>
>>>> http://postgr.es/m/flat/3b11fdc2-9793-403d-b3d4-67ff9a00d447%40postgrespro.ru
>>>>
>>>> -----
>>>>
>>>> regards
>>>> Yura Sokolov aka funny-falcon
>>>>
>>> Good day, Yura!
>>>
>>> Thank you for keeping me informed. I appreciate your proactive
>>> approach and understand the importance of exploring different angles
>>> for optimization. Your patch is indeed fundamental to our ongoing
>>> work on the lock-free xlog reservation, and I'm eager to see how it
>>> can further enhance our efforts.
>>>
>>> I will proceed to test the performance impact of your latest patch
>>> combined with the lock-free xlog reservation patch. This will help us
>>> determine whether there is potential for additional optimization.
>>> Concurrently, with your permission, I'll try to refine the
>>> hash-table-based implementation for your further review. WDYT?
>>>
>> Good day, Zhiguo.
>>
>> Here's a version of the "hash-table reservation" with both 32-bit and
>> 64-bit operations (depending on PG_HAVE_ATOMIC_U64_SIMULATION, or it
>> may be switched by hand).
>>
>> The 64-bit version uses a different protocol with somewhat fewer
>> atomic operations. I suppose it could be a bit faster, but I can't
>> prove it now.
>>
>> BTW, you wrote:
>>
>>>> Major issue:
>>>> - `SetPrevRecPtr` and `GetPrevRecPtr` do a non-atomic write/read on
>>>>   platforms where MAXALIGN != 8 or without a native 64-bit
>>>>   load/store. The branch with `memcpy` is rather obvious, but even
>>>>   the pointer de-referencing in the "lucky case" is not safe either.
>>>>
>>>> I have no idea how to fix it at the moment.
>>>>
>>> Indeed, non-atomic write/read operations can lead to safety issues in
>>> some situations. My initial thought is to define a bit near the
>>> prev-link to flag the completion of the update. In this way, we could
>>> allow non-atomic or even discontinuous write/read operations on the
>>> prev-link, while still guaranteeing its atomicity through atomic
>>> operations (as well as memory barriers) on the flag bit. What do you
>>> think of this as a viable solution?
>>
>> There is a way to order the operations:
>> - Since SetPrevRecPtr stores the start of the record as an LSN, its
>>   lower 32 bits are certainly non-zero (a record cannot start at the
>>   very beginning of a page).
>> - So SetPrevRecPtr should write the high 32 bits, issue a write
>>   barrier, and then write the lower 32 bits.
>> - GetPrevRecPtr should then first read the lower 32 bits, and if they
>>   are non-zero, issue a read barrier and read the upper 32 bits.
>>
>> This way you will always read a correct prev-rec-ptr on platforms
>> without 64-bit atomics (because MAXALIGN >= 4, and PostgreSQL has
>> required 4-byte atomicity for several years).
>>
> Hi, Yura Sokolov
>
> Thanks for updating the patch.
>
> I tested the v2 patch using BenchmarkSQL with 1000 warehouses, and here
> are the tpmC results:
>
> case                | min        | avg        | max
> --------------------+------------+------------+--------------
> master (patched)    | 988,461.89 | 994,916.50 | 1,000,362.40
> master (44b61efb79) | 857,028.07 | 863,174.59 |   873,856.92
>
> The patch provides a significant improvement.
>
> I just looked through the patch; here are some comments.
>
> 1. The v2 patch can't be applied cleanly:
>
> Applying: Lock-free XLog Reservation using lock-free hash-table
> .git/rebase-apply/patch:33: trailing whitespace.
> .git/rebase-apply/patch:37: space before tab in indent.
> 		{
> .git/rebase-apply/patch:38: space before tab in indent.
> 			int i;
> .git/rebase-apply/patch:39: trailing whitespace.
> .git/rebase-apply/patch:46: space before tab in indent.
> 			LWLockReleaseClearVar(&WALInsertLocks[i].l.lock,
> warning: squelched 4 whitespace errors
> warning: 9 lines add whitespace errors.
>
> 2. There is a typo:
>
> + * PrevLinksHash is a lock-free hash table based on Cuckoo algorith. It is
> + * mostly 4 way: for every element computed two positions h1, h2, and
>
> s/algorith/algorithm/g

Hi, Japin.

Thank you a lot for the measurements and comments.

May I ask you to compare not only against master, but also against a
straight increase of NUM_XLOGINSERT_LOCKS to 128? That way the profit
from the added complexity will be more obvious: does it pay for itself
or not?

-------
regards
Yura
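The split store/load ordering Yura describes could be sketched roughly as below. This is a minimal illustration, not the actual patch code: SplitLSN, set_prev_rec_ptr and get_prev_rec_ptr are hypothetical names, and the barrier macros stand in for PostgreSQL's pg_write_barrier()/pg_read_barrier().

```c
/*
 * Sketch of the split 32-bit publish protocol for platforms without
 * native 64-bit atomics.  The invariant: the low 32 bits of a valid
 * prev-rec LSN are never zero, so a nonzero 'lo' signals that 'hi'
 * has already been written.
 */
#include <stdint.h>

/* Stand-ins for pg_write_barrier()/pg_read_barrier(). */
#define write_barrier() __atomic_thread_fence(__ATOMIC_RELEASE)
#define read_barrier()  __atomic_thread_fence(__ATOMIC_ACQUIRE)

typedef struct
{
	volatile uint32_t hi;		/* upper 32 bits of the prev-rec LSN */
	volatile uint32_t lo;		/* lower 32 bits; nonzero once published */
} SplitLSN;

/* Writer: store the high half first, then the (nonzero) low half. */
static void
set_prev_rec_ptr(SplitLSN *p, uint64_t lsn)
{
	p->hi = (uint32_t) (lsn >> 32);
	write_barrier();			/* 'hi' must become visible before 'lo' */
	p->lo = (uint32_t) lsn;		/* nonzero: a record never starts at a page start */
}

/* Reader: a nonzero low half guarantees the high half is valid. */
static int
get_prev_rec_ptr(SplitLSN *p, uint64_t *lsn)
{
	uint32_t	lo = p->lo;

	if (lo == 0)
		return 0;				/* not published yet */
	read_barrier();				/* pairs with the writer's barrier */
	*lsn = ((uint64_t) p->hi << 32) | lo;
	return 1;
}
```

Each 32-bit store/load is individually atomic on the platforms PostgreSQL supports (4-byte atomicity is assumed), so the barrier pair is all that is needed to make the 64-bit value appear atomic to readers.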