Re: [RFC] Lock-free XLog Reservation from WAL - Mailing list pgsql-hackers
From | Yura Sokolov |
---|---|
Subject | Re: [RFC] Lock-free XLog Reservation from WAL |
Date | |
Msg-id | 5f39c77a-f5be-449c-a2dc-e293843e221b@postgrespro.ru Whole thread Raw |
In response to | Re: [RFC] Lock-free XLog Reservation from WAL ("Zhou, Zhiguo" <zhiguo.zhou@intel.com>) |
Responses |
Re: [RFC] Lock-free XLog Reservation from WAL
|
List | pgsql-hackers |
14.01.2025 17:49, Zhou, Zhiguo пишет: > Good day, Yura! > > On 1/10/2025 8:42 PM, Yura Sokolov wrote: >> If you consider hash-table fillrate, than 256 is quite enough for 128 >> concurrent inserters. > > The profile of your patch didn't show significant hotspots in the hash > table functions, so I believe the 256 entries should be enough. > >>> >>> I will soon try to evaluate the performance impact of your patch on >>> my device with the TPCC benchmark and also profile it to see if there >>> are any changes that could be made to further improve it. >> >> It would be great. On my notebook (Mac Air M1) I don't see any >> benefits neither from mine, nor from yours patch )) >> My colleague will also test it on 20 core virtual machine (but >> backported to v15). >> > > I've tested the performance impact of our patches on an Intel Sapphire > Rapids device with 480 vCPUs using a HammerDB TPC-C workload (256 VUs). > The results show a 72.3% improvement (average of 3 rounds, RSD: 1.5%) > with your patch and a 76.0% boost (average of 3 rounds, RSD: 2.95%) with > mine, applied to the latest codebase. This optimization is most > effective on systems with over 64 cores, as our core-scaling experiments > suggest minimal impact on lower-core setups like your notebook or a 20- > core VM. > >> >>> BTW, do you have a plan to merge this patch to the master branch? >>> Thanks! >> >> I'm not committer )) We are both will struggle to make something >> committed for many months ;-) >> >> BTW, your version could make alike trick for guaranteed atomicity: >> - change XLogRecord's `XLogRecPtr xl_prev` to `uint32 xl_prev_offset` >> and store offset to prev record's start. >> >> Since there are two limits: >> >> #define XLogRecordMaxSize (1020 * 1024 * 1024) >> #define WalSegMaxSize 1024 * 1024 * 1024 >> >> offset to previous record could not be larger than 2GB. >> >> Yes, it is format change, that some backup utilities will have to adopt. >> But it saves 4 bytes in XLogRecord (that could be spent to store >> FullTransactionId instead of TransactionId) and it is better >> compressible. >> >> And your version than will not need the case when this value is split >> among two buffers (since MAXALIGN is not less than 4), and PostgreSQL >> already relies on 4 byte read/write atomicity (in some places even >> without use of pg_atomic_uint32). >> >> ---- >> >> regards >> Sokolov Yura aka funny-falcon > > Thanks for the great suggestion! > > I think we've arrived at a critical juncture where we need to decide > which patch to move forward with for our optimization efforts. I've > evaluated the pros and cons of my implementation: > > Pros: > -Achieves an additional 4% performance improvement. > > Cons: > -Breaks the current convention of XLog insertions. > -TAP tests are not fully passed, requiring time to resolve. > -May necessitate changes to the format and backup tools, potentially > leading to backward compatibility issues. > > Given these considerations, I believe your implementation is superior to > mine. I'd greatly appreciate it if you could share your insights on this > matter. Good day, Zhiguo. Excuse me, I feel sneaky a bit, but I've started another thread just about increase of NUM_XLOGINSERT_LOCK, because I can measure its effect even on my working notebook (it is another one: Ryzen 5825U limited to @2GHz). http://postgr.es/m/flat/3b11fdc2-9793-403d-b3d4-67ff9a00d447%40postgrespro.ru ----- regards Yura Sokolov aka funny-falcon
pgsql-hackers by date: