Re: [RFC] Lock-free XLog Reservation from WAL - Mailing list pgsql-hackers

From Zhou, Zhiguo
Subject Re: [RFC] Lock-free XLog Reservation from WAL
Date
Msg-id faea001e-c23c-46b3-aa7f-3d7e5c7a1a68@intel.com
Whole thread Raw
In response to Re: [RFC] Lock-free XLog Reservation from WAL  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Responses Re: [RFC] Lock-free XLog Reservation from WAL
List pgsql-hackers
Good day, Yura!

On 1/10/2025 8:42 PM, Yura Sokolov wrote:
> If you consider hash-table fillrate, than 256 is quite enough for 128 
> concurrent inserters.

The profile of your patch didn't show significant hotspots in the hash 
table functions, so I believe the 256 entries should be enough.

>>
>> I will soon try to evaluate the performance impact of your patch on my 
>> device with the TPCC benchmark and also profile it to see if there are 
>> any changes that could be made to further improve it.
> 
> It would be great. On my notebook (Mac Air M1) I don't see any benefits 
> neither from mine, nor from yours patch ))
> My colleague will also test it on 20 core virtual machine (but 
> backported to v15).
> 

I've tested the performance impact of our patches on an Intel Sapphire 
Rapids device with 480 vCPUs using a HammerDB TPC-C workload (256 VUs). 
The results show a 72.3% improvement (average of 3 rounds, RSD: 1.5%) 
with your patch and a 76.0% boost (average of 3 rounds, RSD: 2.95%) with 
mine, applied to the latest codebase. This optimization is most 
effective on systems with over 64 cores, as our core-scaling experiments 
suggest minimal impact on lower-core setups like your notebook or a 
20-core VM.

> 
>> BTW, do you have a plan to merge this patch to the master branch? Thanks!
> 
> I'm not committer )) We are both will struggle to make something 
> committed for many months ;-)
> 
> BTW, your version could make alike trick for guaranteed atomicity:
> - change XLogRecord's `XLogRecPtr xl_prev` to `uint32 xl_prev_offset` 
> and store offset to prev record's start.
> 
> Since there are two limits:
> 
>      #define XLogRecordMaxSize    (1020 * 1024 * 1024)
>      #define WalSegMaxSize 1024 * 1024 * 1024
> 
> offset to previous record could not be larger than 2GB.
> 
> Yes, it is format change, that some backup utilities will have to adopt.
> But it saves 4 bytes in XLogRecord (that could be spent to store 
> FullTransactionId instead of TransactionId) and it is better compressible.
> 
> And your version than will not need the case when this value is split 
> among two buffers (since MAXALIGN is not less than 4), and PostgreSQL 
> already relies on 4 byte read/write atomicity (in some places even 
> without use of pg_atomic_uint32).
> 
> ----
> 
> regards
> Sokolov Yura aka funny-falcon

Thanks for the great suggestion!

I think we've arrived at a critical juncture where we need to decide 
which patch to move forward with for our optimization efforts. I've 
evaluated the pros and cons of my implementation:

Pros:
-Achieves an additional 4% performance improvement.

Cons:
-Breaks the current convention of XLog insertions.
-TAP tests are not fully passed, requiring time to resolve.
-May necessitate changes to the format and backup tools, potentially 
leading to backward compatibility issues.

Given these considerations, I believe your implementation is superior to 
mine. I'd greatly appreciate it if you could share your insights on this 
matter.

Regards,
Zhiguo




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: question about relation_open
Next
From: Robert Haas
Date:
Subject: Re: Eager aggregation, take 3