Re: RFC: Lock-free XLog Reservation from WAL - Mailing list pgsql-hackers

From Japin Li
Subject Re: RFC: Lock-free XLog Reservation from WAL
Date
Msg-id ME0P300MB044511A4CE4FDEB163EBD830B6102@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Whole thread Raw
Responses Re: RFC: Lock-free XLog Reservation from WAL
List pgsql-hackers
Hi, Zhigou

Thanks for the patch.

On Thu, 02 Jan 2025 at 09:14, "Zhou, Zhiguo" <zhiguo.zhou@intel.com> wrote:
> Hi all,
>
> I am reaching out to solicit your insights and comments on a recent
> proposal regarding the "Lock-free XLog Reservation from WAL." We have
> identified some challenges with the current WAL insertions, which
> require space reservations in the WAL buffer which involve updating
> two shared-memory statuses in XLogCtlInsert: CurrBytePos (the start
> position of the current XLog) and PrevBytePos (the prev-link to the
> previous XLog). Currently, the use of XLogCtlInsert.insertpos_lck
> ensures consistency but introduces lock contention, hindering the
> parallelism of XLog insertions.
>
> To address this issue, we propose the following changes:
>
> 1. Removal of PrevBytePos: This will allow increments of the CurrBytePos (a single uint64 field) to be implemented
withan atomic operation (fetch_add). 
> 2. Updating Prev-Link of next XLog: Based on the fact that the
> prev-link of the next XLog always points to the head of the current
> Xlog,we will slightly exceed the reserved memory range of the current
> XLog to update the prev-link of the next XLog, regardless of which
> backend acquires the next memory space. The next XLog inserter will
> wait until its prev-link is updated for CRC calculation before
> starting its own XLog copy into the WAL.
> 3. Breaking Sequential Write Convention: Each backend will update the
> prev-link of its next XLog first, then return to the header position
> for the current log insertion. This change will reduce the dependency
> of XLog writes on previous ones (compared with the sequential writes).
> 4. Revised GetXLogBuffer: To support #3, we need update this function to separate the LSN it intends to access from
theLSN it expects to update in the insertingAt field. 
> 5. Increase NUM_XLOGINSERT_LOCKS: With the above changes, increasing NUM_XLOGINSERT_LOCKS, for example to 128, could
effectivelyenhance the parallelism. 
>
> The attached patch could pass the regression tests (make check, make
> check-world), and in the performance test of this POC on SPR (480
> vCPU) shows that this optimization could help the TPCC benchmark
> better scale with the core count and as a result the performance with
> full cores enabled could be improved by 2.04x.
>

I tried it and found it cannot pass all test-cases.  Here is the output
comes from Rocky 9.5.

        mkdir build && cd build
        ../configure \
                --prefix=/tmp/pg \
                --enable-tap-tests \
                --enable-debug \
                --enable-cassert \
                --enable-depend \
                --enable-dtrace \
                --with-icu \
                --with-openssl \
                --with-python \
                --with-libxml \
                --with-libxslt \
                --with-lz4 \
                --with-pam \
                CFLAGS='-Wmissing-prototypes -Wincompatible-pointer-types'
        make -j $(nproc) -s && make install -s
        (cd contrib/ && make -j $(nproc) -s && make install -s)
        make check-world

The error as follows:

echo "# +++ tap check in src/test/recovery +++" && rm -rf '/home/japin/postgresql/build/src/test/recovery'/tmp_check &&
/usr/bin/mkdir-p '/home/japin/postgresql/build/src/test/recovery'/tmp_check && cd
/home/japin/postgresql/build/../src/test/recovery&&
TESTLOGDIR='/home/japin/postgresql/build/src/test/recovery/tmp_check/log'
TESTDATADIR='/home/japin/postgresql/build/src/test/recovery/tmp_check'
PATH="/home/japin/postgresql/build/tmp_install/home/japin/postgresql/build/pg/bin:/home/japin/postgresql/build/src/test/recovery:$PATH"
LD_LIBRARY_PATH="/home/japin/postgresql/build/tmp_install/home/japin/postgresql/build/pg/lib"
INITDB_TEMPLATE='/home/japin/postgresql/build'/tmp_install/initdb-template PGPORT='65432'
top_builddir='/home/japin/postgresql/build/src/test/recovery/../../..'
PG_REGRESS='/home/japin/postgresql/build/src/test/recovery/../../../src/test/regress/pg_regress'/usr/bin/prove -I
/home/japin/postgresql/build/../src/test/perl/-I /home/japin/postgresql/build/../src/test/recovery  t/*.pl 
# +++ tap check in src/test/recovery +++
t/001_stream_rep.pl ................... ok
t/002_archiving.pl .................... 1/? Bailout called.  Further testing stopped:  command "pg_ctl -D
/home/japin/postgresql/build/src/test/recovery/tmp_check/t_002_archiving_standby_data/pgdata-l
/home/japin/postgresql/build/src/test/recovery/tmp_check/log/002_archiving_standby.logpromote" exited with value 1 
FAILED--Further testing stopped: command "pg_ctl -D
/home/japin/postgresql/build/src/test/recovery/tmp_check/t_002_archiving_standby_data/pgdata-l
/home/japin/postgresql/build/src/test/recovery/tmp_check/log/002_archiving_standby.logpromote" exited with value 1 
make[2]: *** [Makefile:28: check] Error 255
make[2]: Leaving directory '/home/japin/postgresql/build/src/test/recovery'
make[1]: *** [Makefile:42: check-recovery-recurse] Error 2
make[1]: Leaving directory '/home/japin/postgresql/build/src/test'
make: *** [GNUmakefile:71: check-world-src/test-recurse] Error 2

--
Regrads,
Japin Li



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Conflict detection for update_deleted in logical replication
Next
From: Amul Sul
Date:
Subject: Re: Allow NOT VALID foreign key constraints on partitioned tables.