Re: Optimize Arm64 crc32c implementation in Postgresql - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Optimize Arm64 crc32c implementation in Postgresql
Date
Msg-id 0E10D81B-4653-429D-AB17-665E0474A00B@yesql.se
Whole thread Raw
In response to Re: Optimize Arm64 crc32c implementation in Postgresql  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Optimize Arm64 crc32c implementation in Postgresql
List pgsql-hackers
> On 03 Apr 2018, at 18:38, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 03/04/18 19:09, Andres Freund wrote:
>> Hi,
>> On 2018-04-03 19:05:19 +0300, Heikki Linnakangas wrote:
>>> On 01/04/18 20:32, Andres Freund wrote:
>>>> On 2018-03-06 02:44:35 +0800, Heikki Linnakangas wrote:
>>>>> * I tested this on Linux, with gcc and clang, on an ARM64 virtual machine
>>>>> that I had available (not an emulator, but a VM on a shared ARM64 server).
>>>>
>>>> Have you seen actual postgres performance benefits with the patch?
>>>
>>> I just ran a small test with pg_waldump, similar to what Abhijit Menon-Sen
>>> ran with the Slicing-by-8 and Intel SSE patches, when we added those
>>> (https://www.postgresql.org/message-id/20141119155811.GA32492%40toroid.org).
>>> I ran pgbench, with scale factor 5, until it had generated about 1 GB of
>>> WAL, and then I ran pg_waldump -z on that WAL. With slicing-by-8, it took
>>> about 7 s, and with the special CPU instructions, about 5 s. 'perf' showed
>>> that the CRC computation took about 30% of the CPU time before, and about
>>> 12% after, which sounds about right. That's not as big a speedup as we saw
>>> with the corresponding Intel SSE instructions back in 2014, but still quite
>>> worthwhile.
>> Cool.  Based on a skim the patch looks reasonable.
>
> Thanks.
>
> I bikeshedded with myself on the naming of things, and decided to use "ARMv8" in the variable and file names, instead
ofARM64 or ARMCE or ARM64CE. The CRC instructions were introduced in ARM v8 (as an optional feature), it's not really
relatedto the 64-bitness, even though the 64-bit instruction set was also introduced in ARM v8. Other than that, and
somecomment fixes, this is the same as the previous patch version. 

Noticed two minor documentation issues in a quick skim of the attached patch:

The following line should say SSE and not SSSE (and the same typo is in
src/include/pg_config.h.in and src/include/pg_config.h.win32).  While not
introduced in this patch, it’s adjacent to the patched codepath and on topic so
may well be fixed while in there.

     AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSSE 4.2 CRC instructions with a
runtimecheck.]) 

The documentation in configure.in use “runtime” rather than "run time” in all
lines except this one:

+# uses the CRC instructions, compile both, and select at run time.

cheers ./daniel

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pgsql: Validate page level checksums in base backups
Next
From: Alvaro Herrera
Date:
Subject: Re: Foreign keys and partitioned tables