Re: [PATCH] Hex-coding optimizations using SVE on ARM. - Mailing list pgsql-hackers

From	Nathan Bossart
Subject	Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date	January 10, 2025 23:46:45
Msg-id	Z4GHNfhRKuA0r_Wn@nathan
In response to	Re: [PATCH] Hex-coding optimizations using SVE on ARM. (Nathan Bossart <nathandbossart@gmail.com>)
Responses	Re: [PATCH] Hex-coding optimizations using SVE on ARM.
List	pgsql-hackers

On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> On Fri, Jan 10, 2025 at 11:10:03AM +0000, Chiranmoy.Bhattacharya@fujitsu.com wrote:
>> We tried auto-vectorization and observed no performance improvement.
> 
> Do you mean that the auto-vectorization worked and you observed no
> performance improvement, or the auto-vectorization had no effect on the
> code generated?

I was able to get auto-vectorization to take effect on Apple clang 16 with
the following addition to src/backend/utils/adt/Makefile:

    encode.o: CFLAGS += ${CFLAGS_VECTORIZE} -mllvm -force-vector-width=8

This gave the following results with your hex_encode_test() function:

    buf  | HEAD  | patch | % diff
  -------+-------+-------+--------
      16 |    21 |    16 |   24
      64 |    54 |    41 |   24
     256 |   138 |   100 |   28
    1024 |   441 |   300 |   32
    4096 |  1671 |  1106 |   34
   16384 |  6890 |  4570 |   34
   65536 | 27393 | 18054 |   34

This doesn't compare with the gains you are claiming to see with
intrinsics, but it's not bad for a one-line change.  I bet there are ways
to adjust the code so that the auto-vectorization is more effective, too.
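
As one illustration of what such an adjustment might look like, here is a
minimal sketch.  It is not code from the patch or from encode.c, and all of
the function names are hypothetical.  It assumes that a table-driven loop is
harder for a compiler to vectorize than one that computes each hex digit
arithmetically, since the latter needs no indexed load:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table-driven encoder, shown only as a baseline. */
static const char hextbl[] = "0123456789abcdef";

size_t
hex_encode_lookup(const unsigned char *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = hextbl[src[i] >> 4];       /* high nibble */
        dst[2 * i + 1] = hextbl[src[i] & 0xF];  /* low nibble */
    }
    return len * 2;
}

/* Map 0..15 to '0'..'9','a'..'f' with arithmetic only (no table load). */
static inline char
nibble_to_hex(unsigned char n)
{
    return (char) (n + '0' + (n > 9) * ('a' - '0' - 10));
}

/*
 * Hypothetical arithmetic variant.  "restrict" tells the compiler that src
 * and dst do not overlap, which removes one obstacle to vectorization.
 */
size_t
hex_encode_arith(const unsigned char *restrict src, size_t len,
                 char *restrict dst)
{
    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = nibble_to_hex(src[i] >> 4);
        dst[2 * i + 1] = nibble_to_hex(src[i] & 0xF);
    }
    return len * 2;
}

/* Confirms that both variants produce the same lowercase hex output. */
void
hex_encode_selfcheck(void)
{
    const unsigned char in[] = {0x00, 0x0f, 0xa5, 0xff};
    char a[8], b[8];

    assert(hex_encode_lookup(in, sizeof(in), a) == 8);
    assert(hex_encode_arith(in, sizeof(in), b) == 8);
    assert(memcmp(a, "000fa5ff", 8) == 0);
    assert(memcmp(a, b, 8) == 0);
}
```

Whether either loop is actually vectorized depends on the compiler and
flags.  With clang, -Rpass=loop-vectorize and -Rpass-missed=loop-vectorize
report what the vectorizer did, so any benefit would need to be measured
rather than assumed.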

-- 
nathan
