Re: [PATCH] Hex-coding optimizations using SVE on ARM. - Mailing list pgsql-hackers

From Chiranmoy.Bhattacharya@fujitsu.com
Subject Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date
Msg-id TY2PR01MB2667B294A9D2556C05BE02A0971F2@TY2PR01MB2667.jpnprd01.prod.outlook.com
In response to Re: [PATCH] Hex-coding optimizations using SVE on ARM.  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: [PATCH] Hex-coding optimizations using SVE on ARM.
List pgsql-hackers
On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> Do you mean that the auto-vectorization worked and you observed no
> performance improvement, or the auto-vectorization had no effect on the
> code generated?

With the following addition, auto-vectorization now works on Graviton 3 (m7g.4xlarge) with GCC 11.4, and the results match yours. Previously, auto-vectorization had no effect because we had missed the -march=native option.

      encode.o: CFLAGS += ${CFLAGS_VECTORIZE} -march=native

Auto-vectorization gives roughly a 30% improvement over the default build, although the SVE version remains considerably faster:

   buf  | default | auto_vec | SVE
--------+---------+----------+------
     16 |      16 |       12 |    8
     64 |      58 |       40 |    9
    256 |     223 |      152 |   18
   1024 |     934 |      613 |   54
   4096 |    3533 |     2430 |  202
  16384 |   14081 |     9831 |  800
  65536 |   56374 |    38702 | 3202

Auto-vectorization had no effect on hex_decode due to the presence of control flow.
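
To illustrate the difference, here is a simplified sketch (not the actual encode.c code; all names below are hypothetical). The encode loop does identical, branch-free work on every byte, which is the shape compilers tend to vectorize. The decode loop skips whitespace and returns early on an invalid digit, so its trip count and exit point depend on the data, which typically defeats the auto-vectorizer.

```c
#include <stddef.h>

static const char sketch_hextbl[] = "0123456789abcdef";

/* Branch-free body: each input byte always produces two output chars. */
size_t
sketch_hex_encode(const unsigned char *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = sketch_hextbl[src[i] >> 4];
        dst[2 * i + 1] = sketch_hextbl[src[i] & 0xF];
    }
    return len * 2;
}

/* Returns the value of a hex digit (0..15), or -1 if c is not one. */
int
sketch_hex_digit(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Returns the number of bytes written, or -1 on invalid input. */
long
sketch_hex_decode(const char *src, size_t len, unsigned char *dst)
{
    size_t i = 0;
    size_t out = 0;

    while (i < len)
    {
        int hi, lo;

        /* Data-dependent skip: input advances without producing output. */
        if (src[i] == ' ' || src[i] == '\n' || src[i] == '\t' || src[i] == '\r')
        {
            i++;
            continue;
        }
        if (i + 1 >= len)
            return -1;          /* odd number of digits */
        hi = sketch_hex_digit((unsigned char) src[i]);
        lo = sketch_hex_digit((unsigned char) src[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;          /* early exit on an invalid digit */
        dst[out++] = (unsigned char) ((hi << 4) | lo);
        i += 2;
    }
    return (long) out;
}
```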

-----
Here is a comment snippet from src/include/port/simd.h

"While Neon support is technically optional for aarch64, it appears that all available 64-bit hardware does have it."

Currently, it is assumed that all aarch64 machines support NEON, but for newer advanced SIMD extensions like SVE (and AVX512 on x86) this assumption may not hold. We need a runtime check to be sure. Using src/include/port/simd.h to abstract away these advanced SIMD implementations may be difficult.
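
As a rough sketch of what such a runtime check could look like on Linux (this is an assumption for illustration, not the patch's code; every sketch_* name is hypothetical), the kernel's hwcap bits can be queried with getauxval() and an implementation chosen through a function pointer on first call. The SVE function here is only a placeholder that falls back to the scalar loop; a real one would have to be compiled with SVE enabled.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>           /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>          /* HWCAP_SVE */
#endif

/* Hypothetical helper: true only if the kernel reports SVE support. */
bool
sketch_sve_available(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;               /* unknown platform: use the portable path */
#endif
}

typedef size_t (*sketch_encode_fn) (const unsigned char *src, size_t len,
                                    char *dst);

/* Portable scalar implementation. */
size_t
sketch_encode_scalar(const unsigned char *src, size_t len, char *dst)
{
    static const char hextbl[] = "0123456789abcdef";

    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = hextbl[src[i] >> 4];
        dst[2 * i + 1] = hextbl[src[i] & 0xF];
    }
    return len * 2;
}

/* Placeholder standing in for an SVE implementation (assumption). */
size_t
sketch_encode_sve(const unsigned char *src, size_t len, char *dst)
{
    return sketch_encode_scalar(src, len, dst);
}

size_t sketch_encode_choose(const unsigned char *src, size_t len, char *dst);

/* Callers go through this pointer; it initially points at the chooser. */
sketch_encode_fn sketch_encode = sketch_encode_choose;

/* The first call picks an implementation, then forwards to it. */
size_t
sketch_encode_choose(const unsigned char *src, size_t len, char *dst)
{
    sketch_encode = sketch_sve_available() ? sketch_encode_sve
                                           : sketch_encode_scalar;
    return sketch_encode(src, len, dst);
}
```

With this shape the check runs once, and every later call pays only for an indirect function call.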

We will update the thread once a solution is found.

-----
Chiranmoy
