On Fri, Sep 12, 2025 at 06:49:01PM +0000, Chiranmoy.Bhattacharya@fujitsu.com wrote:
> Using simd.h does make it easier to maintain. Is there a plan to upgrade
> simd.h to use SSE4 or SSSE3 in the future? Since SSE2 is much older, it
> lacks some of the more specialized intrinsics. For example, vectorized
> table lookup can be implemented via [0], and it’s available in SSSE3 and
> later x86 instruction sets.
There have been a couple of discussions about the possibility of requiring
x86-64-v2 for Postgres, but I'm not aware of any serious efforts in that
area.
I've attached a new version of the patch with a simd.h version of
hex_decode(). Here are the numbers:
arm
buf | HEAD | patch | % diff
-------+-------+-------+--------
16 | 22 | 23 | -5
64 | 61 | 23 | 62
256 | 158 | 47 | 70
1024 | 542 | 122 | 77
4096 | 2103 | 429 | 80
16384 | 8548 | 1673 | 80
65536 | 34663 | 6738 | 81
x86
buf | HEAD | patch | % diff
-------+-------+-------+--------
16 | 13 | 14 | -8
64 | 42 | 15 | 64
256 | 126 | 42 | 67
1024 | 461 | 149 | 68
4096 | 1802 | 576 | 68
16384 | 7166 | 2280 | 68
65536 | 28625 | 9108 | 68
A couple of notes:
* For hex_decode(), we just give up on the SIMD path and fall back on the
scalar path as soon as we see anything outside [0-9A-Za-z]. I suspect
this might introduce a regression for inputs of ~32 to ~64 bytes that
include whitespace (which must be skipped) or invalid characters, but I
don't whether those inputs are common or whether we care.
* The code makes some assumptions about endianness that might not be true
everywhere, but I've yet to dig into this further.
--
nathan