Re: refactor architecture-specific popcount code - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: refactor architecture-specific popcount code
Msg-id e67ca6d9-db0b-4322-97ca-176964b94a35@iki.fi
In response to Re: refactor architecture-specific popcount code  (John Naylor <johncnaylorls@gmail.com>)
Responses Re: refactor architecture-specific popcount code
List pgsql-hackers
On 15/01/2026 11:07, John Naylor wrote:
> 0003
> 
> s/fast/sse42/:
> 
> Seems okay in this file, but this isn't the best name, either. Maybe a
> comment to head off future "corrections", something like:
> "Technically, POPCNT is not part of SSE 4.2, and is not even a vector
> operation, but many compilers emit the popcnt instruction with
> -msse4.2 anyway."
> 
> s/slow/generic/:
> 
> I'm ambivalent about this. The "slow" designation is flat-out wrong
> since at least Power and aarch64 can emit a single instruction here
> without prodding the compiler. On the other hand, "generic" seems
> wrong too, since e.g. pg_popcount64_slow() has three configure symbols
> and two compiler builtins. :-D

"fallback" or "portable"?

> A possible future project would be to have a truly generic simple
> fallback in pure C and put all the fancy stuff in the header for
> architectures that have unconditional hardware support. It would make
> more sense to revisit the name then.

Yeah, I noticed that on x86_64, pg_popcount_optimized is always a
function pointer with runtime check, even if you use compiler flags to
target a CPU where the special instructions are available unconditionally.
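
To illustrate the distinction, here is a minimal sketch, not the actual
PostgreSQL code. All names in it (popcount64, popcount64_portable,
popcount64_choose, popcount64_ptr) are made up for the example. It assumes
GCC or Clang, which define __POPCNT__ when the target flags (e.g. -mpopcnt
or -msse4.2) guarantee the instruction. In that case the call can be
inlined directly; otherwise it goes through a function pointer that is
resolved on first use. The runtime CPU check itself is left out here, so
the chooser always installs the pure-C version.

```c
#include <stdint.h>

/* Hypothetical pure-C fallback: classic SWAR bit counting, no builtins. */
static inline int
popcount64_portable(uint64_t x)
{
    /* count bits in pairs, then nibbles, then bytes */
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333)) +
        ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    /* the multiply sums all byte counts into the top byte */
    return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

#if defined(__POPCNT__)

/*
 * The compiler flags guarantee POPCNT on the target CPU, so no runtime
 * check and no function pointer are needed.
 */
static inline int
popcount64(uint64_t x)
{
    return __builtin_popcountll(x);
}

#else

/* Otherwise, dispatch through a pointer that is resolved on first call. */
static int popcount64_choose(uint64_t x);
static int (*popcount64_ptr) (uint64_t x) = popcount64_choose;

static int
popcount64_fallback(uint64_t x)
{
    return popcount64_portable(x);
}

static int
popcount64_choose(uint64_t x)
{
    /*
     * A real implementation would test CPUID here and install a POPCNT
     * variant when available. This sketch always installs the fallback.
     */
    popcount64_ptr = popcount64_fallback;
    return popcount64_ptr(x);
}

static inline int
popcount64(uint64_t x)
{
    return popcount64_ptr(x);
}

#endif
```

Callers use popcount64() either way; only the build flags decide whether
the indirection exists.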

- Heikki



