Re: tuple radix sort - Mailing list pgsql-hackers

From John Naylor
Subject Re: tuple radix sort
Date
Msg-id CANWCAZZiMGj6QuHfyBPv9at7xn23NPAz4nit=G6fr26V+h8MKg@mail.gmail.com
Whole thread Raw
In response to Re: tuple radix sort  (Chengpeng Yan <chengpeng_yan@Outlook.com>)
Responses Re: tuple radix sort
List pgsql-hackers
On Thu, Nov 20, 2025 at 6:13 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
> I think given https://www.boost.org/LICENSE_1_0.txt you should include a
> copy of the Boost license in this comment, as well as the copyright
> statement from the hpp file,

Done.

On Sun, Nov 23, 2025 at 12:33 PM Chengpeng Yan
<chengpeng_yan@outlook.com> wrote:
> ```
> if (part.offset == part.next_offset)
> ```
>
> Since "part" is a local copy of the struct, this check might not
> reflect the latest state updated inside the loop. It might be slightly
> more efficient to check the array directly:
>
> ```
> if (partitions[idx].offset == partitions[idx].next_offset)
> ```

Done, and removed the local copy since it wasn't doing much else.

> Since we are looking for the leftmost bit difference, we could
> accumulate the differences using bitwise OR. This avoids a conditional
> branch inside the loop:
>
> ```
> common_upper_bits |= this_common_bits;
> ```

Done.

> 3. Short-circuit for identical keys (v4-0004)
>
> When calculating common_prefix, if common_upper_bits is 0, it implies
> that all non-null keys are identical (for the bits we care about). In
> this case, we might be able to skip the radix sort entirely or handle
> it as a single partition. Currently, the code handles it by passing
> "common_upper_bits | 1" to pg_leftmost_one_pos64, which is safe but
> perhaps not the most optimal path for identical keys.

Added a short-circuit.

For v5 I've also added CHECK_FOR_INTERRUPTS and rewrote some comments.


--
John Naylor
Amazon Web Services

Attachment

pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: Refactoring: Use soft error reporting for *_opt_overflow functions of date/timestamp
Next
From: atma ram
Date:
Subject: Question on PostgreSQL Table Partitioning – Performance of Queries That Do Not Use the Partition Key