Re: [PATCH] Refactor bytea_sortsupport(), take two - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PATCH] Refactor bytea_sortsupport(), take two
Date
Msg-id CANWCAZY0ndYj881r_4OW-COj=HvtawXc8hJ5OquL2ma1vYn2mg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Refactor bytea_sortsupport(), take two  (Aleksander Alekseev <aleksander@tigerdata.com>)
Responses Re: [PATCH] Refactor bytea_sortsupport(), take two
List pgsql-hackers
On Mon, Nov 17, 2025 at 5:57 PM Aleksander Alekseev
<aleksander@tigerdata.com> wrote:
> > > > - * More generally, it's okay that bytea callers can have NUL bytes in
> > > > - * strings because abbreviated cmp need not make a distinction between
> > > > - * terminating NUL bytes, and NUL bytes representing actual NULs in the
> > > > - * authoritative representation.  Hopefully a comparison at or past one
> >
> > > > Why is this gone -- AFAICT it's still true for bytea, no matter what
> > > > file it's located in?
> > >
> > > Actually for bytea_abbrev_convert() this comment makes no sense IMO.
> > > Presumably the reader is aware that '\x0000...' is a valid bytea, and
> > > there is no such thing as "terminating NUL bytes" for bytea.
> >
> > It makes perfect sense to me. The abbreviated datum has terminating
> > NUL bytes if the source length is shorter than 8 bytes. The comment is
> > explaining why comparing bytea abbreviated datums still works. It
> > seems like everything starting with "abbreviated cmp need not ..." is
> > still appropriate for bytea.c.
>
> OK, I returned the comment to bytea.c, but replaced "string" with
> "bytea". IMO mentioning "strings" is confusing in this context.
>
> I still don't like the fact that the comment is reasoning about
> "terminating NUL bytes" in the context of bytea, but I kept it as is
> for now. It seems confusing to me. Do you think we should rewrite
> this?

+ /*
+ * It's okay that callers can have NUL bytes in bytea because abbreviated
+ * cmp need not make a distinction between terminating NUL bytes, and NUL
+ * bytes representing actual NULs in the authoritative representation.

I mentioned everything starting with "abbreviated cmp need not ..."
was okay, so I agree that "It's okay that callers can have NUL bytes
in bytea" is redundant. Was that the confusing part for you? How about
this:

"Short byteas will have terminating NUL bytes in the abbreviated
datum. Abbreviated comparison need not  make a distinction between
thse NUL bytes, and NUL bytes representing actual NULs in the
authoritative representation." [...]

After that, the rest seems to flow better at a quick glance.

> > Likewise, this part of varlena.c is still factually accurate as an aside:
> >
> > - * (Actually, even if there were NUL bytes in the blob it would be
> > - * okay.  See remarks on bytea case above.)
> >
> > ...although with the above, we'll have to point to the relevant place
> > in bytea.c.
>
> Not sure if I follow. Previously we agreed that we disallow NUL bytes
> within strings, except for terminating ones. What we are doing here is
> refactoring / simplifying the code. Why keep irrelevant and confusing
> comments?

Okay, I'll concede that's it's no longer useful and maybe a distraction..

--
John Naylor
Amazon Web Services



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: How can end users know the cause of LR slot sync delays?
Next
From: Chao Li
Date:
Subject: Re: pg_waldump: support decoding of WAL inside tarfile