Re: [PATCH] Refactor bytea_sortsupport(), take two - Mailing list pgsql-hackers
| From | John Naylor |
|---|---|
| Subject | Re: [PATCH] Refactor bytea_sortsupport(), take two |
| Date | |
| Msg-id | CANWCAZY0ndYj881r_4OW-COj=HvtawXc8hJ5OquL2ma1vYn2mg@mail.gmail.com Whole thread Raw |
| In response to | Re: [PATCH] Refactor bytea_sortsupport(), take two (Aleksander Alekseev <aleksander@tigerdata.com>) |
| Responses |
Re: [PATCH] Refactor bytea_sortsupport(), take two
|
| List | pgsql-hackers |
On Mon, Nov 17, 2025 at 5:57 PM Aleksander Alekseev <aleksander@tigerdata.com> wrote: > > > > - * More generally, it's okay that bytea callers can have NUL bytes in > > > > - * strings because abbreviated cmp need not make a distinction between > > > > - * terminating NUL bytes, and NUL bytes representing actual NULs in the > > > > - * authoritative representation. Hopefully a comparison at or past one > > > > > > Why is this gone -- AFAICT it's still true for bytea, no matter what > > > > file it's located in? > > > > > > Actually for bytea_abbrev_convert() this comment makes no sense IMO. > > > Presumably the reader is aware that '\x0000...' is a valid bytea, and > > > there is no such thing as "terminating NUL bytes" for bytea. > > > > It makes perfect sense to me. The abbreviated datum has terminating > > NUL bytes if the source length is shorter than 8 bytes. The comment is > > explaining why comparing bytea abbreviated datums still works. It > > seems like everything starting with "abbreviated cmp need not ..." is > > still appropriate for bytea.c. > > OK, I returned the comment to bytea.c, but replaced "string" with > "bytea". IMO mentioning "strings" is confusing in this context. > > I still don't like the fact that the comment is reasoning about > "terminating NUL bytes" in the context of bytea, but I kept it as is > for now. It seems confusing to me. Do you think we should rewrite > this? + /* + * It's okay that callers can have NUL bytes in bytea because abbreviated + * cmp need not make a distinction between terminating NUL bytes, and NUL + * bytes representing actual NULs in the authoritative representation. I mentioned everything starting with "abbreviated cmp need not ..." was okay, so I agree that "It's okay that callers can have NUL bytes in bytea" is redundant. Was that the confusing part for you? How about this: "Short byteas will have terminating NUL bytes in the abbreviated datum. Abbreviated comparison need not make a distinction between thse NUL bytes, and NUL bytes representing actual NULs in the authoritative representation." [...] After that, the rest seems to flow better at a quick glance. > > Likewise, this part of varlena.c is still factually accurate as an aside: > > > > - * (Actually, even if there were NUL bytes in the blob it would be > > - * okay. See remarks on bytea case above.) > > > > ...although with the above, we'll have to point to the relevant place > > in bytea.c. > > Not sure if I follow. Previously we agreed that we disallow NUL bytes > within strings, except for terminating ones. What we are doing here is > refactoring / simplifying the code. Why keep irrelevant and confusing > comments? Okay, I'll concede that's it's no longer useful and maybe a distraction.. -- John Naylor Amazon Web Services
pgsql-hackers by date: