Re: UUID v7 - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: UUID v7 |
Date | |
Msg-id | 10553e4c-6b66-44b9-86a7-80bb1958a767@eisentraut.org Whole thread Raw |
In response to | Re: UUID v7 ("Andrey M. Borodin" <x4mmm@yandex-team.ru>) |
Responses |
Re: UUID v7
|
List | pgsql-hackers |
On 30.01.24 14:35, Andrey M. Borodin wrote: >> On 30 Jan 2024, at 15:33, Junwang Zhao <zhjwpku@gmail.com> wrote: >> >> It's always good to add a newline at the end of a source file, though >> this might be nitpicky. > > Thanks, also fixed warning found by CFBot. I have various comments on this patch: - doc/src/sgml/func.sgml The documentation of the new functions should be broken up a bit. It's all one paragraph now. At least make it several paragraphs, or possibly tables or something else. Avoid listing the functions twice: Once before the description and then again in the description. That's just going to get out of date. The first listing is not necessary, I think. The return values in the documentation should use the public-facing type names, like "timestamp with time zone" and "smallint". The descriptions of the UUID generation functions use handwavy language in their descriptions, like "It provides much better data locality" or "unacceptable security or business intelligence implications", which isn't useful. Either we cut that all out and just say, it creates a UUIDv7, done, look elsewhere for more information, or we provide some more concretely useful details. We shouldn't label a link as "IETF standard" when it's actually a draft. - src/include/catalog/pg_proc.dat The description of uuidv4 should be "generate UUID version 4", so that it parallels uuidv7. The description of uuid_extract_time says 'extract timestamp from UUID version 7', the implementation is not limited to version 7. I think uuid_extract_time should be named uuid_extract_timestamp, because it extracts a timestamp, not a time. The functions uuid_extract_ver and uuid_extract_var could be named uuid_extract_version and uuid_extract_variant. Otherwise, it's hard to tell them apart, with only one letter different. - src/test/regress/sql/uuid.sql Why are the tests using the input format '{...}', which is not the standard one? - src/backend/utils/adt/uuid.c All this new code should have more comments. There is a lot of bit twiddling going on, and I suppose one is expected to follow along in the RFC? At least each function should have a header comment, so one doesn't have to check in pg_proc.dat what it's supposed to do. I'm suspicious that these functions all appear to return null for erroneous input, rather than raising errors. I think at least some explanation for this should be recorded somewhere. I think the behavior of uuid_extract_var(iant) is wrong. The code takes just two bits to return, but the draft document is quite clear that the variant is 4 bits (see Table 1). The uuidv7 function could really use a header comment that explains the choices that were made. The RFC draft provides various options that implementations could use; we should describe which ones we chose. I would have expected that, since gettimeofday() provides microsecond precision, we'd put the extra precision into "rand_a" as per Section 6.2 method 3. You use some kind of counter, but could you explain which method that counter implements? I don't see any acknowledgment of issues relating to concurrency or restarts. Like, how do we prevent duplicates being generated by concurrent sessions or between restarts? Maybe the counter or random stuff does that, but it's not explained.
pgsql-hackers by date: