On Thu, 2024-10-24 at 10:05 +0200, Andreas Karlsson wrote:
> Why is there no pg_locale_builtin.c?
Just that it would be a fairly small file, but I'm fine with doing
that.
> I think adding an assert to create_pg_locale() which enforces valid
> there is always a combination of ctype_is_c and casemap would be
> good,
> similar to the collate field.
Good idea.
> Why are casemap and ctype_methods not the same struct? They seem very
> closely related.
The code impact was in fairly different places, so it seemed like a
nice way to break it out. I could combine them, but it would be a
fairly large patch.
> This commit makes me tempted to handle the ctype_is_c logic for
> character classes also in callbacks and remove the if in functions
> like
> pg_wc_ispunct(). But this si something that would need to be
> benchmarked.
That's a good idea. The reason collate_is_c is important is because
there are quite a few caller-specific optimizations, but that doesn't
seem to be true of ctype_is_c.
> I wonder if the bitmask idea isn't terrible for the branch predictor
> and
> that me may want one function per character class, but this is yet
> again
> something we need to benchmark.
Agreed -- a lot of work has gone into optimizing the regex code, and we
don't want a perf regression there. But I'm also not sure exactly which
kinds of tests I should be running for that.
> Is there a reason we allocate the icu_provider in
> create_pg_locale_icu
> with MemoryContextAllocZero when we intialize everything anyway? And
> similar for other providers.
Allocating and zeroing is a good defense against new optional methods
and fields which can safely default to zero.
> = v6-0011-Introduce-hooks-for-creating-custom-pg_locale_t.patch
>
> Looks good but seems like a quite painful API to use.
How is it painful and can we make it better?
> > * Have a CREATE LOCALE PROVIDER command and make "provider" an Oid
> > rather than a char ('b'/'i'/'c'). The v6 patches brings us close to
> > this point, but I'm not sure if we want to go this far in v18.
>
> Probably necessary but I hate all the DDL commands the way to SQL
> standard is written forces us to add.
There is some precedent for a DDL-like thing without new grammar:
pg_replication_origin_create(). I don't have a strong opinion on
whether to do that or not.
>
Regards,
Jeff Davis