Re: Remaining dependency on setlocale() - Mailing list pgsql-hackers

From Chao Li
Subject Re: Remaining dependency on setlocale()
Date
Msg-id 6DDB0693-29E0-41B6-AE63-F0723F7DA8CD@gmail.com
Whole thread Raw
In response to Re: Remaining dependency on setlocale()  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers

> On Nov 26, 2025, at 09:50, Chao Li <li.evan.chao@gmail.com> wrote:
>
> I will review the rest 3 commits tomorrow.

10 - 0009
```
 {
     if (isalpha((unsigned char) c))
     {
-        c = toupper((unsigned char) c);
+        c = pg_ascii_toupper((unsigned char) c);
```

Just curious. As isaplha() and toupper() come from the same header file ctype.h, if we replace toupper with
pg_ascii_toupper,does isapha also need to be handled? 

11 - 0010
```
-    for (i = 0; i < len; i++)
-    {
-        unsigned char ch = (unsigned char) ident[i];
+    dstsize = len + 1;
+    result = palloc(dstsize);

-        if (ch >= 'A' && ch <= 'Z')
-            ch += 'a' - 'A';
-        else if (enc_is_single_byte && IS_HIGHBIT_SET(ch) && isupper(ch))
-            ch = tolower(ch);
-        result[i] = (char) ch;
-    }
-    result[i] = '\0';
+    needed = pg_strfold_ident(result, dstsize, ident, len);
+    Assert(needed + 1 == dstsize);
+    Assert(needed == len);
```

I think assert both dstsize and len are redundant, because dstsize=len+1, and no place to change their values.

12 - 0010
```
+/*
+ * Fold an identifier using the database default locale.
+ *
+ * For historical reasons, does not use ordinary locale behavior. Should only
+ * be used for identifier folding. XXX: can we make this equivalent to
+ * pg_strfold(..., default_locale)?
+ */
+size_t
+pg_strfold_ident(char *dest, size_t destsize, const char *src, ssize_t srclen)
+{
+    if (default_locale == NULL || default_locale->ctype == NULL)
+    {
+        int            i;
+
+        for (i = 0; i < srclen && i < destsize; i++)
+        {
+            unsigned char ch = (unsigned char) src[i];
+
+            if (ch >= 'A' && ch <= 'Z')
+                ch += 'a' - 'A';
+            dest[i] = (char) ch;
+        }
+
+        if (i < destsize)
+            dest[i] = '\0';
+
+        return srclen;
+    }
+    return default_locale->ctype->strfold_ident(dest, destsize, src, srclen,
+                                                default_locale);
+}
```

Given default_local can be NULL only at some specific moment, can we do something like

Local = default_local;
If (local == NULL || local->ctype == NULL)
  Local = libc or other fallback;
Return default_locale->ctype->strfold_ident(dest, destsize, src, srclen, local);

This way avoids the duplicate code.

13 - 0011
```
+{ name => 'lc_collate', type => 'string', context => 'PGC_SUSET', group => 'CLIENT_CONN_LOCALE',
+  short_desc => 'Sets the locale for text ordering in extensions.',
```

I just feel the GUC name is very misleading. Without carefully reading the doc, users may very easy to consider
lc_collatethe system’s locale. If it only affects extensions, then let’s name it accordingly, for example,
“extension_lc_collate”,or “legacy_lc_collate”. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Support tid range scan in parallel?
Next
From: Tatsuo Ishii
Date:
Subject: Re: Row pattern recognition