Re: Collation with upper and numeric comparing in unexpected way - Mailing list pgsql-general

From Peter Eisentraut
Subject Re: Collation with upper and numeric comparing in unexpected way
Date
Msg-id 03f1450c-a730-40af-890c-c269db74a8ec@eisentraut.org
In response to Collation with upper and numeric comparing in unexpected way  (Matt Magoffin <postgresql.org@msqr.us>)
Responses Re: Collation with upper and numeric comparing in unexpected way
List pgsql-general
On 20.01.26 19:36, Matt Magoffin wrote:
> I am using Postgres 17 and trying to configure a collation that sorts upper case before lower case and includes
> numeric sorting:
> 
> CREATE COLLATION testsort (provider = icu, locale = 'und-u-kf-upper-kn');
> 
> These comparisons are working as I expected:
> 
> SELECT 'id-45' < 'id-123' COLLATE testsort; -- true (45 before 123)
> 
> SELECT 'id' < 'ID' COLLATE testsort; -- false (upper case before lower case)
> 
> However combining them resulted in an unexpected result:
> 
> SELECT 'id-45' < 'ID-123' COLLATE testsort; -- true
> 
> I thought that last one would be false because “ID” would come before “id”. Is there a way to configure the collation
> to achieve that? I’m trying to match the sorting behaviour in external application code.
 

I suspect that this is because the effect of the numeric sorting is a 
primary difference and the case difference is only a tertiary difference.

In other words, imagine the numeric sorting pass replacing all numbers 
by hypothetical letters corresponding to the numeric order, like

'id-45' -> 'id-X'
'id-123' -> 'id-Z'
'ID-123' -> 'ID-Z'

Then you would have

'id-45' < 'ID-123' =>
'id-X' < 'ID-Z'

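which would be true: the primary difference (X sorts before Z) decides 
the comparison, so the tertiary case difference between 'id' and 'ID' 
is never consulted.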
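
One way to probe this guess is to compare strings that differ only in case against strings that also differ in their number. The queries below are a sketch that assumes the testsort collation from the quoted message already exists; only the result marked as reported comes from this thread, the others are left for you to run.

```sql
-- Assumes the collation from the quoted message has been created:
--   CREATE COLLATION testsort (provider = icu, locale = 'und-u-kf-upper-kn');

-- Same number on both sides: there is no numeric difference, so case
-- should be the deciding factor, as in the 'id' < 'ID' example.
SELECT 'id-123' < 'ID-123' COLLATE testsort;

-- Different numbers: if the numeric difference outranks case, the result
-- should follow the numbers alone, whichever side is upper case.
SELECT 'id-45'  < 'ID-123' COLLATE testsort;  -- reported as true in the quoted message
SELECT 'ID-45'  < 'id-123' COLLATE testsort;
SELECT 'ID-123' < 'id-45'  COLLATE testsort;
```

If the last three results depend only on 45 versus 123, that is consistent with the numeric difference being compared before case.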

This is just my guess from the outside.  The numeric sorting is not a 
part of the Unicode Collation Algorithm standard; it is an extension by 
ICU, so one would have to dig into the code or documentation there, but 
I didn't find anything.

I don't know if there is a way to customize this further to get the 
effect you want.  Maybe you could reach out to an ICU support forum to 
get more expert insights there.



