Re: Sigh, LIKE indexing is *still* broken in foreign locales - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Sigh, LIKE indexing is *still* broken in foreign locales
Msg-id 15579.960473051@sss.pgh.pa.us
In response to Re: [BUGS] Re: Sigh, LIKE indexing is *still* broken in foreign locales  (Giles Lean <giles@nemeton.com.au>)
Responses Re: Sigh, LIKE indexing is *still* broken in foreign locales
List pgsql-hackers
Giles Lean <giles@nemeton.com.au> writes:
> On Thu, 8 Jun 2000 08:53:25 +0200  "Matthias Urlichs" wrote:

>> To find the position in the index where it should start scanning.

> Hmm.  That I guess is faster than locating the prefix given to LIKE in
> the index and scanning back as well as forward.

Wouldn't help.  The reason why we need both an upper and lower bound
is to know where to stop scanning as well as where to start.  "Scan
outward from the middle" doesn't tell you when you can stop.

The bounds do not have to be perfectly tight, in the sense of being
the least string >= or largest string <= the desired strings.  It's
OK if we scan a few extra tuples in some cases.  But we have to have
reasonably close bounds or we can't implement LIKE with an index.
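
As a rough illustration of the two-bound idea (a sketch, not PostgreSQL's actual code), the snippet below assumes plain code-point ordering, i.e. the C-locale case where the trick is safe. The names `prefix_bounds` and `like_prefix_scan` are hypothetical, and a sorted Python list stands in for the index.

```python
from bisect import bisect_left

MAX_CP = 0x10FFFF  # largest code point a Python str character can hold


def prefix_bounds(prefix):
    """Return (lower, upper) with lower <= s < upper for every string s
    that starts with prefix, assuming plain code-point ordering.
    upper is None when no finite upper bound can be built."""
    # Drop trailing characters that cannot be incremented.
    p = prefix
    while p and ord(p[-1]) == MAX_CP:
        p = p[:-1]
    if not p:
        return prefix, None
    # Bumping the last character yields a string that sorts after
    # everything beginning with the prefix.
    upper = p[:-1] + chr(ord(p[-1]) + 1)
    return prefix, upper


def like_prefix_scan(sorted_keys, prefix):
    """Emulate an index range scan for LIKE 'prefix%' (hypothetical helper)."""
    lower, upper = prefix_bounds(prefix)
    start = bisect_left(sorted_keys, lower)  # where to start scanning
    stop = len(sorted_keys) if upper is None else bisect_left(sorted_keys, upper)  # where to stop
    # The bounds need not be tight, so recheck each candidate.
    return [k for k in sorted_keys[start:stop] if k.startswith(prefix)]


index = sorted(["aa", "ab", "abc", "abcd", "abd", "abz", "b"])
matches = like_prefix_scan(index, "abc")
```

The recheck in the last step is what lets the bounds be loose: a sloppy bound only costs a few extra tuples, while a missing bound means the scan has no stopping point.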

>> Personally, I am in the "store everything on the server in Unicode"
>> camp. Let the parser convert everything to Unicode on the way in, 
>> and vice versa.

AFAIK, none of our server-side charset encodings are stateful --- and
I for one will argue that we must never accept any, for precisely the
sort of problem being discussed here.  (If a client wants to use such
a brain-dead encoding, that's not our problem...)

However, the problem at hand has little to do with encodings.  I think
it's more a matter of understanding the possible variations of
context-sensitive collation orders.
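
To make the collation concern concrete, here is a toy example. The `toy_key` function is an invented collation, loosely modeled on rules that sort the digraph "ch" as a single letter between "c" and "d"; it is an assumption for illustration, not any real locale's definition. Under it, the usual "increment the last character" upper bound lands before the lower bound, so a range scan finds nothing even though a match exists.

```python
from bisect import bisect_left


def toy_key(s):
    """Invented sort key: treat "ch" as one letter sorting after every
    other "c..." sequence and before "d" (illustrative assumption)."""
    return s.replace("ch", "c\U0010FFFF")


# A sorted list stands in for an index built under the toy collation.
words = sorted(["cena", "chico", "ciudad", "cuzco", "dama"], key=toy_key)
keys = [toy_key(w) for w in words]

# Bounds for LIKE 'ch%' as they would be derived under code-point order.
lower, upper = "ch", "ci"
start = bisect_left(keys, toy_key(lower))
stop = bisect_left(keys, toy_key(upper))

# Under the toy collation "ci" sorts before "ch", so the range is empty.
naive_hits = words[start:stop] if start < stop else []

# What LIKE 'ch%' should actually return.
expected = [w for w in words if w.startswith("ch")]
```

Here `naive_hits` comes out empty while `expected` contains "chico": the bounds are computed one character at a time, but the collation compares a two-character unit.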
        regards, tom lane

