Thread: like performance w/o wildcards.

like performance w/o wildcards.

From

Joseph Shraibman

Date:

03 August 2003, 17:16:44

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='.  Is this an easy change to make?

Re: like performance w/o wildcards.

From

Tom Lane

Date:

03 August 2003, 19:28:26

Joseph Shraibman <jks@selectacast.net> writes:
> I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
> and is in reality an '='.  Is this an easy change to make?

On what do you base that conclusion?

regression=# create table t1 (f1 text unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index 't1_f1_key' for table 't1'
CREATE TABLE
regression=# explain select * from t1 where f1 like 'foo';
                             QUERY PLAN
---------------------------------------------------------------------
 Index Scan using t1_f1_key on t1  (cost=0.00..4.82 rows=1 width=32)
   Index Cond: (f1 = 'foo'::text)
   Filter: (f1 ~~ 'foo'::text)
(3 rows)

regression=#

            regards, tom lane

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 00:29:47

Hmm.  I didn't work for me.  I'll try and figure this out.

Tom Lane wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
>
>>I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
>>and is in reality an '='.  Is this an easy change to make?
>
>
> On what do you base that conclusion?
>
> regression=# create table t1 (f1 text unique);
> NOTICE:  CREATE TABLE / UNIQUE will create implicit index 't1_f1_key' for table 't1'
> CREATE TABLE
> regression=# explain select * from t1 where f1 like 'foo';
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Index Scan using t1_f1_key on t1  (cost=0.00..4.82 rows=1 width=32)
>    Index Cond: (f1 = 'foo'::text)
>    Filter: (f1 ~~ 'foo'::text)
> (3 rows)
>
> regression=#
>
>             regards, tom lane

Re: like performance w/o wildcards.

From

Richard Huxton

Date:

04 August 2003, 05:26:01

On Monday 04 August 2003 04:29, Joseph Shraibman wrote:
> Hmm.  I didn't work for me.  I'll try and figure this out.
>
> Tom Lane wrote:
> > Joseph Shraibman <jks@selectacast.net> writes:
> >>I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
> >>and is in reality an '='.  Is this an easy change to make?

Check your locale/encoding - you probably want "C" or similar for LIKE to use
an index.
--
  Richard Huxton
  Archonet Ltd

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 14:00:34

Richard Huxton wrote:
> On Monday 04 August 2003 04:29, Joseph Shraibman wrote:
>
>>Hmm.  I didn't work for me.  I'll try and figure this out.
>>
>>Tom Lane wrote:
>>
>>>Joseph Shraibman <jks@selectacast.net> writes:
>>>
>>>>I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
>>>>and is in reality an '='.  Is this an easy change to make?
>
>
> Check your locale/encoding - you probably want "C" or similar for LIKE to use
> an index.

I'm using latin1.  Why should it make a difference?

Re: like performance w/o wildcards.

From

Bruce Momjian

Date:

04 August 2003, 14:01:44

Well, I don't see that = would be significantly faster than LIKE with a
no-wildcard string, so I don't see the value in having LIKE detect
non-whildcard strings.

---------------------------------------------------------------------------

Joseph Shraibman wrote:
> I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
> and is in reality an '='.  Is this an easy change to make?
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 14:09:31

Bruce Momjian wrote:
> Well, I don't see that = would be significantly faster than LIKE with a
> no-wildcard string, so I don't see the value in having LIKE detect
> non-whildcard strings.
>
Because it won't use an index for a LIKE, only a seqscan.

Re: like performance w/o wildcards.

From

Bruce Momjian

Date:

04 August 2003, 14:11:21

Joseph Shraibman wrote:
> Bruce Momjian wrote:
> > Well, I don't see that = would be significantly faster than LIKE with a
> > no-wildcard string, so I don't see the value in having LIKE detect
> > non-whildcard strings.
> >
> Because it won't use an index for a LIKE, only a seqscan.

Oh, that's interesting.  I think a LIKE will already use an index,
except for non-C locales.  Is that the issue?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: like performance w/o wildcards.

From

Tom Lane

Date:

04 August 2003, 15:18:57

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, I don't see that = would be significantly faster than LIKE with a
> no-wildcard string, so I don't see the value in having LIKE detect
> non-whildcard strings.

You forgot about indexing.

            regards, tom lane

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 19:26:13

Bruce Momjian wrote:
> Joseph Shraibman wrote:
>
>>Bruce Momjian wrote:
>>
>>>Joseph Shraibman wrote:
>>>
>>>
>>>>Bruce Momjian wrote:
>>>>
>>>>
>>>>
>>>>>Oh, that's interesting.  I think a LIKE will already use an index,
>>>>>except for non-C locales.  Is that the issue?
>>>>>
>>>>
>>>>Yes, I'm using latin1.  Why does an index only work on C?
>>>
>>>
>>>Because we can't determine what caracters are before/after a given
>>>character to do a restriction, e.g. col LIKE 'F*' add col >= 'F' AND col
>>>< 'G'.  In non-C, we don't know the ordering.
>>>
>>
>>Then what is the index for?
>
>
> The index is for non-LIKE comparisons, like = and >.
>
My point is what is the difference?  If the index is done char by char then LIKE ought to
be able to use the index to find 'F*', no matter what the locale is.  What can't you
figure out what is before/after a given char?  Don't you have that information?  Don't you
need it to create the index in the first place?

Re: like performance w/o wildcards.

From

Tom Lane

Date:

04 August 2003, 19:35:57

Joseph Shraibman <jks@selectacast.net> writes:
>> The index is for non-LIKE comparisons, like = and >.
>>
> My point is what is the difference?  If the index is done char by char then LIKE ought to
> be able to use the index to find 'F*', no matter what the locale is.  What can't you
> figure out what is before/after a given char?  Don't you have that information?  Don't you
> need it to create the index in the first place?

<yawn>  Read the archives.  We would love to use locale-sorted indexes
for LIKE, but we *can't*.  There are too many bizarre sorting rules.
(Hint: almost no locale does its sorting purely "char by char".)

            regards, tom lane

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 20:21:14

Tom Lane wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
>
>>>The index is for non-LIKE comparisons, like = and >.
>>>
>>
>>My point is what is the difference?  If the index is done char by char then LIKE ought to
>>be able to use the index to find 'F*', no matter what the locale is.  What can't you
>>figure out what is before/after a given char?  Don't you have that information?  Don't you
>>need it to create the index in the first place?
>
>
> <yawn>  Read the archives.  We would love to use locale-sorted indexes
> for LIKE, but we *can't*.  There are too many bizarre sorting rules.
> (Hint: almost no locale does its sorting purely "char by char".)
>
>             regards, tom lane

What percentage of locales have this problem?  Does latin1 have this problem?

And what about my original idea, can LIKE be turned into an = when there are no wildcards?

Re: like performance w/o wildcards.

From

Tom Lane

Date:

04 August 2003, 21:06:06

Joseph Shraibman <jks@selectacast.net> writes:
> What percentage of locales have this problem?  Does latin1 have this problem?

Latin1 is an encoding, not a locale.  To a first approximation, I'd say
*all* non-C locales have some kind of sorting funny business.

> And what about my original idea, can LIKE be turned into an = when there are no wildcards?

It does ... if the index-conversion optimization is enabled at all.  Not
sure it's worth special-casing the '=' case otherwise.

            regards, tom lane

Re: like performance w/o wildcards.

From

Joseph Shraibman

Date:

04 August 2003, 21:49:03

Tom Lane wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
>
>>What percentage of locales have this problem?  Does latin1 have this problem?
>
>
> Latin1 is an encoding, not a locale.  To a first approximation, I'd say
> *all* non-C locales have some kind of sorting funny business.
>
OK this clears things up a bit.  The locale on my production server (redhat) is set to
en_US, which explains why LIKE doesn't use an index.  Do I just have to reset the locale
environment variable and restart postgres?  What might the side effects of that be?


>
>>And what about my original idea, can LIKE be turned into an = when there are no wildcards?
>
>
> It does ... if the index-conversion optimization is enabled at all.

Sorry, what is 'index-conversion optimization' and when is it enabled?

Re: like performance w/o wildcards.

From

"scott.marlowe"

Date:

05 August 2003, 12:00:03

On Mon, 4 Aug 2003, Joseph Shraibman wrote:

> Tom Lane wrote:
> > Joseph Shraibman <jks@selectacast.net> writes:
> >
> >>What percentage of locales have this problem?  Does latin1 have this problem?
> >
> >
> > Latin1 is an encoding, not a locale.  To a first approximation, I'd say
> > *all* non-C locales have some kind of sorting funny business.
> >
> OK this clears things up a bit.  The locale on my production server (redhat) is set to
> en_US, which explains why LIKE doesn't use an index.  Do I just have to reset the locale
> environment variable and restart postgres?  What might the side effects of that be?

Nope, changing locales involves dumping reinitting and restoring.  Sorry.
:(

> >>And what about my original idea, can LIKE be turned into an = when there are no wildcards?
> >
> >
> > It does ... if the index-conversion optimization is enabled at all.
>
> Sorry, what is 'index-conversion optimization' and when is it enabled?

I don't know what that is either.  Tom?

Re: like performance w/o wildcards.

From

Tom Lane

Date:

05 August 2003, 12:22:27

"scott.marlowe" <scott.marlowe@ihs.com> writes:
>>> It does ... if the index-conversion optimization is enabled at all.
>>
>> Sorry, what is 'index-conversion optimization' and when is it enabled?

> I don't know what that is either.  Tom?

Sorry, I just meant the code that tries to extract indexable conditions
from a LIKE clause.  With a fixed-prefix pattern, eg "x LIKE 'foo%'",
you get a range condition like "x >= 'foo' AND x < 'fop'".  For a
completely fixed pattern this degenerates to "x = 'foo'".  The whole
thing turns off in non-C locales though.

            regards, tom lane