Thread: Is it possible to persuade the query planner that ~* operations are grossly expensive?
From: Nick Howden
Date:
Is it possible to persuade the query planner that ~* operations are grossly expensive, or at least much more expensive than some other operations? My reason is that I have a database on which we do a lot of queries of the form:

    SELECT *
    FROM primary_table,
         secondary_table AS t1,
         secondary_table AS t2
    WHERE primary_table.id = t1.id
      AND primary_table.id = t2.id
      AND t1.col1 = 1
      AND t2.col1 = 2
      AND primary_table.blob ~* 'wibble';

Unfortunately the query optimiser always tries to apply the ~* operator before it has applied the other (more selective) conditions, and over 100,000+ rows this is very slow.

So, can I force the query optimizer to delay doing the ~* until after the selective joins?

--
Nick Howden - Senior IT Analyst
QinetiQ Trusted Information Management
Woodward Building, Room B009
Malvern Technology Park, WR14 3PS
Telephone 01684 895566, Fax 896660
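[Editor's note: one workaround sketch, not from this thread. In PostgreSQL releases that have the join_collapse_limit setting, explicit JOIN syntax can be used as a join-order constraint, so the selective joins run before the ~* filter is evaluated. Table and column names below are the hypothetical ones from the query above.]

    -- Sketch: with join_collapse_limit = 1, the planner joins tables in
    -- the order they are written, so t1 and t2 restrict the rows before
    -- the ~* condition is applied as a filter on the survivors.
    SET join_collapse_limit = 1;

    SELECT *
    FROM (secondary_table AS t1
          JOIN primary_table ON primary_table.id = t1.id)
         JOIN secondary_table AS t2 ON primary_table.id = t2.id
    WHERE t1.col1 = 1
      AND t2.col1 = 2
      AND primary_table.blob ~* 'wibble';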
Re: Is it possible to persuade the query planner that ~* operations are grossly expensive?
From: Tom Lane
Date:
Nick Howden <n.howden@eris.qinetiq.com> writes:
> Is it possible to persuade the query planner that ~* operations are
> grossly expensive - or at least much more expensive than some other
> operations.

I think you're trying to micro-optimize in the wrong place.

> My reason is that I have a database on which we do a lot of queries of
> the form
> SELECT * FROM primary_table, secondary_table AS t1, secondary_table AS t2
> WHERE primary_table.id = t1.id AND primary_table.id = t2.id
> AND t1.col1 = 1 AND t2.col1 = 2 AND primary_table.blob ~* 'wibble';

I get perfectly reasonable-looking plans for a comparable query structure
in the regression database:

    regression=# explain select * from tenk1 a, tenk1 b where
    regression-# a.unique1 = b.unique1 and b.unique2 = 42
    regression-# and a.stringu1 like '%z%';
                                          QUERY PLAN
    -------------------------------------------------------------------------------------
     Nested Loop  (cost=0.00..9.04 rows=1 width=488)
       ->  Index Scan using tenk1_unique2 on tenk1 b  (cost=0.00..3.01 rows=1 width=244)
             Index Cond: (unique2 = 42)
       ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..6.01 rows=1 width=244)
             Index Cond: (a.unique1 = "outer".unique1)
             Filter: (stringu1 ~~ '%z%'::text)
    (6 rows)

I don't see much to improve on there in terms of reducing the number of
executions of the LIKE operator.

How about you show us your schema (column types, index definitions) and
your EXPLAIN ANALYZE output, rather than jumping to conclusions about the
form of a solution to your problem?

        regards, tom lane
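[Editor's note: a sketch of how the information Tom asks for could be gathered in psql, using the hypothetical table names from the original query.]

    -- Column types and index definitions (psql meta-commands):
    \d primary_table
    \d secondary_table

    -- The actual plan, with real row counts and timings per node:
    EXPLAIN ANALYZE
    SELECT *
    FROM primary_table, secondary_table AS t1, secondary_table AS t2
    WHERE primary_table.id = t1.id
      AND primary_table.id = t2.id
      AND t1.col1 = 1
      AND t2.col1 = 2
      AND primary_table.blob ~* 'wibble';

EXPLAIN ANALYZE executes the query, so its "actual rows" figures show where the planner's row estimates diverge from reality.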