Re: pg_trgm upgrade to 1.6 led to load average increase - Mailing list pgsql-general

From Tom Lane
Subject Re: pg_trgm upgrade to 1.6 led to load average increase
Date
Msg-id 2770956.1769109147@sss.pgh.pa.us
In response to Re: pg_trgm upgrade to 1.6 led to load average increase  (Nicolas Seinlet <nicolas@seinlet.com>)
List pgsql-general
Nicolas Seinlet <nicolas@seinlet.com> writes:
> With pg_trgm=1.5:
> QUERY PLAN
> -----------
>  Index Scan using ir_model_data_module_name_uniq_index on ir_model_data  (cost=0.56..2.58 rows=1 width=17) (actual time=0.027..0.028 rows=1 loops=1)
>    Index Cond: (((module)::text = 'base'::text) AND ((name)::text = 'public_user'::text))
>  Planning Time: 0.093 ms
>  Execution Time: 0.050 ms
> (4 rows)

> With pg_trgm=1.6:
> QUERY PLAN
> ----------
>  Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..2.44 rows=1 width=17) (actual time=8403.936..9847.983 rows=1 loops=1)
>    Index Cond: ((name)::text = 'public_user'::text)
>    Rows Removed by Index Recheck: 10
>    Filter: ((module)::text = 'base'::text)
>  Planning Time: 1.564 ms
>  Execution Time: 9848.027 ms
> (6 rows)

Hmm, so the cost estimates are nearly the same for both index types.
I can reproduce that locally if the gist-indexed column is estimated
to have only one match; then the additional index condition isn't
believed to improve the selectivity any, so the planner has no
motivation to prefer using the two-column index.

What I don't reproduce here is the huge differential in actual cost.
The gist index is a bit slower for me, but not five orders of
magnitude slower.  However, my test table isn't huge --- about
80K rows --- and I suppose from these numbers that yours must be
far larger.

So I'm thinking that the default signature length for gist_trgm_ops
isn't large enough for your data set and what you want to do is
make it bigger:

CREATE INDEX ... USING GIST (name gist_trgm_ops(siglen = X));

The default value is 12 bytes, so maybe try 24 or 32.  That will
result in an index that's physically larger, but with luck
less of it will need to be scanned for any one query.
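Concretely, that might look like the following sketch (the index name is taken from the plan above; siglen = 32 and the use of CONCURRENTLY are illustrative choices to adapt to your setup):

```sql
-- Rebuild the trigram GiST index with a wider signature (siglen is in bytes;
-- the gist_trgm_ops default is 12).  CONCURRENTLY avoids blocking writes.
DROP INDEX CONCURRENTLY ir_model_data_name_idx2;
CREATE INDEX CONCURRENTLY ir_model_data_name_idx2
    ON ir_model_data USING GIST (name gist_trgm_ops(siglen = 32));

-- Then re-check the plan: "Rows Removed by Index Recheck" and the
-- actual time should both come down if the signature was too small.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
  FROM ir_model_data
 WHERE module = 'base' AND name = 'public_user';
```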

            regards, tom lane


