Re: Bad row estimates - Mailing list pgsql-performance
From | Alex Adriaanse |
---|---|
Subject | Re: Bad row estimates |
Date | |
Msg-id | 440F1115.4020400@innovacomputing.com Whole thread Raw |
In response to | Re: Bad row estimates (Greg Stark <gsstark@mit.edu>) |
Responses |
Re: Bad row estimates
|
List | pgsql-performance |
Thank you all for your valuable input. I have tried creating a partial index, a GIST index, and a GIST + partial index, as suggested, but it does not seem to make a significant difference. For instance: CREATE INDEX test_table_1_interval_idx ON test_table_1 USING GIST (box(point(start_ts::abstime::integer, start_ts::abstime::integer), point(end_ts::abstime::integer, end_ts::abstime::integer))) WHERE id = g_id; ANALYZE test_table_1; EXPLAIN ANALYZE SELECT count(*) FROM test_table_1 INNER JOIN test_table_2 ON (test_table_2.s_id=13300613 AND test_table_1.id = test_table_2.n_id) WHERE box(point(start_ts::abstime::integer, start_ts::abstime::integer), point(end_ts::abstime::integer, end_ts::abstime::integer)) ~ box(point(now()::abstime::integer,now()::abstime::integer),point(now()::abstime::integer,now()::abstime::integer)) AND test_table_1.id = test_table_1.g_id; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=15.09..15.10 rows=1 width=0) (actual time=69.771..69.772 rows=1 loops=1) -> Nested Loop (cost=9.06..15.08 rows=1 width=0) (actual time=69.752..69.752 rows=0 loops=1) -> Index Scan using test_table_1_interval_idx on test_table_1 (cost=0.07..4.07 rows=1 width=22) (actual time=2.930..3.607rows=135 loops=1) Index Cond: (box(point((((start_ts)::abstime)::integer)::double precision, (((start_ts)::abstime)::integer)::doubleprecision), point((((end_ts)::abstime)::integer)::double precision, (((end_ts)::abstime)::integer)::doubleprecision)) ~ box(point((((now())::abstime)::integer)::double precision, (((now())::abstime)::integer)::doubleprecision), point((((now())::abstime)::integer)::double precision, (((now())::abstime)::integer)::doubleprecision))) -> Bitmap Heap Scan on test_table_2 (cost=8.99..11.00 rows=1 width=12) (actual time=0.486..0.486 rows=0 loops=135) Recheck Cond: ((test_table_2.s_id = 13300613::numeric) AND ("outer".id = test_table_2.n_id)) -> BitmapAnd (cost=8.99..8.99 rows=1 width=0) (actual time=0.485..0.485 rows=0 loops=135) -> Bitmap Index Scan on test_table_2_s_id (cost=0.00..2.17 rows=48 width=0) (actual time=0.015..0.015rows=1 loops=135) Index Cond: (s_id = 13300613::numeric) -> Bitmap Index Scan on test_table_2_n_id (cost=0.00..6.57 rows=735 width=0) (actual time=0.467..0.467rows=815 loops=135) Index Cond: ("outer".id = test_table_2.n_id) Total runtime: 69.961 ms (Note: without the GIST index the query currently runs in about 65ms) Its row estimates are still way off. As a matter of fact, it almost seems as if the index doesn't affect row estimates at all. What would you guys suggest? Thanks, Alex Greg Stark wrote: > You could actually take short cuts using expression indexes to do this. If it > works out well then you might want to implement a real data type to avoid the > overhead of the SQL conversion functions. > > Here's an example. If I were to do this for real I would look for a better > datatype than the box datatype and I would wrap the whole conversion in an SQL > function. But this will serve to demonstrate: > > stark=> create table interval_test (start_ts timestamp with time zone, end_ts timestamp with time zone); > CREATE TABLE > > stark=> create index interval_idx on interval_test using gist (box(point(start_ts::abstime::integer, end_ts::abstime::integer), point(start_ts::abstime::integer, end_ts::abstime::integer))); > CREATE INDEX > > stark=> explain select * from interval_test where box(point(now()::abstime::integer,now()::abstime::integer),point(now()::abstime::integer,now()::abstime::integer))~ box(point(start_ts::abstime::integer,end_ts::abstime::integer) , point(start_ts::abstime::integer, end_ts::abstime::integer)); > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using interval_idx on interval_test (cost=0.07..8.36 rows=2 width=16) > Index Cond: (box(point((((now())::abstime)::integer)::double precision, (((now())::abstime)::integer)::double precision),point((((now())::abstime)::integer)::double precision, (((now())::abstime)::integer)::double precision)) ~ box(point((((start_ts)::abstime)::integer)::doubleprecision, (((end_ts)::abstime)::integer)::double precision), point((((start_ts)::abstime)::integer)::doubleprecision, (((end_ts)::abstime)::integer)::double precision))) > (2 rows) > >
pgsql-performance by date: