Re: Query planner question - Mailing list pgsql-general
From | Stephan Szabo |
---|---|
Subject | Re: Query planner question |
Date | |
Msg-id | 20030612145211.Y37555-100000@megazone23.bigpanda.com Whole thread Raw |
In response to | Query planner question (Ernest E Vogelsinger <ernest@vogelsinger.at>) |
Responses |
Re: Query planner question
|
List | pgsql-general |
On Thu, 12 Jun 2003, Ernest E Vogelsinger wrote: > > Maybe someone can shed a light on this: > > I have a (test) table, populated with 2M rows: > rid int4, -- primary key > dcid varchar, > dsid varchar, > drid int4, > owid int4, > nxid int4, > usg int4, > --- something more that's not important > > Indexes in question: > CREATE INDEX id_owid ON table (owid,nxid,usg) USING BTREE > CREATE INDEX id_dowid ON table (owid, dcid, dsid, drid, nxid, usg) > USING BTREE > CREATE INDEX id_dictid ON table (dcid, dsid, drid, nxid) USING BTREE > > Test Run > > VACUUM ANALYZE FULL; > VACUUM > EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid = 1; > NOTICE: QUERY PLAN: > Unique (cost=402.07..402.90 rows=11 width=20) (actual time=10.13..11.49 > rows=512 loops=1) > -> Sort (cost=402.07..402.07 rows=111 width=20) (actual > time=10.13..10.46 rows=512 loops=1) > -> Index Scan using id_owid on table (cost=0.00..398.30 rows=111 > width=20) (actual time=0.05..4.44 rows=512 loops=1) > Total runtime: 11.95 msec > > EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null; > NOTICE: QUERY PLAN: > Unique (cost=126611.73..128034.59 rows=18971 width=20) (actual > time=27515.63..28359.88 rows=513 loops=1) > -> Sort (cost=126611.73..126611.73 rows=189714 width=20) (actual > time=27515.62..27792.29 rows=199131 loops=1) > -> Seq Scan on table (cost=0.00..106885.04 rows=189714 > width=20) (actual time=18.76..16467.28 rows=199131 loops=1) > Total runtime: 28633.68 msec > > SET enable_seqscan TO off; > SET VARIABLE > EXPLAIN SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null; > -- this process uses 34M RSS!!! > NOTICE: QUERY PLAN: > Unique (cost=0.00..7887659.31 rows=18971 width=20) (actual > time=2.57..711940.78 rows=513 loops=1) > -> Index Scan using id_dictid on table (cost=0.00..7886236.46 > rows=189714 width=20) (actual time=2.57..710482.07 rows=199131 loops=1) > Total runtime: 711942.76 msec > > A couple of questions arise: > > 1) Why chooses the planner to use id_owid (and not id_dowid as it would > seem logical) for the first query? I think that it doesn't entirely know that owid=1, sort by dcid, dsid, drid can be handled by that index. I think it's possible that if you added owid to the select list it might use id_dowid instead. I think this is similar to the issues with order by, conditions and index choice, which you may find useful information in the archives about) > 2) Why is NO index used for the second query, the only difference being in > the constraint value (owid is set vs. owid is null)? IS NULL is not considered an indexable condition currently (there are past discussions and hackarounds in the archives) > 3) Why does it use id_dictid for the second query when forced to, and not > id_owid or id_dowid? As for #2, it doesn't think it can use an index with owid in the front.
pgsql-general by date: