Re: Non-unique index performance - Mailing list pgsql-general
From | Sezai YILMAZ |
---|---|
Subject | Re: Non-unique index performance |
Date | |
Msg-id | 42BBDCAF.7080906@pro-g.com.tr Whole thread Raw |
In response to | Re: Non-unique index performance (Richard Huxton <dev@archonet.com>) |
Responses |
Re: Non-unique index performance
|
List | pgsql-general |
Richard Huxton wrote: > Sezai YILMAZ wrote: > >> Hello! >> >> I have a table eith name person as described below. It has an unique >> index for id column (it is also primary key) and has an index for >> parent column. >> >> If I run a query with where clause on id column it uses the index >> (look at the first explain analyze result; it says "Index Scan >> using...") and the query for 582856 rows table results in 225,893 ms. >> >> But, if I run another query with where clause on parent column it >> does not use the index (look at the second explain analyze result; it >> says "Seq Scan using...") and the query for 582856 rows table results >> in 11192.913 ms. >> >> Why the difference of both queries is so dramatical for unique and >> non-unique indexed columns? Why PostgreSQL does not use the >> non-unique indexes (it says that it does sequential scan)? > > > Because it thinks it will be faster/cheaper. > >> I have to use an index on non-unique column. What is the solution for >> that? Is there a way to speed up non-unique indexes? > > > You don't want to force it to use an index, you want it to make better > estimates. Let's have a look... > >> ***************************************************************** >> test=> \d person >> Table "public.person" >> Column | Type | Modifiers >> ---------+-----------------------+----------- >> name | character varying(30) | >> surname | character varying(30) | >> id | integer | not null >> parent | integer | >> Indexes: >> "person_pkey" primary key, btree (id) >> "parent_ndx" btree (parent) > > > OK - all very simple. And you've said there are about 580,000 rows. > >> test=> explain analyze select id,name from person where id in >> ('17201', '338191', '244319', '515209', '20415'); > > > Why are you quoting integers? I qouted them to use indexes. The other method is type casting the values to indexed column type. I prefer the quoting method. >> test=> explain analyze select * from person where parent in ('17201', >> '338191', '244319', '515209', '20415'); >> QUERY PLAN >> ----------------------------------------------------------------------------------------------------------------- >> >> Seq Scan on person (cost=0.00..35871.26 rows=14427 width=32) (actual >> time=0.063..11192.809 rows=5 loops=1) >> Filter: ((parent = 17201) OR (parent = 338191) OR (parent = 244319) >> OR (parent = 515209) OR (parent = 20415)) >> Total runtime: 11192.913 ms > > > Hmm - for some reason it's expecting 14427 rows to be returned. If > there were that many matches, then it might well be a better choice > than going back and fore between the index and the table all the time. > > So - we need to find out why it thinks there will be so many rows > returned. > > 1. VACUUM FULL ANALYSE person; > 2. re-run the explain This solved the problem. Now it takes about 213 ms. Thanks in advance. -- sy
pgsql-general by date: