Thread: Using BOOL in indexes
Hi, I'm trying to convert an application from MS SQL / ASP / IIS to PostgreSQL / PHP / Apache. I am having trouble getting efficient queries on one of my main tables, which tends to have some fairly large records in it. Currently there are around 20000 records, and it looks like they average around 500 bytes from the VACUUM ANALYZE statistics below. I don't really want any query on this table to return more than about 20 records, so it seems to me that indexed access should be the answer, but I am having some problems with indexes containing BOOLEAN types. I can't see any reason why BOOL shouldn't work in an index, and in other systems I have commonly used them as the first component of an index, which is what I want to do here. Also, I can't see why the estimator should see a difference between "WHERE head1" and "WHERE head1=TRUE". Any help appreciated, Andrew. newsroom=# \d story Table "story" Attribute | Type | Modifier --------------+-----------+-------------------story_id | integer | not nullauthor | integer | written | timestamp | released | timestamp | withdrawn | timestamp | sent | timestamp | wcount | integer |default 0chunk_count | integer | head1 | boolean | default 'f'::boolheadpriority | integer | default 999internal | boolean | default 'f'::boolislive | boolean | default 'f'::boolstory_type | char(4) | title | text | precis | text | Indices: story_oid_skey, story_pkey, story_sk1, story_sk2, story_sk4 newsroom=# \d story_sk4 Index "story_sk4"Attribute | Type -----------+-----------head1 | booleanwritten | timestamp btree newsroom=# explain SELECT DISTINCT story.story_id, written, released, title, precis, author, head1 FROM story WHERE head1 ORDER BY written DESC LIMIT 15; NOTICE: QUERY PLAN: Unique (cost=2623.87..2868.99 rows=1401 width=49) -> Sort (cost=2623.87..2623.87 rows=14007 width=49) -> Seq Scanon story (cost=0.00..1421.57 rows=14007 width=49) EXPLAIN newsroom=# set enable_seqscan to 'off'; SET VARIABLE newsroom=# explain SELECT DISTINCT story.story_id, written, released, title, precis, author, head1 FROM story WHERE head1 ORDER BY written DESC LIMIT 15; NOTICE: QUERY PLAN: Unique (cost=100002623.87..100002868.99 rows=1401 width=49) -> Sort (cost=100002623.87..100002623.87 rows=14007 width=49) -> Seq Scan on story (cost=100000000.00..100001421.57 rows=14007 width=49) EXPLAIN newsroom=# explain SELECT DISTINCT story.story_id, written, released, title, precis, author FROM story WHERE head1=TRUE LIMIT 15; NOTICE: QUERY PLAN: Unique (cost=8846.22..9056.33 rows=1401 width=48) -> Sort (cost=8846.22..8846.22 rows=14007 width=48) -> IndexScan using story_sk4 on story (cost=0.00..7645.97 rows=14007 width=48) EXPLAIN newsroom=# vacuum verbose analyze story; NOTICE: --Relation story-- NOTICE: Pages 1238: Changed 0, reaped 0, Empty 0, New 0; Tup 18357: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 84, MaxLen 3115; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.16s/1.90u sec. NOTICE: Index story_oid_skey: Pages 39; Tuples 18357. CPU 0.00s/0.07u sec. NOTICE: Index story_sk4: Pages 94; Tuples 18357. CPU 0.01s/0.08u sec. NOTICE: Index story_sk2: Pages 51; Tuples 18357. CPU 0.01s/0.08u sec. NOTICE: Index story_sk1: Pages 70; Tuples 18357. CPU 0.02s/0.06u sec. NOTICE: Index story_pkey: Pages 59; Tuples 18357. CPU 0.02s/0.06u sec. VACUUM -- _____________________________________________________________________ Andrew McMillan, e-mail: Andrew@cat-it.co.nz Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
> -----Original Message----- > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On > Behalf Of Andrew McMillan > > Hi, > > I'm trying to convert an application from MS SQL / ASP / IIS to > PostgreSQL / PHP / Apache. I am having trouble getting efficient > queries on one of my main tables, which tends to have some fairly large > records in it. Currently there are around 20000 records, and it looks > like they average around 500 bytes from the VACUUM ANALYZE statistics > below. > > I don't really want any query on this table to return more than about 20 > records, so it seems to me that indexed access should be the answer, but > I am having some problems with indexes containing BOOLEAN types. > > I can't see any reason why BOOL shouldn't work in an index, and in other > systems I have commonly used them as the first component of an index, > which is what I want to do here. > > Also, I can't see why the estimator should see a difference between > "WHERE head1" and "WHERE head1=TRUE". > > > newsroom=# explain SELECT DISTINCT story.story_id, written, released, > title, precis, author, head1 FROM story WHERE head1 ORDER BY written Please add head1 to ORDER BY clause i.e. ORDER BY head1,written. Regards. Hiroshi Inoue Inoue@tpf.co.jp
> -----Original Message----- > From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp] > > -----Original Message----- > > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On > > Behalf Of Andrew McMillan > > > > Hi, > > > > I'm trying to convert an application from MS SQL / ASP / IIS to > > PostgreSQL / PHP / Apache. I am having trouble getting efficient > > queries on one of my main tables, which tends to have some fairly large > > records in it. Currently there are around 20000 records, and it looks > > like they average around 500 bytes from the VACUUM ANALYZE statistics > > below. > > > > I don't really want any query on this table to return more than about 20 > > records, so it seems to me that indexed access should be the answer, but > > I am having some problems with indexes containing BOOLEAN types. > > > > I can't see any reason why BOOL shouldn't work in an index, and in other > > systems I have commonly used them as the first component of an index, > > which is what I want to do here. > > > > Also, I can't see why the estimator should see a difference between > > "WHERE head1" and "WHERE head1=TRUE". > > > > > > newsroom=# explain SELECT DISTINCT story.story_id, written, released, > > title, precis, author, head1 FROM story WHERE head1 ORDER BY written > > Please add head1 to ORDER BY clause i.e. ORDER BY head1,written. > Sorry,it wouldn't help unless there's an index e.g. on (head1,written, story_id, released, title, precis, author). However isn't (story_id) a primary key ? If so,couldn't you change your query as follows ? SELECT story.story_id, written, released, title, precis, author, head1 FROM story WHERE head1=TRUE ORDER BY head1, written DESC LIMIT 15. Regards. Hiroshi Inoue Inoue@tpf.co.jp
Hiroshi Inoue wrote: > Hiroshi Inoue wrote: > > Andrew McMillan wrote: > > > > > > Hi, > > > > > > I'm trying to convert an application from MS SQL / ASP / IIS to > > > PostgreSQL / PHP / Apache. I am having trouble getting efficient > > > queries on one of my main tables, which tends to have some fairly large > > > records in it. Currently there are around 20000 records, and it looks > > > like they average around 500 bytes from the VACUUM ANALYZE statistics > > > below. > > > > > > I don't really want any query on this table to return more than about 20 > > > records, so it seems to me that indexed access should be the answer, but > > > I am having some problems with indexes containing BOOLEAN types. > > > > > > I can't see any reason why BOOL shouldn't work in an index, and in other > > > systems I have commonly used them as the first component of an index, > > > which is what I want to do here. > > > > > > Also, I can't see why the estimator should see a difference between > > > "WHERE head1" and "WHERE head1=TRUE". > > > > > > > > > newsroom=# explain SELECT DISTINCT story.story_id, written, released, > > > title, precis, author, head1 FROM story WHERE head1 ORDER BY written > > > > Please add head1 to ORDER BY clause i.e. ORDER BY head1,written. > > > > Sorry,it wouldn't help unless there's an index e.g. on (head1,written, > story_id, released, title, precis, author). > However isn't (story_id) a primary key ? > If so,couldn't you change your query as follows ? > > SELECT story.story_id, written, released, title, precis, author, head1 > FROM story WHERE head1=TRUE ORDER BY head1, written DESC > LIMIT 15. Thanks Hiroshi, I already have such an index, but as you can see below, it is still not used: newsroom=# explain SELECT story.story_id, written, released, title, precis, author, head1 FROM story WHERE head1=TRUE ORDER BY head1, written DESC LIMIT 15; NOTICE: QUERY PLAN: Sort (cost=2669.76..2669.76 rows=14007 width=49) -> Seq Scan on story (cost=0.00..1467.46 rows=14007 width=49) EXPLAIN newsroom=# \d story_sk4 Index "story_sk4"Attribute | Type -----------+-----------head1 | booleanwritten | timestamp btree Regards, Andrew. -- _____________________________________________________________________ Andrew McMillan, e-mail: Andrew@cat-it.co.nz Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
> -----Original Message----- > From: andrew@socrates.catalyst.net.nz > > Hiroshi Inoue wrote: > > Hiroshi Inoue wrote: > > > Andrew McMillan wrote: > > > > > > > > Hi, > > > > > > > > Sorry,it wouldn't help unless there's an index e.g. on (head1,written, > > story_id, released, title, precis, author). > > However isn't (story_id) a primary key ? > > If so,couldn't you change your query as follows ? > > > > SELECT story.story_id, written, released, title, precis, author, head1 > > FROM story WHERE head1=TRUE ORDER BY head1, written DESC > > LIMIT 15. > > Thanks Hiroshi, > > I already have such an index, but as you can see below, it is still not > used: > > newsroom=# explain SELECT story.story_id, written, released, title, > precis, author, head1 FROM story WHERE head1=TRUE ORDER BY head1, > written DESC LIMIT 15; Oops,please add DESC to head1 also i.e ORDER BY head1 DESC,written DESC. Regards. Hiroshi Inoue Inoue@tpf.co.jp