which is better: using OR clauses or UNION? - Mailing list pgsql-sql
From | adam_pgsql |
---|---|
Subject | which is better: using OR clauses or UNION? |
Date | |
Msg-id | D7B38BB0-89D0-4351-9936-27F9D9B4AFE0@witneyweb.org Whole thread Raw |
Responses |
Re: which is better: using OR clauses or UNION?
Re: which is better: using OR clauses or UNION? Re: which is better: using OR clauses or UNION? Re: which is better: using OR clauses or UNION? |
List | pgsql-sql |
Hi, I have a query hitting a table of 25 million rows. The table has a text field ('identifier') which i need to query for matchingrows. The question is if i have multiple strings to match against this field I can use multiple OR sub-statementsor multiple statements in a UNION. The UNION seems to run quicker.... is this to be expected? or is there anythingelse I can do improve the speed of this query? Some query details: table "dba_data_base", index: "in_dba_data_base_identifier" btree (lower(identifier) varchar_pattern_ops) Query 1 ------- datadb=#explain analyse SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_valueFROM dba_data_base aWHERE ( lower(identifier)LIKE lower('BUGS0000001884677') OR lower(identifier) LIKE lower('BUGS0000001884678') OR lower(identifier)LIKE lower('BUGS0000001884679') OR lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)') )ORDER BY a.identifier; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------- Sort (cost=14930.13..14939.77 rows=3857 width=62) (actual time=3208.466..3208.652 rows=318 loops=1) Sort Key: identifier-> Bitmap Heap Scan on dba_data_base a (cost=134.43..14700.38 rows=3857 width=62) (actual time=81.106..3207.721rows=318 loops=1) Recheck Cond: ((lower(identifier) ~~ 'bugs0000001884677'::text) OR (lower(identifier)~~ 'bugs0000001884678'::text) OR (lower(identifier) ~~ 'bugs0000001884679'::text) OR (lower(identifier)~ ~ 'sptigr4-2210 (6f24)'::text)) Filter: ((lower(identifier) ~~ 'bugs0000001884677'::text) OR (lower(identifier) ~~'bugs0000001884678'::text) OR (lower(identifier) ~~ 'bugs0000001884679'::text) OR (lower(identifier) ~~ 'spt igr4-2210 (6f24)'::text)) -> BitmapOr (cost=134.43..134.43 rows=3857 width=0) (actual time=71.397..71.397 rows=0loops=1) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actualtime=0.029..0.029 rows=0 loops=1) Index Cond: (lower(identifier) ~=~ 'bugs0000001884677'::charactervarying) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (lower(identifier)~=~ 'bugs0000001884678'::character varying) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (lower(identifier) ~=~ 'bugs0000001884679'::character varying) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=71.347..71.347 rows=318 loops=1) Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying) Total runtime: 3208.904 ms Query 2 ------- datadb=#explain analyse SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_valueFROM dba_data_base aWHERE lower(identifier)LIKE lower('BUGS0000001884677') UNION SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_valueFROM dba_data_base aWHERE lower(identifier) LIKE lower('BUGS0000001884678') UNION SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_valueFROM dba_data_base aWHERE lower(identifier) LIKE lower('BUGS0000001884679') UNION SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_valueFROM dba_data_base aWHERE lower(identifier) LIKE lower('SpTIGR4-2210(6F24)')ORDER BY identifier; Sort (cost=15702.26..15711.90 rows=3856 width=62) (actual time=3.688..3.886 rows=317 loops=1) Sort Key: identifier -> Unique (cost=15414.74..15472.58 rows=3856 width=62) (actual time=2.663..3.387 rows=317 loops=1) -> Sort (cost=15414.74..15424.38rows=3856 width=62) (actual time=2.660..2.834 rows=318 loops=1) Sort Key: bioassay_id,identifier, ratio, log_ratio, p_value -> Append (cost=32.88..15185.06 rows=3856 width=62) (actualtime=0.320..2.131 rows=318 loops=1) -> Bitmap Heap Scan on dba_data_base a (cost=32.88..3786.62rows=964 width=62) (actual time=0.041..0.041 rows=0 loops=1) Filter: (lower(identifier)~~ 'bugs0000001884677'::text) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.036..0.036 rows=0 loops=1) Index Cond: (lower(identifier) ~=~ 'bugs0000001884677'::character varying) -> Bitmap HeapScan on dba_data_base a (cost=32.88..3786.62 rows=964 width=62) (actual time=0.010..0.010 rows=0 loops=1) Filter: (lower(identifier) ~~ 'bugs0000001884678'::text) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (lower(identifier) ~=~ 'bugs0000001884678'::character varying) -> Bitmap HeapScan on dba_data_base a (cost=32.88..3786.62 rows=964 width=62) (actual time=0.010..0.010 rows=0 loops=1) Filter: (lower(identifier) ~~ 'bugs0000001884679'::text) -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (lower(identifier) ~=~ 'bugs0000001884679'::character varying) -> Bitmap HeapScan on dba_data_base a (cost=32.88..3786.62 rows=964 width=62) (actual time=0.255..1.676 rows=318 loops=1) Filter: (lower(identifier) ~~ 'sptigr4-2210 (6f24)'::text) -> Bitmap Index Scanon in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.178..0.178 rows=318 loops=1) Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying) Total runtime: 4.174 ms Also which should scale better if I add more strings to match? would there be any better design patterns for this problem? Thanks for any help Adam select version(); version ---------------------------------------------------------------- PostgreSQL 8.2.12 on i686-pc-linux-gnu, compiled by GCC 2.95.4