WHERE ... NOT NULL ... OR ... (SELECT...) - Mailing list pgsql-sql
From | Mario Splivalo |
---|---|
Subject | WHERE ... NOT NULL ... OR ... (SELECT...) |
Date | |
Msg-id | 55DA348D.9050704@splivalo.hr Whole thread Raw |
Responses |
Re: WHERE ... NOT NULL ... OR ... (SELECT...)
|
List | pgsql-sql |
I have a query, like this: valipile=# explain select * from account_analytic_line where move_id in (SELECT id FROM account_move_line); QUERY PLAN ---------------------------------------------------------------------------------------Hash Semi Join (cost=60799.74..96694.82rows=329568 width=162) Hash Cond: (account_analytic_line.move_id = account_move_line.id) -> SeqScan on account_analytic_line (cost=0.00..9620.68 rows=329568 width=162) -> Hash (cost=41292.66..41292.66 rows=1188966 width=4) -> Seq Scan on account_move_line (cost=0.00..41292.66 rows=1188966 width=4) (5 rows) Which is all fine. However, as move_id in account_analytic_line is NULLable I want to include that one into my query. But then: valipile=# explain select * from account_analytic_line where move_id is null or move_id in (SELECT id FROM account_move_line); QUERY PLAN -----------------------------------------------------------------------------------------Seq Scan on account_analytic_line (cost=0.00..9039221110.12 rows=164784 width=162) Filter: ((move_id IS NULL) OR (SubPlan 1)) SubPlan 1 -> Materialize (cost=0.00..51882.49 rows=1188966width=4) -> Seq Scan on account_move_line (cost=0.00..41292.66 rows=1188966 width=4) (5 rows) This, of course, takes forever. (There are no indexes/constraints/whatever on the tables as I'm deleting old data from the database) Now, I did 'circumvent' the waiting with using UNION: valipile=# explain select * from account_analytic_line where move_id in (select id from account_move_line) union select * from account_analytic_line where move_id is null; ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Unique (cost=193891.55..212017.84 rows=329569 width=162) -> Sort (cost=193891.55..194715.47 rows=329569 width=162) SortKey: account_analytic_line.id, account_analytic_line.create_uid, account_analytic_line.create_date, account_analytic_line.write_date, account_analytic_line.write_uid, account_analytic_line.amount, account_analytic_line.user_id, account_analy -> Append (cost=60799.74..109611.18 rows=329569width=162) -> Hash Semi Join (cost=60799.74..96694.82 rows=329568 width=162) Hash Cond: (account_analytic_line.move_id = account_move_line.id) -> Seq Scan on account_analytic_line (cost=0.00..9620.68 rows=329568 width=162) -> Hash (cost=41292.66..41292.66 rows=1188966 width=4) -> Seq Scan on account_move_line (cost=0.00..41292.66 rows=1188966 width=4) -> Seq Scan on account_analytic_line account_analytic_line_1 (cost=0.00..9620.68 rows=1 width=162) Filter: (move_id IS NULL) (11 rows) but I'm curious why postgres chooses such poor query plan for the 'OR column IS NULL' addition ? Mario -- Mario Splivalo mario@splivalo.hr "I can do it quick, I can do it cheap, I can do it well. Pick any two."