BUG #13792: Weird querry planner behavior - Mailing list pgsql-bugs
From | ydolgikh@jerasoft.net |
---|---|
Subject | BUG #13792: Weird querry planner behavior |
Date | |
Msg-id | 20151203160644.5889.21946@wrigleys.postgresql.org Whole thread Raw |
Responses |
Re: BUG #13792: Weird querry planner behavior
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 13792 Logged by: Yurii Dolhikh Email address: ydolgikh@jerasoft.net PostgreSQL version: 9.3.10 Operating system: CentOS Description: Reporting as a bug as suggested by Craig Ringer. Stackoverflow thread: http://stackoverflow.com/questions/34064639/postgresql-weird-querry-planner-behavior Assume I have a query like this: SELECT * FROM clients c INNER JOIN clients_balances cb ON cb.id_clients = c.id LEFT JOIN clients com ON com.id = c.id_companies LEFT JOIN clients com_real ON com_real.id = c.id_companies_real LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real) WHERE EXISTS (SELECT * FROM accounts WHERE (name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x') AND id_clients = c.id) AND c."type" = '0' AND c."id" > 0 ORDER BY c."name"; This query takes around 35 seconds to run when used in the production environment ("clients" has about 1 million records). However, if I take out ANY join - the query will take only about 300 ms to execute. I've played around with the query planner settings, but to no avail. Here are the two explain analyze outputs: http://explain.depesz.com/s/hzy (slow - 48049.574 ms) http://explain.depesz.com/s/FWCd (fast - 286.234 ms, rate_tables JOIN removed) http://explain.depesz.com/s/MyRf (fast - 539.733 ms, paygw_clients_profiles JOIN removed) It looks like in the fast case the planner starts from the EXISTS statement and has to perform join for only two rows in total. However, in the slow case it will first join all the tables and then filter by EXISTS. What I need to do is to make this query run in a reasonable time with all seven join in place. Postgres version is 9.3.10 on CentOS 6.3. Thanks. UPDATE Rewriting the query like this: SELECT * FROM clients c INNER JOIN clients_balances cb ON cb.id_clients = c.id INNER JOIN accounts a ON a.id_clients = c.id AND (a.name = 'x' OR a.accname = 'x' OR a.ani = 'x') LEFT JOIN clients com ON com.id = c.id_companies LEFT JOIN clients com_real ON com_real.id = c.id_companies_real LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real) WHERE c."type" = '0' AND c.id > 0 ORDER BY c."name"; makes it run fast, however, this is not acceptable, as account filtration parameters are optional, and I still need the result if there are no matches in that table. Using "LEFT JOIN accounts" instead of "INNER JOIN accounts" kills the performance again.
pgsql-bugs by date: