Thread: Query slow again after adding an `OR` operation (was: Slow PostgreSQL10.6 query)
Query slow again after adding an `OR` operation (was: Slow PostgreSQL10.6 query)
From
Behrang Saeedzadeh
Date:
This is a follow up to https://www.postgresql.org/message-id/flat/CAERAJ%2B-1buiJ%2B_JWEo0a9Ao-CVMWpgp%3DEnFx1dJtnB3WmMi2zQ%40mail.gmail.com
The query (generated by Hibernate) got a bit more complex and performance degraded again. I have uploaded all the details here (with changed table names, etc.): https://github.com/behrangsa/slow-query
In short, the new query is:
```
SELECT inv.id AS i_id, inv.invoice_date AS inv_d, inv.invoice_xid AS inv_xid, inv.invoice_type AS inv_type, brs.branch_id AS br_id, cinvs.company_id AS c_id FROM invoices inv LEFT OUTER JOIN branch_invoices brs ON inv.id = brs.invoice_id LEFT OUTER JOIN company_invoices cinvs ON inv.id = cinvs.invoice_id INNER JOIN branches br ON brs.branch_id = br.id WHERE brs.branch_id IN (SELECT br1.id FROM branches br1 INNER JOIN access_rights ar1 ON br1.id = ar1.branch_id INNER JOIN users usr1 ON ar1.user_id = usr1.id INNER JOIN groups grp1 ON ar1.group_id = grp1.id INNER JOIN group_permissions gpr1 ON grp1.id = gpr1.group_id INNER JOIN permissions prm1 ON gpr1.permission_id = prm1.id WHERE usr1.id = 1636 AND prm1.code = 'C2' AND ar1.access_type = 'T1') OR brs.branch_id IN (SELECT br3.id FROM companies cmp INNER JOIN branches br3 ON cmp.id = br3.company_id INNER JOIN access_rights ar2 ON cmp.id = ar2.company_id INNER JOIN users usr2 ON ar2.user_id = usr2.id INNER JOIN groups g2 ON ar2.group_id = g2.id INNER JOIN group_permissions gpr2 ON g2.id = gpr2.group_id INNER JOIN permissions prm2 ON gpr2.permission_id = prm2.id WHERE usr2.id = 1636 AND prm2.code = 'C2' AND ar2.access_type = 'T1' ORDER BY br3.id) ORDER BY inv.invoice_date DESC, br.name ASC LIMIT 12;
```
I tried tweaking join_collapse_limit and from_collapse_limit (I tried up to 30) but couldn't improve the performance (I also increased geqo_threshold to join_collapse_limit + 2).
Any chance of making PostgreSQL 10.6 choose a better plan without rewriting the Hibernate generated query?
Behrang Saeedzadeh
Re: Query slow again after adding an `OR` operation (was: SlowPostgreSQL 10.6 query)
From
Michael Lewis
Date:
Are you prefixing this auto generated query with set join_collapse_limit = 30, or are you changing the default and reloading config? That is, can you impact ONLY this query with these config changes? I wouldn't assume so, so any hack/query hint like turning off hashjoins (which seem to be chosen instead of nested loop because of bad estimates for this plan) will likely have serious impact on other queries.
I know you don't have the flexibility to change the query to be one that follows best practices, but it is a bit disappointing that your ORM generates that OR condition instead of something like brs.branch_id IN (query1 union all query2). The join to branch_invoices also must function as inner join rather than left, but I am not sure if declaring a join type as left impacts the performance significantly.
When performance matters, there's nothing quite like being able to customize the query directly.
Re: Query slow again after adding an `OR` operation (was: SlowPostgreSQL 10.6 query)
From
David Rowley
Date:
On Thu, 10 Oct 2019 at 01:07, Behrang Saeedzadeh <behrangsa@gmail.com> wrote: > > This is a follow up to https://www.postgresql.org/message-id/flat/CAERAJ%2B-1buiJ%2B_JWEo0a9Ao-CVMWpgp%3DEnFx1dJtnB3WmMi2zQ%40mail.gmail.com > > The query (generated by Hibernate) got a bit more complex and performance degraded again. I have uploaded all the detailshere (with changed table names, etc.): https://github.com/behrangsa/slow-query > > In short, the new query is: The query mostly appears slow due to the "Rows Removed By Filter" in the OR condition. The only way to get around not scanning the entire branch_invoices table would be to somehow write the way in such a way that allows it to go on the inner side of the join. You could do that if you ensure there's an index on branch_invoices (branch_id) and format the query as: SELECT inv.id AS i_id, inv.invoice_date AS inv_d, inv.invoice_xid AS inv_xid, inv.invoice_type AS inv_type, brs.branch_id AS br_id, cinvs.company_id AS c_id FROM invoices inv LEFT OUTER JOIN branch_invoices brs ON inv.id = brs.invoice_id LEFT OUTER JOIN company_invoices cinvs ON inv.id = cinvs.invoice_id INNER JOIN branches br ON brs.branch_id = br.id WHERE brs.branch_id IN (SELECT br1.id FROM branches br1 INNER JOIN access_rights ar1 ON br1.id = ar1.branch_id INNER JOIN users usr1 ON ar1.user_id = usr1.id INNER JOIN groups grp1 ON ar1.group_id = grp1.id INNER JOIN group_permissions gpr1 ON grp1.id = gpr1.group_id INNER JOIN permissions prm1 ON gpr1.permission_id = prm1.id WHERE usr1.id = 1636 AND prm1.code = 'C2' AND ar1.access_type = 'T1') UNION ALL SELECT br3.id FROM companies cmp INNER JOIN branches br3 ON cmp.id = br3.company_id INNER JOIN access_rights ar2 ON cmp.id = ar2.company_id INNER JOIN users usr2 ON ar2.user_id = usr2.id INNER JOIN groups g2 ON ar2.group_id = g2.id INNER JOIN group_permissions gpr2 ON g2.id = gpr2.group_id INNER JOIN permissions prm2 ON gpr2.permission_id = prm2.id WHERE usr2.id = 1636 AND prm2.code = 'C2' AND ar2.access_type = 'T1') ORDER BY inv.invoice_date DESC, br.name ASC LIMIT 12; The planner may then choose to pullup the subquery and uniquify it then put it on the outside of a nested loop join then lookup the branch_invoices record using the index on branch_id. I think this is quite a likely plan since the planner estimates there's only going to be 1 row from each of the subqueries. Also note, that the LEFT JOIN you have to branch_invoices is not really a left join since you're insisting that the branch_id must be in the first or 2nd sub-plan. There's no room for it to be NULL. The planner will just convert that to an INNER JOIN with the above query since that'll give it the flexibility to put the subquery in the IN clause on the outside of the join (after having uniquified it). You'll need to decide what you actually want the behaviour to be here. If you do need those NULL rows then you'd better move your WHERE quals down into the join condition for branch_invoices table. I'd suggest testing with some mock-up data if you're uncertain of what I mean. If you find that is faster and you can't rewrite the query due to it having been generated by Hibernate, then that sounds like a problem with Hibernate. PostgreSQL does not currently attempt to do any rewrites which convert OR clauses to use UNION or UNION ALL. No amount of tweaking the planner settings is going to change that fact. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services