Re: UNION ALL on partitioned tables won't use indices. - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: UNION ALL on partitioned tables won't use indices. |
Date | |
Msg-id | 20131125044916.GC1045072@tornado.leadboat.com Whole thread Raw |
In response to | Re: UNION ALL on partitioned tables won't use indices. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: UNION ALL on partitioned tables won't use indices.
|
List | pgsql-hackers |
On Sat, Nov 23, 2013 at 01:35:32PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > I'm unclear on the key ideas behind distinguishing em_is_child members from > > ordinary EC members. src/backend/optimizer/README says "These members are > > *not* full-fledged members of the EquivalenceClass and do not affect the > > class's overall properties at all." Is that an optimization to avoid futile > > planner work, or is it necessary for correctness? If the latter, what sort of > > query might give wrong answers if an EquivalenceMember were incorrectly added > > as em_is_child = false? > > See commit dd4134ea, which added that text. IIRC, the key point is that > among "real" EC members, there will never be duplicates: a Var "x.y" > for instance cannot appear in two different ECs, because we'd have merged > the two ECs into one instead. However, this property does *not* hold for > child EC members. The problem is that two completely unrelated UNION ALL > child subselects might have, say, constant "1" in their tlists. This > should not lead to concluding that we can merge ECs that mention their > UNION ALL parent variables. Thanks for the pointer; I found things clearer after reviewing the ECs from the test case queries from commits dd4134ea and 57664ed2. > I've not looked at these patches yet, but if they're touching equivclass.c > at all, I'd guess that it's wrong. Only PATCH-1 touched equivclass.c. > My gut feeling after two minutes' thought is that the best fix is for > expand_inherited_rtentry to notice when the parent table is already an > appendrel member, and enlarge that appendrel instead of making a new one. > (This depends on the assumption that pull_up_simple_union_all always > happens first, but that seems OK.) I'm not sure if that concept exactly > corresponds to either PATCH-3 or PATCH-4, but that's the way I'd think > about it. PATCH-4 does approximately that. I agree that's the right general direction. > > However, adding a qual to one of the inheritance queries once again defeated > > MergeAppend with the patches for approaches (2) and (3). > > That's an independent limitation, see is_safe_append_member: > > * Also, the child can't have any WHERE quals because there's no place to > * put them in an appendrel. (This is a bit annoying...) > > It'd be nice to fix that, but it's not going to be easy, and it should > be a separate patch IMO. It's pretty much unrelated to the question at > hand here. After further study, I agree. It would still be good to understand why that test case crashed PATCH-1, then ensure that the other patches don't have a similar lurking bug. An alternative to extending our ability to pull up UNION ALL subqueries, having perhaps broader applicability, would be to push down the possibly-useful sort order to subqueries we can't pull up. We'd sometimes have two levels of MergeAppend, but that could still handily beat the explicit-sort plan. In any case, it is indeed a separate topic. For the sake of the archives, you can get such plans today by manually adding the ORDER BY to the relevant UNION ALL branch: EXPLAIN (ANALYZE) (SELECT oid, * FROM pg_proc WHERE protransform = 0 ORDER BY oid) UNION ALL SELECT oid, * FROM pg_proc ORDERBY 1 LIMIT 5; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=0.57..1.31 rows=5 width=544) (actual time=0.102..0.155 rows=5 loops=1) -> Merge Append (cost=0.57..748.25 rows=5084width=544) (actual time=0.095..0.130 rows=5 loops=1) Sort Key: pg_proc_1.oid -> Index Scan usingpg_proc_oid_index on pg_proc pg_proc_1 (cost=0.28..332.83 rows=2538 width=544) (actual time=0.058..0.069 rows=3 loops=1) Filter: ((protransform)::oid = 0::oid) -> Index Scan using pg_proc_oid_index on pg_proc (cost=0.28..326.47rows=2546 width=544) (actual time=0.029..0.036 rows=3 loops=1) -- Noah Misch EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: