Thread: Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
From
"ir. F.T.M. van Vugt bc."
Date:
(Should probably be in [SQL] by now....) I've changed my table declarations to agree on the datatypes and only one simular problem with an update-query doesn't seem to be solved. (see plan below) * the concatenation in the lbar select can't be avoided, it's just the way the data is => this does result in a resulting type 'text', AFAIK * the aux_address.old_id is also of type 'text' Still, the planner does a nested loop here against large costs... ;( Any hints on this (last) one....? TIA, Frank. trial=# explain update address set region_id = lbar.region_id from (select debtor_id || '-' || address_seqnr as id, region_id from list_base_regions) as lbar, aux_address aa where lbar.id = aa.old_id and address.id = aa.id; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Merge Join (cost=1.07..65.50 rows=3 width=253) Merge Cond: ("outer".id = "inner".id) -> Nested Loop (cost=0.00..643707.03 rows=3980 width=28) Join Filter: (((("inner".debtor_id)::text || '-'::text) || ("inner".address_seqnr)::text) = "outer".old_id) -> Index Scan using aux_address_idx2 on aux_address aa (cost=0.00..81.88 rows=3989 width=16) -> Seq Scan on list_base_regions (cost=0.00..71.80 rows=3980 width=12) -> Sort (cost=1.07..1.08 rows=3 width=225) Sort Key: address.id -> Seq Scan on address (cost=0.00..1.05 rows=3 width=225) Filter: ((id = 1) IS NOT TRUE) (10 rows)
"ir. F.T.M. van Vugt bc." <ftm.van.vugt@foxi.nl> writes: > Any hints on this (last) one....? > -> Nested Loop (cost=0.00..643707.03 rows=3980 width=28) > Join Filter: (((("inner".debtor_id)::text || '-'::text) || > ("inner".address_seqnr)::text) = "outer".old_id) Looks to me like debtor_id and address_seqnr are not text type, but are being compared to things that are text. Hard to tell exactly what's going on though --- I suppose this query is getting rewritten by a rule? regards, tom lane
> > Any hints on this (last) one....? > > -> Nested Loop (cost=0.00..643707.03 rows=3980 width=28) > > Join Filter: (((("inner".debtor_id)::text || '-'::text) || > > ("inner".address_seqnr)::text) = "outer".old_id) > > Looks to me like debtor_id and address_seqnr are not text type, but are > being compared to things that are text. They were coerced, yes, but changing those original types helps only so much: * lbar.debtor_id is of type text * lbar.address_seqnr is of type text * aa.old_id is of type text trial=# explain update address set region_id = lbar.region_id from (select debtor_id || '-' || address_seqnr as f_id, region_id from list_base_regions) as lbar, aux_address aa where lbar.f_id = aa.old_id and address.id = aa.id; Since the left side of the join clause is composed out of three concatenated text-parts resulting in one single piece of type text, I'd expect the planner to avoid the nested loop. Still: QUERY PLAN -------------------------------------------------------------------------------------------------------- Merge Join (cost=1.07..16.07 rows=1 width=309) Merge Cond: ("outer".id = "inner".id) -> Nested Loop (cost=0.00..149669.38 rows=1000 width=84) Join Filter: ((("inner".debitor_id || '-'::text) || "inner".address_seqnr) = "outer".old_id) -> Index Scan using aux_address_idx2 on aux_address aa (cost=0.00..81.88 rows=3989 width=16) -> Seq Scan on list_base_regions (cost=0.00..20.00 rows=1000 width=68) -> Sort (cost=1.07..1.08 rows=3 width=225) Sort Key: address.id -> Seq Scan on address (cost=0.00..1.05 rows=3 width=225) Filter: ((id = 1) IS NOT TRUE) (10 rows) > Hard to tell exactly what's going on though Does this help? NB: it seems the data types part of the manual doesn't enlighten me on this subject, any suggestions where to find more input? Regards, Frank.
"Frank van Vugt" <ftm.van.vugt@foxi.nl> writes: > Since the left side of the join clause is composed out of three concatenated > text-parts resulting in one single piece of type text, I'd expect the > planner to avoid the nested loop. Probably not, since the first thing it does is to flatten the sub-select, leaving it with a concatenation expression in the WHERE-clause. (I was too sleepy last night to realize that you were comparing a concatenation to old_id, rather than making two separate comparisons :-() We really need to fix the planner to be able to do merge/hash on "arbitrary expression = arbitrary expression", not only "Var = Var". IIRC, this is doable in principle, but there are a few routines that would need to be improved. regards, tom lane
> Probably not, since the first thing it does is to flatten the > sub-select, leaving it with a concatenation expression in the > WHERE-clause. Ah, I see. So, I'll just split this thingy into two seperate queries, starting with creating a temp table containing the straight subselect results. > We really need to fix the planner to be able to do merge/hash on > "arbitrary expression = arbitrary expression", not only "Var = Var". I can get around it, so I'm not complaining ;-) Tom, thanks a *lot* for the prompt responses !! Best, Frank.