Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Problems with plan estimates in postgres_fdw |
Date | |
Msg-id | 5C73C403.7080109@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Problems with plan estimates in postgres_fdw (Antonin Houska <ah@cybertec.at>) |
Responses |
Re: Problems with plan estimates in postgres_fdw
|
List | pgsql-hackers |
(2019/02/22 22:54), Antonin Houska wrote: > Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote: >> Maybe, my explanation in that thread was not enough, but the changes proposed >> by the patch posted there wouldn't obsolete the above. Let me explain using a >> foreign-table variant of the example shown in the comments for >> make_group_input_target(): >> >> SELECT a+b, sum(c+d) FROM foreign_table GROUP BY a+b; >> >> When called from postgresGetForeignPaths(), the reltarget for the base >> relation foreign_table would be {a, b, c, d}, for the same reason as the LIMIT >> example in [1], and the foreign_table's rel_startup_cost and rel_total_cost >> would be calculated based on this reltarget in that callback routine. (The >> tlist eval costs would be zeroes, though). BUT: before called from >> postgresGetForeignUpperPaths() with the UPPERREL_GROUP_AGG stage, the >> reltarget would be replaced with {a+b, c, d}, which is the final scan target >> as explained in the comments for make_group_input_target(), and the eval costs >> of the new reltarget wouldn't be zeros, so the costs of scanning the >> foreign_table would need to be adjusted to include the eval costs. That's why >> I propose the assignments in estimate_path_cost_size() shown above. > > Yes, apply_scanjoin_target_to_paths() can add some more costs, including those > introduced by make_group_input_target(), which postgres_fdw needs to account > for. This is what you fix in > > https://www.postgresql.org/message-id/5C66A056.60007%40lab.ntt.co.jp That's right. (I'll post a new version of the patch to address your comments later.) >> On the other hand, the code bit added by >> 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch handles the >> case where a post-scan/join processing step other than grouping/aggregation >> (ie, the final sort or LIMIT/OFFSET) is performed directly on a base or join >> relation, as in the LIMIT example. So, the two changes are handling different >> cases, hence both changes would be required. > > Do you mean this part? No, I don't. What I meant was this part: + /* + * If this is an UPPERREL_ORDERED step performed on the final + * scan/join relation, the costs obtained from the cache wouldn't yet + * contain the eval costs for the final scan/join target, which would + * have been updated by apply_scanjoin_target_to_paths(); add the eval + * costs now. + */ + if (fpextra && !IS_UPPER_REL(foreignrel)) + { + /* The costs should have been obtained from the cache. */ + Assert(fpinfo->rel_startup_cost >= 0 && + fpinfo->rel_total_cost >= 0); + + startup_cost += foreignrel->reltarget->cost.startup; + run_cost += foreignrel->reltarget->cost.per_tuple * rows; + } The case handled by this would be different from one handled by the fix, but as I said in the nearby thread, this part might be completely redundant. Will re-consider about this. > + /* > + * If this includes an UPPERREL_ORDERED step, the given target, which > + * would be the final target to be applied to the resulting path, might > + * have different expressions from the underlying relation's reltarget > + * (see make_sort_input_target()); adjust tlist eval costs. > + */ > + if (fpextra&& fpextra->target != foreignrel->reltarget) > + { > + QualCost oldcost = foreignrel->reltarget->cost; > + QualCost newcost = fpextra->target->cost; > + > + startup_cost += newcost.startup - oldcost.startup; > + total_cost += newcost.startup - oldcost.startup; > + total_cost += (newcost.per_tuple - oldcost.per_tuple) * rows; > + } > > You should not need this. Consider what grouping_planner() does if > (!have_grouping&& !activeWindows&& parse->sortClause != NIL): > > sort_input_target = make_sort_input_target(...); > ... > grouping_target = sort_input_target; > ... > scanjoin_target = grouping_target; > ... > apply_scanjoin_target_to_paths(...); > > So apply_scanjoin_target_to_paths() effectively accounts for the additional > costs introduced by make_sort_input_target(). Yeah, but I think the code bit cited above is needed. Let me explain using yet another example with grouping and ordering: SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b; For this query, the sort_input_target would be {a+b}, (to which the foreignrel->reltarget would have been set when called from estimate_path_cost_size() called from postgresGetForeignUpperPaths() with the UPPERREL_ORDERED stage), but the final target would be {a+b, random()}, so the sort_input_target isn't the same as the final target in that case; the costs needs to be adjusted so that the costs include the ones of generating the final target. That's the reason why I added the code bit you cited. > Note about the !activeWindows test: It occurs me now that no paths should be > added to UPPERREL_ORDERED relation if the query needs WindowAgg node, because > postgres_fdw currently cannot handle UPPERREL_WINDOW relation. Or rather in > general, the FDW should not skip any stage just because it doesn't know how to > build the paths. That's right. In postgres_fdw, by the following code in postgresGetForeignUpperPaths(), we will give up on doing the UPPERREL_ORDERED step remotely, if the query has the UPPERREL_WINDOW (and/or UPPERREL_DISTINCT) steps, because in that case we would have input_rel->fdw_private=NULL for a call to postgresGetForeignUpperPaths() with the UPPERREL_ORDERED stage: /* * If input rel is not safe to pushdown, then simply return as we cannot * perform any post-join operations on the foreign server. */ if (!input_rel->fdw_private || !((PgFdwRelationInfo *) input_rel->fdw_private)->pushdown_safe) return; Best regards, Etsuro Fujita
pgsql-hackers by date: