Home > mailing lists

Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers

From	Etsuro Fujita
Subject	Re: Problems with plan estimates in postgres_fdw
Date	February 25, 2019 10:31:31
Msg-id	5C73C403.7080109@lab.ntt.co.jp Whole thread Raw
In response to	Re: Problems with plan estimates in postgres_fdw (Antonin Houska <ah@cybertec.at>)
Responses	Re: Problems with plan estimates in postgres_fdw
List	pgsql-hackers

Tree view

(2019/02/22 22:54), Antonin Houska wrote:
> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp>  wrote:
>> Maybe, my explanation in that thread was not enough, but the changes proposed
>> by the patch posted there wouldn't obsolete the above.  Let me explain using a
>> foreign-table variant of the example shown in the comments for
>> make_group_input_target():
>>
>>      SELECT a+b, sum(c+d) FROM foreign_table GROUP BY a+b;
>>
>> When called from postgresGetForeignPaths(), the reltarget for the base
>> relation foreign_table would be {a, b, c, d}, for the same reason as the LIMIT
>> example in [1], and the foreign_table's rel_startup_cost and rel_total_cost
>> would be calculated based on this reltarget in that callback routine.  (The
>> tlist eval costs would be zeroes, though).  BUT: before called from
>> postgresGetForeignUpperPaths() with the UPPERREL_GROUP_AGG stage, the
>> reltarget would be replaced with {a+b, c, d}, which is the final scan target
>> as explained in the comments for make_group_input_target(), and the eval costs
>> of the new reltarget wouldn't be zeros, so the costs of scanning the
>> foreign_table would need to be adjusted to include the eval costs.  That's why
>> I propose the assignments in estimate_path_cost_size() shown above.
>
> Yes, apply_scanjoin_target_to_paths() can add some more costs, including those
> introduced by make_group_input_target(), which postgres_fdw needs to account
> for. This is what you fix in
>
> https://www.postgresql.org/message-id/5C66A056.60007%40lab.ntt.co.jp

That's right.  (I'll post a new version of the patch to address your 
comments later.)

>> On the other hand, the code bit added by
>> 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch handles the
>> case where a post-scan/join processing step other than grouping/aggregation
>> (ie, the final sort or LIMIT/OFFSET) is performed directly on a base or join
>> relation, as in the LIMIT example.  So, the two changes are handling different
>> cases, hence both changes would be required.
>
> Do you mean this part?

No, I don't.  What I meant was this part:

+       /*
+        * If this is an UPPERREL_ORDERED step performed on the final
+        * scan/join relation, the costs obtained from the cache 
wouldn't yet
+        * contain the eval costs for the final scan/join target, which 
would
+        * have been updated by apply_scanjoin_target_to_paths(); add 
the eval
+        * costs now.
+        */
+       if (fpextra && !IS_UPPER_REL(foreignrel))
+       {
+           /* The costs should have been obtained from the cache. */
+           Assert(fpinfo->rel_startup_cost >= 0 &&
+                  fpinfo->rel_total_cost >= 0);
+
+           startup_cost += foreignrel->reltarget->cost.startup;
+           run_cost += foreignrel->reltarget->cost.per_tuple * rows;
+       }

The case handled by this would be different from one handled by the fix, 
but as I said in the nearby thread, this part might be completely 
redundant.  Will re-consider about this.

> +       /*
> +        * If this includes an UPPERREL_ORDERED step, the given target, which
> +        * would be the final target to be applied to the resulting path, might
> +        * have different expressions from the underlying relation's reltarget
> +        * (see make_sort_input_target()); adjust tlist eval costs.
> +        */
> +       if (fpextra&&  fpextra->target != foreignrel->reltarget)
> +       {
> +               QualCost        oldcost = foreignrel->reltarget->cost;
> +               QualCost        newcost = fpextra->target->cost;
> +
> +               startup_cost += newcost.startup - oldcost.startup;
> +               total_cost += newcost.startup - oldcost.startup;
> +               total_cost += (newcost.per_tuple - oldcost.per_tuple) * rows;
> +       }
>
> You should not need this. Consider what grouping_planner() does if
> (!have_grouping&&  !activeWindows&&  parse->sortClause != NIL):
>
>     sort_input_target = make_sort_input_target(...);
>     ...
>     grouping_target = sort_input_target;
>     ...
>     scanjoin_target = grouping_target;
>     ...
>     apply_scanjoin_target_to_paths(...);
>
> So apply_scanjoin_target_to_paths() effectively accounts for the additional
> costs introduced by make_sort_input_target().

Yeah, but I think the code bit cited above is needed.  Let me explain 
using yet another example with grouping and ordering:

     SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b;

For this query, the sort_input_target would be {a+b}, (to which the 
foreignrel->reltarget would have been set when called from 
estimate_path_cost_size() called from postgresGetForeignUpperPaths() 
with the UPPERREL_ORDERED stage), but the final target would be {a+b, 
random()}, so the sort_input_target isn't the same as the final target 
in that case; the costs needs to be adjusted so that the costs include 
the ones of generating the final target.  That's the reason why I added 
the code bit you cited.

> Note about the !activeWindows test: It occurs me now that no paths should be
> added to UPPERREL_ORDERED relation if the query needs WindowAgg node, because
> postgres_fdw currently cannot handle UPPERREL_WINDOW relation. Or rather in
> general, the FDW should not skip any stage just because it doesn't know how to
> build the paths.

That's right.  In postgres_fdw, by the following code in 
postgresGetForeignUpperPaths(), we will give up on doing the 
UPPERREL_ORDERED step remotely, if the query has the UPPERREL_WINDOW 
(and/or UPPERREL_DISTINCT) steps, because in that case we would have 
input_rel->fdw_private=NULL for a call to postgresGetForeignUpperPaths() 
with the UPPERREL_ORDERED stage:

     /*
      * If input rel is not safe to pushdown, then simply return as we 
cannot
      * perform any post-join operations on the foreign server.
      */
     if (!input_rel->fdw_private ||
         !((PgFdwRelationInfo *) input_rel->fdw_private)->pushdown_safe)
         return;

Best regards,
Etsuro Fujita

pgsql-hackers by date:

From: Masahiko Sawada
Date: 25 February 2019, 10:17:02
Subject: Re: reloption to prevent VACUUM from truncating empty pages at theend of relation

From: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date: 25 February 2019, 10:57:09
Subject: Re: Remove Deprecated Exclusive Backup Mode

Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers

Previous

Next