Re: Push down more full joins in postgres_fdw - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Push down more full joins in postgres_fdw |
Date | |
Msg-id | 3e5989a9-c322-cc0d-1476-cd0707675d10@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Push down more full joins in postgres_fdw (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
Responses |
Re: Push down more full joins in postgres_fdw
Re: Push down more full joins in postgres_fdw |
List | pgsql-hackers |
On 2016/09/06 22:07, Ashutosh Bapat wrote: > On Fri, Sep 2, 2016 at 3:55 PM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote: > On 2016/08/22 15:49, Ashutosh Bapat wrote: > 1. deparsePlaceHolderVar looks odd - each of the deparse* > function is > named as deparse + <name of the parser node the string would parse > into>. PlaceHolderVar is not a parser node, so no string is > going to be > parsed as a PlaceHolderVar. May be you want to name it as > deparseExerFromPlaceholderVar or something like that. > The name "deparseExerFromPlaceholderVar" seems long to me. How > about "deparsePlaceHolderExpr"? > There isn't any node with name PlaceHolderExpr. I'll rename it to "deparseExerInPlaceholderVar", then. > 3. I think registerAlias stuff should happen really at the time of > creating paths and should be stored in fpinfo. Without that it > will be > computed every time we deparse the query. This means every time > we try > to EXPLAIN the query at various levels in the join tree and once > for the > query to be sent to the foreign server. > Hmm. I think the overhead in calculating aliases would be > negligible compared to the overhead in explaining each remote query > for costing or sending the remote query for execution. So, I > created aliases in the same way as remote params created and stored > into params_list in deparse_expr_cxt. I'm not sure it's worth > complicating the code. > We defer remote parameter creation till deparse time since the the > parameter numbers are dependent upon the sequence in which we deparse > the query. Creating them at the time of path creation and storing them > in fpinfo is not possible because a. those present in the joining > relations will conflict with each other and need some kind of > serialization at the time of deparsing b. those defer for differently > parameterized paths or paths with different pathkeys. We don't defer it > because it's very light on performance. > > That's not true with the alias information. As long as we detect which > relations need subqueries, their RTIs are enough to create unique > aliases e.g. if a relation involves RTIs 1, 2, 3 corresponding subquery > can have alias r123 without conflicting with any other alias. Hmm. But another concern about the approach of generating an subselect alias for a path, if needed, at the path creation time would be that the path might be rejected by add_path, which would result in totally wasting cycles for generating the subselect alias for the path. > However minimum overhead it might have, we will deparse the query every > time we create a path involving that kind of relation i.e. for different > pathkeys, different parameterization and different joins in which that > relation participates in. This number can be significant. We want to > avoid this overhead if we can. Exactly. As I said above, I think the overhead would be negligible compared to the overhead in explaining each remote query for costing or the overhead in sending the final remote query for execution, though. > 5. The blocks related to inner and outer relations in > deparseFromExprForRel() look same. We should probably separate > that code > out into a function and call it at two places. > Done. > I see you have created function deparseOperandRelation() for the > same. I guess, this should be renamed as deparseRangeTblRef() since it > will create RangeTblRef node when parsed back. OK, if there no opinions of others, I'll rename it to the name proposed by you, "deparseRangeTblRef". > 6. > ! if (is_placeholder) > ! errcontext("placeholder expression at position %d in > select list", > ! errpos->cur_attno); > A user wouldn't know what a placeholder expression is, there is > no such > term in the documentation. We have to device a better way to > provide an > error context for an expression in general. > Though I proposed that, I don't think that it's that important to > let users know that the expression is from a PHV. How about just > saying "expression", not "placeholder expression", so that we have > the message "expression at position %d in select list" in the context? > Hmm, that's better than placeholder expression, but not as explanatory > as it should be since we won't be printing the "select list" in the > error context and user won't have a clue as to what error context is > about. I don't think so. Consider an example of the conversion error message, which is from the regression test: SELECT ft1.c1, ft2.c2, ft1 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND ft1.c1 = 1; ERROR: invalid input syntax for integer: "foo" CONTEXT: whole-row reference to foreign table "ft1" As shown in the example, the error message is displayed under a remote query for execution. So, ISTM it's reasonable to print something like "expression at position %d in select list" in the context if an expression in a PHV. > But I don't have any good suggestions right now. May be we should > print the whole expression? But that can be very long, I don't know. ISTM that it's a bit too expensive to print the whole expression in the error context. > This patch tries to do two things at a time 1. support join pushdown for > FULL join when the joining relations have remote conditions 2. better > support for fetching placeholder vars, whole row references and some > system columns. To make reviews easy, I think we should split the patch > into two 1. supporting subqueries to be deparsed with support for one of > the above (I will suggest FULL join support) 2. support for the other. OK, will try. Best regards, Etsuro Fujita
pgsql-hackers by date: