Home > mailing lists
Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From	Kouhei Kaigai
Subject	Re: Join push-down support for foreign tables
Date	March 3, 2015 09:04:29
Msg-id	9A28C8860F777E439AA12E8AEA7694F8010BAB26@BPXM15GP.gisp.nec.co.jp Whole thread Raw
In response to	Join push-down support for foreign tables (Shigeru Hanada <shigeru.hanada@gmail.com>)
Responses	Re: Join push-down support for foreign tables
List	pgsql-hackers
Tree view
Hanada-san,

I checked the patch, below is the random comments from my side.

* Context variables
-------------------
Sorry, I might give you a wrong suggestion.
The foreign_glob_cxt and deparse_expr_cxt were re-defined as follows:

typedef struct foreign_glob_cxt{       PlannerInfo *root;                      /* global planner state */
-       RelOptInfo *foreignrel;         /* the foreign relation we are planning
+       RelOptInfo *outerrel;           /* the foreign relation, or outer child
+       RelOptInfo *innerrel;           /* inner child, only set for join */} foreign_glob_cxt;
/*
@@ -86,9 +89,12 @@ typedef struct foreign_loc_cxttypedef struct deparse_expr_cxt{       PlannerInfo *root;
       /* global planner state */
 
-       RelOptInfo *foreignrel;         /* the foreign relation we are planning
+       RelOptInfo *outerrel;           /* the foreign relation, or outer child
+       RelOptInfo *innerrel;           /* inner child, only set for join */       StringInfo      buf;
  /* output buffer to append to */       List      **params_list;        /* exprs that will become remote Params
 
+       ForeignScan *outerplan;         /* outer child's ForeignScan node */
+       ForeignScan *innerplan;         /* inner child's ForeignScan node */} deparse_expr_cxt;

However, the outerrel does not need to have double-meaning.

RelOptInfo->reloptkind gives us information whether the target
relation is base-relation or join-relation.
So, foreign_expr_walker() can be implemented as follows:   if (bms_is_member(var->varno, glob_cxt->foreignrel->relids)
&&      var->varlevelsup == 0)   {           :
 
also, deparseVar() can checks relation type using:   if (context->foreignrel->reloptkind == RELOPT_JOINREL)   {
deparseJoinVar(...);

In addition, what we need in deparse_expr_cxt are target-list of
outer and inner relation in deparse_expr_cxt.
How about to add inner_tlist/outer_tlist instead of innerplan and
outerplan in deparse_expr_cxt?
The deparseJoinVar() references these fields, but only targetlist.



* GetForeignJoinPath method of FDW
----------------------------------
It should be portion of the interface patch, so I added these
enhancement of FDW APIs with documentation updates.
Please see the newer version of foreign/custom-join interface patch.



* enable_foreignjoin parameter
------------------------------
I'm uncertain whether we should have this GUC parameter that affects
to all FDW implementation. Rather than a built-in one, my preference
is an individual GUC variable defined with DefineCustomBoolVariable(),
by postgres_fdw.
Pros: It offers user more flexible configuration.
Cons: Each FDW has to implement this GUC by itself?



* syntax violated query if empty targetlist
-------------------------------------------
At least NULL shall be injected if no columns are referenced.
Also, add a dummy entry to fdw_ps_tlist to fit slot tuple descriptor.

postgres=# explain verbose select NULL from ft1,ft2 where aid=bid;                        QUERY PLAN
---------------------------------------------------------------------------Foreign Scan  (cost=100.00..129.25 rows=2925
width=0) Output: NULL::unknown  Remote SQL: SELECT  FROM (SELECT bid, NULL FROM public.t2) l (a_0, a_1) INNERJOIN
(SELECTaid, NULL FROM public.t1) r (a_0, a_1)  ON ((r.a_0 = l.a_0))
 



* Bug reported by Thom Brown
-----------------------------
# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id =
countries.idLIMIT 3) x;
 
ERROR:  could not open relation with OID 0

Sorry, it was a problem caused by my portion. The patched setrefs.c
checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
node is associated with a certain base relation. If *_ps_tlist is
valid, it also expects scanrelid == 0.
However, things we should check is incorrect. We may have a case
with empty *_ps_tlist if remote join expects no columns.
So, I adjusted the condition to check scanrelid instead.


* make_tuple_from_result_row() call
------------------------------------
The 4th argument (newly added) is referenced without NULL checks,
however, store_returning_result() and analyze_row_processor() put
NULL on this argument, then it leads segmentation fault.
RelationGetDescr(fmstate->rel or astate->rel) should be given on
the caller.

* regression test crash
------------------------
The query below gets crashed: UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT FROM ft1 WHERE
ft1.c1= ft2.c2 AND ft1.c1 % 10 = 9;
 

According to the crash dump, tidin() got a cstring input with
unexpected format. I guess column number is incorrectly assigned,
but have no clear scenario at this moment.

#0  0x00007f9b45a11513 in conversion_error_callback (arg=0x7fffc257ecc0)   at postgres_fdw.c:3293
#1  0x00000000008e51d6 in errfinish (dummy=0) at elog.c:436
#2  0x00000000008935cd in tidin (fcinfo=0x7fffc257e8a0) at tid.c:69
/home/kaigai/repo/sepgsql/src/backend/utils/adt/tid.c:69:1734:beg:0x8935cd
(gdb) p str
$6 = 0x1d17cf7 "foo"

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: Shigeru Hanada [mailto:shigeru.hanada@gmail.com]
> Sent: Monday, March 02, 2015 9:48 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; PostgreSQL-development
> Subject: ##freemail## Re: [HACKERS] Join push-down support for foreign tables
> 
> Attached is the revised/rebased version of the $SUBJECT.
> 
> This patch is based on Kaigai-san's custom/foreign join patch, so
> please apply it before this patch.  In this version I changed some
> points from original postgres_fdw.
> 
> 1) Disabled SELECT clause optimization
> ~9.4 postgres_fdw lists only columns actually used in SELECT clause,
> but AFAIS it makes SQL generation complex.  So I disabled such
> optimization and put "NULL" for unnecessary columns in SELECT clause
> of remote query.
> 
> 2) Extended deparse context
> To allow deparsing based on multiple source relations, I added some
> members to context structure.  They are unnecessary for simple query
> with single foreign table, but IMO it should be integrated.
> 
> With Kaigai-san's advise, changes for supporting foreign join on
> postgres_fdw is minimized into postgres_fdw itself.  But I added new
> FDW API named GetForeignJoinPaths() to keep the policy that all
> interface between core and FDW should be in FdwRoutine, instead of
> using hook function.  Now I'm writing document about it, and will post
> it in a day.
> 
> 2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> > 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> >> Let me put some comments in addition to where you're checking now.
> >>
> >> [design issues]
> >> * Cost estimation
> >> Estimation and evaluation of cost for remote join query is not an
> >> obvious issue. In principle, local side cannot determine the cost
> >> to run remote join without remote EXPLAIN, because local side has
> >> no information about JOIN logic applied on the remote side.
> >> Probably, we have to put an assumption for remote join algorithm,
> >> because local planner has no idea about remote planner's choice
> >> unless foreign-join don't take "use_remote_estimate".
> >> I think, it is reasonable assumption (even if it is incorrect) to
> >> calculate remote join cost based on local hash-join algorithm.
> >> If user wants more correct estimation, remote EXPLAIN will make
> >> more reliable cost estimation.
> >
> > Hm, I guess that you chose hash-join as "least-costed join".  In the
> > pgbench model, most combination between two tables generate hash join
> > as cheapest path.  Remote EXPLAIN is very expensive in the context of
> > planning, so it would easily make the plan optimization meaningless.
> > But giving an option to users is good, I agree.
> >
> >>
> >> It also needs a consensus whether cost for remote CPU execution is
> >> equivalent to local CPU. If we think local CPU is rare resource
> >> than remote one, a discount rate will make planner more preferable
> >> to choose remote join than local one
> >
> > Something like cpu_cost_ratio as a new server-level FDW option?
> >
> >>
> >> Once we assume a join algorithm for remote join, unit cost for
> >> remote CPU, we can calculate a cost for foreign join based on
> >> the local join logic plus cost for network translation (maybe
> >> fdw_tuple_cost?).
> >
> > Yes, sum of these costs is the total cost of a remote join.
> >     o fdw_startup_cost
> >     o hash-join cost, estimated as a local join
> >     o fdw_tuple_cost * rows * width
> >
> >> * FDW options
> >> Unlike table scan, FDW options we should refer is unclear.
> >> Table level FDW options are associated with a foreign table as
> >> literal. I think we have two options here:
> >> 1. Foreign-join refers FDW options for foreign-server, but ones
> >>    for foreign-tables are ignored.
> >> 2. Foreign-join is prohibited when both of relations don't have
> >>    identical FDW options.
> >> My preference is 2. Even though N-way foreign join, it ensures
> >> all the tables involved with (N-1)-way foreign join has identical
> >> FDW options, thus it leads we can make N-way foreign join with
> >> all identical FDW options.
> >> One exception is "updatable" flag of postgres_fdw. It does not
> >> make sense on remote join, so I think mixture of updatable and
> >> non-updatable foreign tables should be admitted, however, it is
> >> a decision by FDW driver.
> >>
> >> Probably, above points need to take time for getting consensus.
> >> I'd like to see your opinion prior to editing your patch.
> >
> > postgres_fdw can't push down a join which contains foreign tables on
> > multiple servers, so use_remote_estimate and fdw_startup_cost  are the
> > only FDW options to consider.  So we have options for each option.
> >
> > 1-a. If all foreign tables in the join has identical
> > use_remote_estimate, allow pushing down.
> > 1-b. If any of foreign table in the join has true as
> > use_remote_estimate, use remote estimate.
> >
> > 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> > allow pushing down.
> > 2-b. Always use max value in the join. (cost would be more expensive)
> > 2-c. Always use min value in the join.  (cost would be cheaper)
> >
> > I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> > reasonable cost about startup.
> >
> > I agree about "updatable" option.
> >
> >>
> >> [implementation issues]
> >> The interface does not intend to add new Path/Plan type for each scan
> >> that replaces foreign joins. What postgres_fdw should do is, adding
> >> ForeignPath towards a particular joinrel, then it populates ForeignScan
> >> with remote join query once it got chosen by the planner.
> >
> > That idea is interesting, and make many things simpler.  Please let me consider.
> >
> >>
> >> A few functions added in src/backend/foreign/foreign.c are not
> >> called by anywhere, at this moment.
> >>
> >> create_plan_recurse() is reverted to static. It is needed for custom-
> >> join enhancement, if no other infrastructure can support.
> >
> > I made it back to static because I thought that create_plan_recurse
> > can be called by core before giving control to FDWs.  But I'm not sure
> > it can be applied to custom scans.  I'll recheck that part.
> >
> >
> > --
> > Shigeru HANADA
> 
> 
> 
> --
> Shigeru HANADA
pgsql-hackers by date:
From: Fabien COELHO
Date: 03 March 2015, 08:48:54
Subject: improve pgbench syntax error messages
From: Kouhei Kaigai
Date: 03 March 2015, 09:31:25
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

Previous

Next