Re: Join push-down support for foreign tables - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Join push-down support for foreign tables |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8010BAB26@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Join push-down support for foreign tables (Shigeru Hanada <shigeru.hanada@gmail.com>) |
Responses |
Re: Join push-down support for foreign tables
|
List | pgsql-hackers |
Hanada-san, I checked the patch, below is the random comments from my side. * Context variables ------------------- Sorry, I might give you a wrong suggestion. The foreign_glob_cxt and deparse_expr_cxt were re-defined as follows: typedef struct foreign_glob_cxt{ PlannerInfo *root; /* global planner state */ - RelOptInfo *foreignrel; /* the foreign relation we are planning + RelOptInfo *outerrel; /* the foreign relation, or outer child + RelOptInfo *innerrel; /* inner child, only set for join */} foreign_glob_cxt; /* @@ -86,9 +89,12 @@ typedef struct foreign_loc_cxttypedef struct deparse_expr_cxt{ PlannerInfo *root; /* global planner state */ - RelOptInfo *foreignrel; /* the foreign relation we are planning + RelOptInfo *outerrel; /* the foreign relation, or outer child + RelOptInfo *innerrel; /* inner child, only set for join */ StringInfo buf; /* output buffer to append to */ List **params_list; /* exprs that will become remote Params + ForeignScan *outerplan; /* outer child's ForeignScan node */ + ForeignScan *innerplan; /* inner child's ForeignScan node */} deparse_expr_cxt; However, the outerrel does not need to have double-meaning. RelOptInfo->reloptkind gives us information whether the target relation is base-relation or join-relation. So, foreign_expr_walker() can be implemented as follows: if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) && var->varlevelsup == 0) { : also, deparseVar() can checks relation type using: if (context->foreignrel->reloptkind == RELOPT_JOINREL) { deparseJoinVar(...); In addition, what we need in deparse_expr_cxt are target-list of outer and inner relation in deparse_expr_cxt. How about to add inner_tlist/outer_tlist instead of innerplan and outerplan in deparse_expr_cxt? The deparseJoinVar() references these fields, but only targetlist. * GetForeignJoinPath method of FDW ---------------------------------- It should be portion of the interface patch, so I added these enhancement of FDW APIs with documentation updates. Please see the newer version of foreign/custom-join interface patch. * enable_foreignjoin parameter ------------------------------ I'm uncertain whether we should have this GUC parameter that affects to all FDW implementation. Rather than a built-in one, my preference is an individual GUC variable defined with DefineCustomBoolVariable(), by postgres_fdw. Pros: It offers user more flexible configuration. Cons: Each FDW has to implement this GUC by itself? * syntax violated query if empty targetlist ------------------------------------------- At least NULL shall be injected if no columns are referenced. Also, add a dummy entry to fdw_ps_tlist to fit slot tuple descriptor. postgres=# explain verbose select NULL from ft1,ft2 where aid=bid; QUERY PLAN ---------------------------------------------------------------------------Foreign Scan (cost=100.00..129.25 rows=2925 width=0) Output: NULL::unknown Remote SQL: SELECT FROM (SELECT bid, NULL FROM public.t2) l (a_0, a_1) INNERJOIN (SELECTaid, NULL FROM public.t1) r (a_0, a_1) ON ((r.a_0 = l.a_0)) * Bug reported by Thom Brown ----------------------------- # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id = countries.idLIMIT 3) x; ERROR: could not open relation with OID 0 Sorry, it was a problem caused by my portion. The patched setrefs.c checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan node is associated with a certain base relation. If *_ps_tlist is valid, it also expects scanrelid == 0. However, things we should check is incorrect. We may have a case with empty *_ps_tlist if remote join expects no columns. So, I adjusted the condition to check scanrelid instead. * make_tuple_from_result_row() call ------------------------------------ The 4th argument (newly added) is referenced without NULL checks, however, store_returning_result() and analyze_row_processor() put NULL on this argument, then it leads segmentation fault. RelationGetDescr(fmstate->rel or astate->rel) should be given on the caller. * regression test crash ------------------------ The query below gets crashed: UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT FROM ft1 WHERE ft1.c1= ft2.c2 AND ft1.c1 % 10 = 9; According to the crash dump, tidin() got a cstring input with unexpected format. I guess column number is incorrectly assigned, but have no clear scenario at this moment. #0 0x00007f9b45a11513 in conversion_error_callback (arg=0x7fffc257ecc0) at postgres_fdw.c:3293 #1 0x00000000008e51d6 in errfinish (dummy=0) at elog.c:436 #2 0x00000000008935cd in tidin (fcinfo=0x7fffc257e8a0) at tid.c:69 /home/kaigai/repo/sepgsql/src/backend/utils/adt/tid.c:69:1734:beg:0x8935cd (gdb) p str $6 = 0x1d17cf7 "foo" Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Shigeru Hanada [mailto:shigeru.hanada@gmail.com] > Sent: Monday, March 02, 2015 9:48 PM > To: Kaigai Kouhei(海外 浩平) > Cc: Robert Haas; PostgreSQL-development > Subject: ##freemail## Re: [HACKERS] Join push-down support for foreign tables > > Attached is the revised/rebased version of the $SUBJECT. > > This patch is based on Kaigai-san's custom/foreign join patch, so > please apply it before this patch. In this version I changed some > points from original postgres_fdw. > > 1) Disabled SELECT clause optimization > ~9.4 postgres_fdw lists only columns actually used in SELECT clause, > but AFAIS it makes SQL generation complex. So I disabled such > optimization and put "NULL" for unnecessary columns in SELECT clause > of remote query. > > 2) Extended deparse context > To allow deparsing based on multiple source relations, I added some > members to context structure. They are unnecessary for simple query > with single foreign table, but IMO it should be integrated. > > With Kaigai-san's advise, changes for supporting foreign join on > postgres_fdw is minimized into postgres_fdw itself. But I added new > FDW API named GetForeignJoinPaths() to keep the policy that all > interface between core and FDW should be in FdwRoutine, instead of > using hook function. Now I'm writing document about it, and will post > it in a day. > > 2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>: > > 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>: > >> Let me put some comments in addition to where you're checking now. > >> > >> [design issues] > >> * Cost estimation > >> Estimation and evaluation of cost for remote join query is not an > >> obvious issue. In principle, local side cannot determine the cost > >> to run remote join without remote EXPLAIN, because local side has > >> no information about JOIN logic applied on the remote side. > >> Probably, we have to put an assumption for remote join algorithm, > >> because local planner has no idea about remote planner's choice > >> unless foreign-join don't take "use_remote_estimate". > >> I think, it is reasonable assumption (even if it is incorrect) to > >> calculate remote join cost based on local hash-join algorithm. > >> If user wants more correct estimation, remote EXPLAIN will make > >> more reliable cost estimation. > > > > Hm, I guess that you chose hash-join as "least-costed join". In the > > pgbench model, most combination between two tables generate hash join > > as cheapest path. Remote EXPLAIN is very expensive in the context of > > planning, so it would easily make the plan optimization meaningless. > > But giving an option to users is good, I agree. > > > >> > >> It also needs a consensus whether cost for remote CPU execution is > >> equivalent to local CPU. If we think local CPU is rare resource > >> than remote one, a discount rate will make planner more preferable > >> to choose remote join than local one > > > > Something like cpu_cost_ratio as a new server-level FDW option? > > > >> > >> Once we assume a join algorithm for remote join, unit cost for > >> remote CPU, we can calculate a cost for foreign join based on > >> the local join logic plus cost for network translation (maybe > >> fdw_tuple_cost?). > > > > Yes, sum of these costs is the total cost of a remote join. > > o fdw_startup_cost > > o hash-join cost, estimated as a local join > > o fdw_tuple_cost * rows * width > > > >> * FDW options > >> Unlike table scan, FDW options we should refer is unclear. > >> Table level FDW options are associated with a foreign table as > >> literal. I think we have two options here: > >> 1. Foreign-join refers FDW options for foreign-server, but ones > >> for foreign-tables are ignored. > >> 2. Foreign-join is prohibited when both of relations don't have > >> identical FDW options. > >> My preference is 2. Even though N-way foreign join, it ensures > >> all the tables involved with (N-1)-way foreign join has identical > >> FDW options, thus it leads we can make N-way foreign join with > >> all identical FDW options. > >> One exception is "updatable" flag of postgres_fdw. It does not > >> make sense on remote join, so I think mixture of updatable and > >> non-updatable foreign tables should be admitted, however, it is > >> a decision by FDW driver. > >> > >> Probably, above points need to take time for getting consensus. > >> I'd like to see your opinion prior to editing your patch. > > > > postgres_fdw can't push down a join which contains foreign tables on > > multiple servers, so use_remote_estimate and fdw_startup_cost are the > > only FDW options to consider. So we have options for each option. > > > > 1-a. If all foreign tables in the join has identical > > use_remote_estimate, allow pushing down. > > 1-b. If any of foreign table in the join has true as > > use_remote_estimate, use remote estimate. > > > > 2-a. If all foreign tables in the join has identical fdw_startup_cost, > > allow pushing down. > > 2-b. Always use max value in the join. (cost would be more expensive) > > 2-c. Always use min value in the join. (cost would be cheaper) > > > > I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have > > reasonable cost about startup. > > > > I agree about "updatable" option. > > > >> > >> [implementation issues] > >> The interface does not intend to add new Path/Plan type for each scan > >> that replaces foreign joins. What postgres_fdw should do is, adding > >> ForeignPath towards a particular joinrel, then it populates ForeignScan > >> with remote join query once it got chosen by the planner. > > > > That idea is interesting, and make many things simpler. Please let me consider. > > > >> > >> A few functions added in src/backend/foreign/foreign.c are not > >> called by anywhere, at this moment. > >> > >> create_plan_recurse() is reverted to static. It is needed for custom- > >> join enhancement, if no other infrastructure can support. > > > > I made it back to static because I thought that create_plan_recurse > > can be called by core before giving control to FDWs. But I'm not sure > > it can be applied to custom scans. I'll recheck that part. > > > > > > -- > > Shigeru HANADA > > > > -- > Shigeru HANADA
pgsql-hackers by date: