Re: [v9.5] Custom Plan API - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: [v9.5] Custom Plan API |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8F9E93E@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Re: [v9.5] Custom Plan API (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: [v9.5] Custom Plan API
|
List | pgsql-hackers |
> On 7 May 2014 02:05, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > Prior to the development cycle towards v9.5, I'd like to reopen the > > discussion of custom-plan interface. Even though we had lots of > > discussion during the last three commit-fests, several issues are > > still under discussion. So, I'd like to clarify direction of the > > implementation, prior to the first commit-fest. > > > > (1) DDL support and system catalog > > > > Simon suggested that DDL command should be supported to track custom- > > plan providers being installed, and to avoid nonsense hook calls if it > > is an obvious case that custom-plan provider can help. It also makes > > sense to give a chance to load extensions once installed. > > (In the previous design, I assumed modules are loaded by LOAD command > > or *_preload_libraries parameters). > > > > I tried to implement the following syntax: > > > > CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>; > > Thank you for exploring that thought and leading the way on this research. > I've been thinking about this also. > > What I think we need is a declarative form that expresses the linkage between > base table(s) and a related data structures that can be used to optimize > a query, while still providing accurate results. > > In other DBMS, we have concepts such as a JoinIndex or a MatView which allow > some kind of lookaside behaviour. Just for clarity, a concrete example is > Oracle's Materialized Views which can be set using ENABLE QUERY REWRITE > so that the MatView can be used as an alternative path for a query. We do > already have this concept in PostgreSQL, where an index can be used to > perform an IndexOnlyScan rather than accessing the heap itself. > > We have considerable evidence that the idea of alternate data structures > results in performance gains. > * KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom > * http://www.postgresql.org/message-id/52C59858.9090500@garret.ru > * http://citusdata.github.io/cstore_fdw/ > * University of Manchester - exploring GPUs as part of the AXLE project > * Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE > project > * Some other authors have also cited gains using GPU technology in databases > > So I would like to have a mechanism that provides a *generic* Lookaside > for a table or foreign table. > > Tom and Kevin have previously expressed that MatViews would represent a > planning problem, in the general case. One way to solve that planning issue > is to link structures directly together, in the same way that an index and > a table are linked. We can then process the lookaside in the same way we > handle a partial index - check prerequisites and if usable, calculate a > cost for the alternate path. > We need not add planning time other than to the tables that might benefit > from that. > > Roughly, I'm thinking of this... > > CREATE LOOKASIDE ON foo > TO foo_mat_view; > > and also this... > > CREATE LOOKASIDE ON foo > TO foo_as_a_foreign_table /* e.g. PGStrom */ > > This would allow the planner to consider alternate plans for foo_mv during > set_plain_rel_pathlist() similarly to the way it considers index paths, > in one of the common cases that the mat view covers just one table. > > This concept is similar to ENABLE QUERY REWRITE in Oracle, but this thought > goes much further, to include any generic user-defined data structure or > foreign table. > Let me clarify. This mechanism allows to add alternative scan/join paths including built-in ones, not only custom enhanced plan/exec node, isn't it? Probably, it is a variation of above proposition if we install a handler function that proposes built-in path nodes towards the request for scan/join. > Do we need this? For MVs, we *might* be able to deduce that the MV is > rewritable for "foo", but that is not deducible for Foreign Tables, by > current definition, so I prefer the explicit definition of objects that > are linked - since doing this for indexes is already familiar to people. > > Having an explicit linkage between data structures allows us to enhance > an existing application by transaparently adding new structures, just as > we already do with indexes. Specifically, that we allow more than one > lookaside structure on any one table. > Not only alternative data structure, alternative method to scan/join towards same data structure is also important, isn't it? > Forget the exact name, thats not important. But I think the requirements > here are... > > * Explicit definition that we are attaching an alternate path onto a table > (conceptually similar to adding an index) > I think the syntax allows "tables", not only a particular table. It will inform the core planner this lookaside/customplan (name is not important, anyway this feature...) can provide alternative path towards the set of relations; being considered. So, it allows to reduce number of function calls on planner stage. > * Ability to check that the alternate path is viable (similar to the way > we validate use of partial indexes prior to usage) > Checks on columns(SELECT), rows(WHERE), aggregations(GROUP) > I never deny it... but do you think this feature from the initial version?? > * Ability to consider access cost for both normal table and alternate path > (like an index) - this allows the alternate path to *not* be chosen when > we are performing some operation that is sub-optimal (for whatever reason). > It is an usual job of existing planner, isn't it? > * There may be some need to define operator classes that are implemented > via the alternate path > > which works for single tables, but a later requirement would then be > > * allows the join of one or more tables to be replaced with a single lookaside > It's higher priority for me, and I guess it is same in MatView usage. > Hopefully, we won't need a "Custom Plan" at all, just the ability to > lookaside when useful. > Probably, lookaside is a special case in the scenario that custom-plan can provide. I also think it is an attractive use case if we can redirect a particular complicated join into a MatView reference. So, it makes sense to bundle a handler function to replace join by matview reference. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
pgsql-hackers by date: