Re: PassDownLimitBound for ForeignScan/CustomScan - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: PassDownLimitBound for ForeignScan/CustomScan |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8012245CC@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Re: PassDownLimitBound for ForeignScan/CustomScan (Jeevan Chalke <jeevan.chalke@enterprisedb.com>) |
Responses |
Re: PassDownLimitBound for ForeignScan/CustomScan
|
List | pgsql-hackers |
> On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > > Hello, > > The attached patch adds an optional callback to support special optimization > if ForeignScan/CustomScan are located under the Limit node in plan-tree. > > Our sort node wisely switches the behavior when we can preliminary know > exact number of rows to be produced, because all the Sort node has to > return is the top-k rows when it is located under the Limit node. > It is much lightweight workloads than sorting of entire input rows when > nrows is not small. > > In my case, this information is very useful because GPU can complete its > sorting operations mostly on L1-grade memory if we can preliminary know > the top-k value is enough small and fits to size of the fast memory. > > Probably, it is also valuable for Fujita-san's case because this information > allows to attach "LIMIT k" clause on the remote query of postgres_fdw. > It will reduce amount of the network traffic and remote CPU consumption > once we got support of sort pushdown. > > > > One thing we need to pay attention is cost estimation on the planner stage. > In the existing code, only create_ordered_paths() and > create_merge_append_path() > considers the limit clause for cost estimation of sorting. They use the > 'limit_tuples' of PlannerInfo; we can reference the structure when extension > adds ForeignPath/CustomPath, so I think we don't need a special enhancement > on the planner stage. > Thanks for your comments. > I believe this hook is gets called at execution time. > So to push LIMIT clause like you said above we should use "limit_tuples" at the time > of planning and then use this hook to optimize at runtime, right? > Yes. For more correctness, a valid "limit_tuples" of PlannerInfo is set only when LIMIT clause takes constant values; it is true for most of use case. Then, the hook I added shall be called at execution time for more exact optimization. If FDW/CSP cannot accept uncertain number of rows to generate on planning time, it is not a duty to provide its own path which is optimized for small number of LIMIT clause. > Apart from that, attached patch applies cleanly on latest sources and found no issues > with make or with regressions. > > However this patch is an infrastructure for any possible optimization when > foreign/customscan is under LIMIT. > > So look good to me. > > I quickly tried adding a hook support in postgres_fdw, and it gets called correctly > when we have foreignscan with LIMIT (limit being evaluated on local server). > > So code wise no issue. Also add this hook details in documentation. > OK, I'll try to write up some detailed documentation stuff; not only API specification. Best regards, > > Thanks > > > > Thanks, > -- > NEC Business Creation Division / PG-Strom Project > KaiGai Kohei <kaigai@ak.jp.nec.com> > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > <http://www.postgresql.org/mailpref/pgsql-hackers> > > > > > > > -- > > Jeevan B Chalke > Principal Software Engineer, Product Development > EnterpriseDB Corporation > The Enterprise PostgreSQL Company >
pgsql-hackers by date: