Thread: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)
This attached patch adds a code example of custom-scan interface. This custom-scan provider ("ctidscan") performs almost same as built-in SeqScan plan, but can produce same results with less page scan in case when qualifier of relation scan has inequality operators (>, >=, < or <=) on "ctid" system column, therefore, range to be scanned is less than case of full-table scan. Below is an example: postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid > '(10,0)'::tid AND b like '%abc%'; QUERY PLAN ------------------------------------------------------------- Seq Scan on t1 (cost=0.00..10.00 rows=1 width=37) Filter: ((ctid > '(10,0)'::tid) AND (b ~~ '%abc%'::text)) (2 rows) Once ctidscan is loaded, it can provide cheaper cost to scan the "t1" table than SeqScan, so planner chooses the custom logic. postgres=# LOAD 'ctidscan'; LOAD postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid > '(10,0)'::tid AND b like '%abc%'; QUERY PLAN ----------------------------------------------------------------- Custom Scan (ctidscan) on t1 (cost=0.00..5.00 rows=1 width=37) Filter: ((ctid > '(10,0)'::tid) AND (b ~~ '%abc%'::text)) ctid quals: (ctid > '(10,0)'::tid) (3 rows) Like other query execution logic, it also provides "enable_ctidscan" parameter to turn on/off this feature. I'm not certain whether we should have this functionality in contrib from the perspective of workload that can help, but its major worth is for an example of custom-scan interface. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Robert Haas [mailto:robertmhaas@gmail.com] > Sent: Thursday, December 11, 2014 11:58 AM > To: Simon Riggs > Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Tom Lane; Alvaro > Herrera; Shigeru Hanada; Stephen Frost; Andres Freund; PgHacker; Jim > Mlodgenski; Peter Eisentraut > Subject: ##freemail## Re: [HACKERS] [v9.5] Custom Plan API > > On Tue, Dec 9, 2014 at 3:24 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > Feedback I am receiving is that the API is unusable. That could be > > because it is impenetrable, or because it is unusable. I'm not sure it > > matters which. > > It would be nice to here what someone is trying to use it for and what problems > that person is encountering. Without that, it's pretty much impossible > for anyone to fix anything. > > As for sample code, KaiGai had a working example, which of course got broken > when Tom changed the API, but it didn't look to me like Tom's changes would > have made anything impossible that was possible before. > I'm frankly kind of astonished by the tenor of this entire conversation; > there is certainly plenty of code in the backend that is less > self-documenting than this is; and KaiGai did already put up a wiki page > with documentation as you requested. From his response, it sounds like > he has updated the ctidscan code, too. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL > Company
Attachment
On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > I'm not certain whether we should have this functionality in contrib > from the perspective of workload that can help, but its major worth is > for an example of custom-scan interface. worker_spi is now in src/test/modules. We may add it there as well, no? -- Michael
> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > I'm not certain whether we should have this functionality in contrib > > from the perspective of workload that can help, but its major worth is > > for an example of custom-scan interface. > worker_spi is now in src/test/modules. We may add it there as well, no? > Hmm, it makes sense for me. Does it also make sense to add a test-case to the core regression test cases? Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
> > On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> > wrote: > > > I'm not certain whether we should have this functionality in contrib > > > from the perspective of workload that can help, but its major worth > > > is for an example of custom-scan interface. > > worker_spi is now in src/test/modules. We may add it there as well, no? > > > Hmm, it makes sense for me. Does it also make sense to add a test-case to > the core regression test cases? > The attached patch adds ctidscan module at test/module instead of contrib. Basic portion is not changed from the previous post, but file locations and test cases in regression test are changed. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachment
On 12/24/14, 7:54 PM, Kouhei Kaigai wrote: >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> >> wrote: >>>> I'm not certain whether we should have this functionality in contrib >>>> from the perspective of workload that can help, but its major worth >>>> is for an example of custom-scan interface. >>> worker_spi is now in src/test/modules. We may add it there as well, no? >>> >> Hmm, it makes sense for me. Does it also make sense to add a test-case to >> the core regression test cases? >> > The attached patch adds ctidscan module at test/module instead of contrib. > Basic portion is not changed from the previous post, but file locations > and test cases in regression test are changed. First, I'm glad for this. It will be VERY valuable for anyone trying to clean up the end of a majorly bloated table. Here's a partial review... +++ b/src/test/modules/ctidscan/ctidscan.c +/* missing declaration in pg_proc.h */ +#ifndef TIDGreaterOperator +#define TIDGreaterOperator 2800 ... If we're calling that out here, should we have a corresponding comment in pg_proc.h, in case these ever get renumbered? +CTidQualFromExpr(Node *expr, int varno) ... + if (IsCTIDVar(arg1, varno)) + other = arg2; + else if (IsCTIDVar(arg2, varno)) + other = arg1; + else + return NULL; + if (exprType(other) != TIDOID) + return NULL; /* should not happen */ + /* The other argument must be a pseudoconstant */ + if (!is_pseudo_constant_clause(other)) + return NULL; I think this needs some additional blank lines... + if (IsCTIDVar(arg1, varno)) + other = arg2; + else if (IsCTIDVar(arg2, varno)) + other = arg1; + else + return NULL; + + if (exprType(other) != TIDOID) + return NULL; /* should not happen */ + + /* The other argument must be a pseudoconstant */ + if (!is_pseudo_constant_clause(other)) + return NULL; + * CTidEstimateCosts + * + * It estimates cost to scan the target relation according to the given + * restriction clauses. Its logic to scan relations are almost same as + * SeqScan doing, because it uses regular heap_getnext(), except for + * the number of tuples to be scanned if restriction clauses work well. <grammar>That should read "same as what SeqScan is doing"... however, what actual function are you talking about? I couldn'tfind SeqScanEstimateCosts (or anything ending EstimateCosts). BTW, there's other grammar issues but it'd be best to handle those all at once after all the code stuff is done. + opno = get_commutator(op->opno); What happens if there's no commutator? Perhaps best to Assert(opno != InvalidOid) at the end of that if block. Though, it seems things will fall apart anyway if ctid_quals isn't exactly what we're expecting; I don't know if that's OKor not. + /* disk costs --- assume each tuple on a different page */ + run_cost += spc_random_page_cost * ntuples; Isn't that extremely pessimistic? I'm not familiar enough with the custom-scan stuff to really comment past this point, and I could certainly be missing somedetails about planning and execution. I do have some concerns about the regression test, but perhaps I'm just being paranoid: - The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN. - Nothing tests the case of '...'::tid op ctid; only lvalue cases are tested. Also, it seems that we don't handle joining on a ctid qual... is that intentional? I know that sounds silly, but you'd probablywant to do that if you're trying to move tuples off the end of a bloated table. You could work around it by constructinga dynamic SQL string, but it'd be easier to do something like: UPDATE table1 SET ... WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE oid = 'table1'::regclass) ; in some kind of loop. Obviously better to only handle what you already are then not get this in at all though... :) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim, Thanks for your reviewing the patch. The attached patch is revised one according to your suggestion, and also includes bug fix I could found. * Definitions of TIDXXXXOperator was moved to pg_operator.h as other operator doing. * Support the case of "ctid (operator) Param" expression. * Add checks if commutator of TID was not found. * Qualifiers gets extracted on plan stage, not path stage. * Adjust cost estimation logic to fit SeqScan manner. * Add some new test cases for regression test. > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote: > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai > >>> <kaigai@ak.jp.nec.com> > >> wrote: > >>>> I'm not certain whether we should have this functionality in > >>>> contrib from the perspective of workload that can help, but its > >>>> major worth is for an example of custom-scan interface. > >>> worker_spi is now in src/test/modules. We may add it there as well, > no? > >>> > >> Hmm, it makes sense for me. Does it also make sense to add a > >> test-case to the core regression test cases? > >> > > The attached patch adds ctidscan module at test/module instead of contrib. > > Basic portion is not changed from the previous post, but file > > locations and test cases in regression test are changed. > > First, I'm glad for this. It will be VERY valuable for anyone trying to > clean up the end of a majorly bloated table. > > Here's a partial review... > > +++ b/src/test/modules/ctidscan/ctidscan.c > > +/* missing declaration in pg_proc.h */ > +#ifndef TIDGreaterOperator > +#define TIDGreaterOperator 2800 > ... > If we're calling that out here, should we have a corresponding comment in > pg_proc.h, in case these ever get renumbered? > It was a quick hack when I moved this module out of the tree. Yep, I should revert when I submit this patch again. > +CTidQualFromExpr(Node *expr, int varno) > ... > + if (IsCTIDVar(arg1, varno)) > + other = arg2; > + else if (IsCTIDVar(arg2, varno)) > + other = arg1; > + else > + return NULL; > + if (exprType(other) != TIDOID) > + return NULL; /* should not happen */ > + /* The other argument must be a pseudoconstant */ > + if (!is_pseudo_constant_clause(other)) > + return NULL; > > I think this needs some additional blank lines... > OK. And, I also noticed the coding style around this function is different from other built-in plans, so I redefined the role of this function just to check whether the supplied RestrictInfo is OpExpr that involves TID inequality operator, or not. Expression node shall be extracted when Plan node is created from Path node. > + if (IsCTIDVar(arg1, varno)) > + other = arg2; > + else if (IsCTIDVar(arg2, varno)) > + other = arg1; > + else > + return NULL; > + > + if (exprType(other) != TIDOID) > + return NULL; /* should not happen */ > + > + /* The other argument must be a pseudoconstant */ > + if (!is_pseudo_constant_clause(other)) > + return NULL; > > + * CTidEstimateCosts > + * > + * It estimates cost to scan the target relation according to the given > + * restriction clauses. Its logic to scan relations are almost same as > + * SeqScan doing, because it uses regular heap_getnext(), except for > + * the number of tuples to be scanned if restriction clauses work well. > > <grammar>That should read "same as what SeqScan is doing"... however, what > actual function are you talking about? I couldn't find SeqScanEstimateCosts > (or anything ending EstimateCosts). > It is cost_seqscan(). But I don't put a raw function name in the source code comments in other portion, because nobody can guarantee it is up to date in the future... > BTW, there's other grammar issues but it'd be best to handle those all at > once after all the code stuff is done. > Yep. Help by native English speaker is very helpful for us. > + opno = get_commutator(op->opno); > What happens if there's no commutator? Perhaps best to Assert(opno != > InvalidOid) at the end of that if block. > Usually, commutator operator of TID is defined on initdb time, however, nobody can guarantee a mad superuser doesn't drop it. So, I added elog(ERROR,...) here. > Though, it seems things will fall apart anyway if ctid_quals isn't exactly > what we're expecting; I don't know if that's OK or not. > No worry, it was already checked on planning stage. > + /* disk costs --- assume each tuple on a different page */ > + run_cost += spc_random_page_cost * ntuples; > Isn't that extremely pessimistic? > Probably. I follow the manner of SeqScan. > I'm not familiar enough with the custom-scan stuff to really comment past > this point, and I could certainly be missing some details about planning > and execution. > > I do have some concerns about the regression test, but perhaps I'm just > being paranoid: > > - The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN. > - Nothing tests the case of '...'::tid op ctid; only lvalue cases are tested. > Both are added. > Also, it seems that we don't handle joining on a ctid qual... is that > intentional? > Yep. This module does not intend to handle joining, because custom-/ foreign-join interface is still under the discussion. https://commitfest.postgresql.org/action/patch_view?id=1653 Hanada-san makes an enhancement of postgres_fdw on this enhancement. https://commitfest.postgresql.org/action/patch_view?id=1674 > I know that sounds silly, but you'd probably want to do that > if you're trying to move tuples off the end of a bloated table. You could > work around it by constructing a dynamic SQL string, but it'd be easier > to do something like: > > UPDATE table1 SET ... > WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE oid > = 'table1'::regclass) ; > This example noticed me that the previous version didn't support the case of "ctid (operator) Param". So, I enhanced to support above case, and update the regression test also. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachment
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai<span dir="ltr"><<a href="mailto:kaigai@ak.jp.nec.com" target="_blank">kaigai@ak.jp.nec.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Jim, Thanksfor your reviewing the patch.<br /><br /> The attached patch is revised one according to your suggestion,<br /> andalso includes bug fix I could found.<br /><br /> * Definitions of TIDXXXXOperator was moved to pg_operator.h<br /> asother operator doing.<br /> * Support the case of "ctid (operator) Param" expression.<br /> * Add checks if commutatorof TID was not found.<br /> * Qualifiers gets extracted on plan stage, not path stage.<br /> * Adjust cost estimationlogic to fit SeqScan manner.<br /> * Add some new test cases for regression test.<br /><span class=""><br /> >On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:<br /> > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai<br/> > >>> <<a href="mailto:kaigai@ak.jp.nec.com">kaigai@ak.jp.nec.com</a>><br /> > >>wrote:<br /> > >>>> I'm not certain whether we should have this functionality in<br /> > >>>>contrib from the perspective of workload that can help, but its<br /> > >>>> major worth isfor an example of custom-scan interface.<br /> > >>> worker_spi is now in src/test/modules. We may add it thereas well,<br /> > no?<br /> > >>><br /> > >> Hmm, it makes sense for me. Does it also make senseto add a<br /> > >> test-case to the core regression test cases?<br /> > >><br /> > > The attachedpatch adds ctidscan module at test/module instead of contrib.<br /> > > Basic portion is not changed from theprevious post, but file<br /> > > locations and test cases in regression test are changed.<br /> ><br /> >First, I'm glad for this. It will be VERY valuable for anyone trying to<br /> > clean up the end of a majorly bloatedtable.<br /> ><br /> > Here's a partial review...<br /> ><br /> > +++ b/src/test/modules/ctidscan/ctidscan.c<br/> ><br /> > +/* missing declaration in pg_proc.h */<br /> > +#ifndef TIDGreaterOperator<br/> > +#define TIDGreaterOperator 2800<br /> > ...<br /> > If we're calling that outhere, should we have a corresponding comment in<br /> > pg_proc.h, in case these ever get renumbered?<br /> ><br/></span>It was a quick hack when I moved this module out of the tree.<br /> Yep, I should revert when I submit thispatch again.<br /><span class=""><br /> > +CTidQualFromExpr(Node *expr, int varno)<br /> > ...<br /> > + if (IsCTIDVar(arg1, varno))<br /> > + other = arg2;<br /> > + else if (IsCTIDVar(arg2,varno))<br /> > + other = arg1;<br /> > + else<br /> > + return NULL;<br /> > + if (exprType(other) != TIDOID)<br /> > + returnNULL; /* should not happen */<br /> > + /* The other argument must be a pseudoconstant */<br />> + if (!is_pseudo_constant_clause(other))<br /> > + return NULL;<br /> ><br />> I think this needs some additional blank lines...<br /> ><br /></span>OK. And, I also noticed the coding stylearound this function is<br /> different from other built-in plans, so I redefined the role of<br /> this function justto check whether the supplied RestrictInfo is<br /> OpExpr that involves TID inequality operator, or not.<br /><br />Expression node shall be extracted when Plan node is created<br /> from Path node.<br /><span class=""><br /> > + if (IsCTIDVar(arg1, varno))<br /> > + other = arg2;<br /> > + else if (IsCTIDVar(arg2,varno))<br /> > + other = arg1;<br /> > + else<br /> > + return NULL;<br /> > +<br /> > + if (exprType(other) != TIDOID)<br /> > + return NULL; /* should not happen */<br /> > +<br /> > + /* The other argument must be a pseudoconstant*/<br /> > + if (!is_pseudo_constant_clause(other))<br /> > + returnNULL;<br /> ><br /> > + * CTidEstimateCosts<br /> > + *<br /> > + * It estimates cost to scan the targetrelation according to the given<br /> > + * restriction clauses. Its logic to scan relations are almost same as<br/> > + * SeqScan doing, because it uses regular heap_getnext(), except for<br /> > + * the number of tuples tobe scanned if restriction clauses work well.<br /> ><br /> > <grammar>That should read "same as what SeqScanis doing"... however, what<br /> > actual function are you talking about? I couldn't find SeqScanEstimateCosts<br/> > (or anything ending EstimateCosts).<br /> ><br /></span>It is cost_seqscan(). But I don'tput a raw function name in the source<br /> code comments in other portion, because nobody can guarantee it is up to<br/> date in the future...<br /><span class=""><br /> > BTW, there's other grammar issues but it'd be best to handlethose all at<br /> > once after all the code stuff is done.<br /> ><br /></span>Yep. Help by native English speakeris very helpful for us.<br /><span class=""><br /> > + opno = get_commutator(op->opno);<br/> > What happens if there's no commutator? Perhaps best to Assert(opno !=<br /> > InvalidOid)at the end of that if block.<br /> ><br /></span>Usually, commutator operator of TID is defined on initdb time,however,<br /> nobody can guarantee a mad superuser doesn't drop it.<br /> So, I added elog(ERROR,...) here.<br /><spanclass=""><br /><br /> > Though, it seems things will fall apart anyway if ctid_quals isn't exactly<br /> > whatwe're expecting; I don't know if that's OK or not.<br /> ><br /></span>No worry, it was already checked on planningstage.<br /><span class=""><br /> > + /* disk costs --- assume each tuple on a different page */<br /> >+ run_cost += spc_random_page_cost * ntuples;<br /> > Isn't that extremely pessimistic?<br /> ><br /></span>Probably.I follow the manner of SeqScan.<br /><span class=""><br /> > I'm not familiar enough with the custom-scanstuff to really comment past<br /> > this point, and I could certainly be missing some details about planning<br/> > and execution.<br /> ><br /> > I do have some concerns about the regression test, but perhaps I'mjust<br /> > being paranoid:<br /> ><br /> > - The EXPLAIN tests don't cover >. They do cover <= and >=via BETWEEN.<br /> > - Nothing tests the case of '...'::tid op ctid; only lvalue cases are tested.<br /> ><br/></span>Both are added.<br /><span class=""><br /> > Also, it seems that we don't handle joining on a ctid qual...is that<br /> > intentional?<br /> ><br /></span>Yep. This module does not intend to handle joining, becausecustom-/<br /> foreign-join interface is still under the discussion.<br /> <a href="https://commitfest.postgresql.org/action/patch_view?id=1653" target="_blank">https://commitfest.postgresql.org/action/patch_view?id=1653</a><br/><br /> Hanada-san makes an enhancementof postgres_fdw on this enhancement.<br /> <a href="https://commitfest.postgresql.org/action/patch_view?id=1674" target="_blank">https://commitfest.postgresql.org/action/patch_view?id=1674</a><br/><span class=""><br /> > I know thatsounds silly, but you'd probably want to do that<br /> > if you're trying to move tuples off the end of a bloatedtable. You could<br /> > work around it by constructing a dynamic SQL string, but it'd be easier<br /> > todo something like:<br /> ><br /> > UPDATE table1 SET ...<br /> > WHERE ctid >= (SELECT '(' || relpages ||',0)' FROM pg_class WHERE oid<br /> > = 'table1'::regclass) ;<br /> ><br /></span>This example noticed me that theprevious version didn't support the case<br /> of "ctid (operator) Param".<br /> So, I enhanced to support above case,and update the regression test also.<br /></blockquote></div><br /></div><div class="gmail_extra">Moved this patch tonext CF 2015-02 because of lack of review(ers).<br />-- <br /><div class="gmail_signature">Michael<br /></div></div></div>
Folks, > Moved this patch to next CF 2015-02 because of lack of review(ers). > Do we still need this patch as contrib module? It was originally required it as example of custom-scan interface last summer, however, here was no strong requirement after that, then, it was bumped to v9.6 development cycle. If somebody still needs it, I'll rebase and adjust the patch towards the latest custom-scan interface. However, I cannot be motivated for the feature nobody wants. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier > Sent: Friday, February 13, 2015 4:41 PM > To: Kaigai Kouhei(海外 浩平) > Cc: Jim Nasby; Robert Haas; Simon Riggs; Kohei KaiGai; PgHacker > Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom > Plan API) > > > > On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > > Jim, Thanks for your reviewing the patch. > > The attached patch is revised one according to your suggestion, > and also includes bug fix I could found. > > * Definitions of TIDXXXXOperator was moved to pg_operator.h > as other operator doing. > * Support the case of "ctid (operator) Param" expression. > * Add checks if commutator of TID was not found. > * Qualifiers gets extracted on plan stage, not path stage. > * Adjust cost estimation logic to fit SeqScan manner. > * Add some new test cases for regression test. > > > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote: > > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai > > >>> <kaigai@ak.jp.nec.com> > > >> wrote: > > >>>> I'm not certain whether we should have this functionality in > > >>>> contrib from the perspective of workload that can help, but its > > >>>> major worth is for an example of custom-scan interface. > > >>> worker_spi is now in src/test/modules. We may add it there as well, > > no? > > >>> > > >> Hmm, it makes sense for me. Does it also make sense to add a > > >> test-case to the core regression test cases? > > >> > > > The attached patch adds ctidscan module at test/module instead of > contrib. > > > Basic portion is not changed from the previous post, but file > > > locations and test cases in regression test are changed. > > > > First, I'm glad for this. It will be VERY valuable for anyone trying > to > > clean up the end of a majorly bloated table. > > > > Here's a partial review... > > > > +++ b/src/test/modules/ctidscan/ctidscan.c > > > > +/* missing declaration in pg_proc.h */ > > +#ifndef TIDGreaterOperator > > +#define TIDGreaterOperator 2800 > > ... > > If we're calling that out here, should we have a corresponding comment > in > > pg_proc.h, in case these ever get renumbered? > > > It was a quick hack when I moved this module out of the tree. > Yep, I should revert when I submit this patch again. > > > +CTidQualFromExpr(Node *expr, int varno) > > ... > > + if (IsCTIDVar(arg1, varno)) > > + other = arg2; > > + else if (IsCTIDVar(arg2, varno)) > > + other = arg1; > > + else > > + return NULL; > > + if (exprType(other) != TIDOID) > > + return NULL; /* should not happen */ > > + /* The other argument must be a pseudoconstant */ > > + if (!is_pseudo_constant_clause(other)) > > + return NULL; > > > > I think this needs some additional blank lines... > > > OK. And, I also noticed the coding style around this function is > different from other built-in plans, so I redefined the role of > this function just to check whether the supplied RestrictInfo is > OpExpr that involves TID inequality operator, or not. > > Expression node shall be extracted when Plan node is created > from Path node. > > > + if (IsCTIDVar(arg1, varno)) > > + other = arg2; > > + else if (IsCTIDVar(arg2, varno)) > > + other = arg1; > > + else > > + return NULL; > > + > > + if (exprType(other) != TIDOID) > > + return NULL; /* should not happen */ > > + > > + /* The other argument must be a pseudoconstant */ > > + if (!is_pseudo_constant_clause(other)) > > + return NULL; > > > > + * CTidEstimateCosts > > + * > > + * It estimates cost to scan the target relation according to the given > > + * restriction clauses. Its logic to scan relations are almost same > as > > + * SeqScan doing, because it uses regular heap_getnext(), except for > > + * the number of tuples to be scanned if restriction clauses work well. > > > > <grammar>That should read "same as what SeqScan is doing"... however, > what > > actual function are you talking about? I couldn't find > SeqScanEstimateCosts > > (or anything ending EstimateCosts). > > > It is cost_seqscan(). But I don't put a raw function name in the source > code comments in other portion, because nobody can guarantee it is up > to > date in the future... > > > BTW, there's other grammar issues but it'd be best to handle those all > at > > once after all the code stuff is done. > > > Yep. Help by native English speaker is very helpful for us. > > > + opno = get_commutator(op->opno); > > What happens if there's no commutator? Perhaps best to Assert(opno != > > InvalidOid) at the end of that if block. > > > Usually, commutator operator of TID is defined on initdb time, however, > nobody can guarantee a mad superuser doesn't drop it. > So, I added elog(ERROR,...) here. > > > > Though, it seems things will fall apart anyway if ctid_quals isn't > exactly > > what we're expecting; I don't know if that's OK or not. > > > No worry, it was already checked on planning stage. > > > + /* disk costs --- assume each tuple on a different page */ > > + run_cost += spc_random_page_cost * ntuples; > > Isn't that extremely pessimistic? > > > Probably. I follow the manner of SeqScan. > > > I'm not familiar enough with the custom-scan stuff to really comment > past > > this point, and I could certainly be missing some details about planning > > and execution. > > > > I do have some concerns about the regression test, but perhaps I'm just > > being paranoid: > > > > - The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN. > > - Nothing tests the case of '...'::tid op ctid; only lvalue cases are > tested. > > > Both are added. > > > Also, it seems that we don't handle joining on a ctid qual... is that > > intentional? > > > Yep. This module does not intend to handle joining, because custom-/ > foreign-join interface is still under the discussion. > https://commitfest.postgresql.org/action/patch_view?id=1653 > > Hanada-san makes an enhancement of postgres_fdw on this enhancement. > https://commitfest.postgresql.org/action/patch_view?id=1674 > > > I know that sounds silly, but you'd probably want to do that > > if you're trying to move tuples off the end of a bloated table. You > could > > work around it by constructing a dynamic SQL string, but it'd be easier > > to do something like: > > > > UPDATE table1 SET ... > > WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE > oid > > = 'table1'::regclass) ; > > > This example noticed me that the previous version didn't support the case > of "ctid (operator) Param". > So, I enhanced to support above case, and update the regression test also. > > > > Moved this patch to next CF 2015-02 because of lack of review(ers). > -- > > Michael
Re: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)
From
Heikki Linnakangas
Date:
On 07/02/2015 08:21 AM, Kouhei Kaigai wrote: > Folks, > >> Moved this patch to next CF 2015-02 because of lack of review(ers). >> > Do we still need this patch as contrib module? > It was originally required it as example of custom-scan interface last > summer, however, here was no strong requirement after that, then, it > was bumped to v9.6 development cycle. > > If somebody still needs it, I'll rebase and adjust the patch towards > the latest custom-scan interface. However, I cannot be motivated for > the feature nobody wants. Robert, can you weigh in on this? Do we currently have anything in the tree that tests the Custom Scan interface? If not, would this be helpful for that purpose? - Heikki
On Fri, Jul 10, 2015 at 8:55 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> If somebody still needs it, I'll rebase and adjust the patch towards >> the latest custom-scan interface. However, I cannot be motivated for >> the feature nobody wants. > > Robert, can you weigh in on this? Do we currently have anything in the > tree that tests the Custom Scan interface? If not, would this be helpful > for that purpose? We don't have anything that currently tests the Custom Scan interface in the tree. The question is how important that is, and whether it's worth having what's basically a toy implementation just to demonstrate that the feature can work. If so, I think ctidscan is as good a toy example as any; in the interest of full disclosure, I was the one who suggested it in the first place. But I am not entirely sure it's a good idea to saddle ourselves with that maintenance effort. It would be a lot more interesting if we had an example that figured to be generally useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > We don't have anything that currently tests the Custom Scan interface > in the tree. The question is how important that is, and whether it's > worth having what's basically a toy implementation just to demonstrate > that the feature can work. If so, I think ctidscan is as good a toy > example as any; in the interest of full disclosure, I was the one who > suggested it in the first place. But I am not entirely sure it's a > good idea to saddle ourselves with that maintenance effort. It would > be a lot more interesting if we had an example that figured to be > generally useful. As a general principle, I think it's a good idea to have a module that's mostly just a skeleton that guides people into writing something real to use whatever API is being tested. It needs to be simple enough that not much need to be deleted when writing the real thing, and complex enough to cover the parts that need covering. If whatever replaces ctidscan is too complex, it will not serve that purpose. My guess is that something whose only purpose is to test the custom scan interface for coverage purposes can be simpler than this module. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> We don't have anything that currently tests the Custom Scan interface >> in the tree. The question is how important that is, and whether it's >> worth having what's basically a toy implementation just to demonstrate >> that the feature can work. If so, I think ctidscan is as good a toy >> example as any; in the interest of full disclosure, I was the one who >> suggested it in the first place. But I am not entirely sure it's a >> good idea to saddle ourselves with that maintenance effort. It would >> be a lot more interesting if we had an example that figured to be >> generally useful. > > As a general principle, I think it's a good idea to have a module that's > mostly just a skeleton that guides people into writing something real to > use whatever API is being tested. It needs to be simple enough that not > much need to be deleted when writing the real thing, and complex enough > to cover the parts that need covering. If whatever replaces ctidscan is > too complex, it will not serve that purpose. > > My guess is that something whose only purpose is to test the custom scan > interface for coverage purposes can be simpler than this module. See, I actually think the opposite: I think we've been accumulating a reasonable amount of test code that actually serves no really useful purpose and is just cruft. Stuff like test_shm_mq and test_decoding seem like they actually catches bugs, so I like that, but I think stuff like worker_spi is actually TOO simple to be useful in building anything real, and it provides no useful test coverage, either. But this is all a matter of opinion, of course, and I'll defer to whatever the consensus is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> As a general principle, I think it's a good idea to have a module that's >> mostly just a skeleton that guides people into writing something real to >> use whatever API is being tested. It needs to be simple enough that not >> much need to be deleted when writing the real thing, and complex enough >> to cover the parts that need covering. If whatever replaces ctidscan is >> too complex, it will not serve that purpose. >> >> My guess is that something whose only purpose is to test the custom scan >> interface for coverage purposes can be simpler than this module. > See, I actually think the opposite: I think we've been accumulating a > reasonable amount of test code that actually serves no really useful > purpose and is just cruft. Stuff like test_shm_mq and test_decoding > seem like they actually catches bugs, so I like that, but I think > stuff like worker_spi is actually TOO simple to be useful in building > anything real, and it provides no useful test coverage, either. But > this is all a matter of opinion, of course, and I'll defer to whatever > the consensus is. I think this ties into my core unhappiness with the customscan stuff, which is that I don't believe it's *possible* to do anything of very great interest with it. I think anything really useful will require core code modifications and/or hooks that don't exist now. So a finger exercise like ctidscan, even though it might have some marginal use, doesn't do much to alleviate that concern. It certainly doesn't seem like it's a suitable placeholder proving that we aren't breaking any actual use cases for the feature. (BTW, if we care about the use cases this has, such as data recovery from partially-corrupt tables, it would make way more sense and take way less net new code to just teach TidScan about it.) regards, tom lane
On Tue, Jul 14, 2015 at 4:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think this ties into my core unhappiness with the customscan stuff, > which is that I don't believe it's *possible* to do anything of very > great interest with it. I think anything really useful will require > core code modifications and/or hooks that don't exist now. So a finger > exercise like ctidscan, even though it might have some marginal use, > doesn't do much to alleviate that concern. It certainly doesn't seem > like it's a suitable placeholder proving that we aren't breaking any > actual use cases for the feature. Both you and Andres have articulated the concern that CustomScan isn't actually useful, but I still don't really understand why not. KaiGai has got working code (under a GPL license) that shows that it can be used for what he calls GpuScan and GpuJoin, and the speedups are actually pretty cool if you happen to have the right sort of query to take advantage of it. That code may be buggy and the license precludes us using it anyway, but FWICT it does seem to work. I'd be entirely amenable to improving the infrastructure such that it could be used for a broader array of purposes, and I'm also amenable to adding more hooks if we need more hooks to make it useful, but I'm not clear at all on what you think is missing. I'm curious, for example, whether CustomScan would have been sufficient to build TABLESAMPLE, and if not, why not. Obviously the syntax has to be in core, but why couldn't the syntax just call an extension-provided callback that returns a custom scan, instead of having a node just for TABLESAMPLE? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Both you and Andres have articulated the concern that CustomScan isn't > actually useful, but I still don't really understand why not. > I'm curious, for example, whether CustomScan would have been > sufficient to build TABLESAMPLE, and if not, why not. Obviously the > syntax has to be in core, ... so you just made the point ... > but why couldn't the syntax just call an > extension-provided callback that returns a custom scan, instead of > having a node just for TABLESAMPLE? Because that only works for small values of "work". As far as TABLESAMPLE goes, I intend to hold Simon's feet to the fire until there's a less cheesy answer to the problem of scan reproducibility. Assuming we're going to allow sample methods that can't meet the reproducibility requirement, we need something to prevent them from producing visibly broken query results. Ideally, the planner would avoid putting such a scan on the inside of a nestloop. A CustomScan-based implementation could not possibly arrange such a thing; we'd have to teach the core planner about the concern. Or, taking the example of a GpuScan node, it's essentially impossible to persuade the planner to delegate any expensive function calculations, aggregates, etc to such a node; much less teach it that that way is cheaper than doing such things the usual way. So yeah, KaiGai-san may have a module that does a few things with a GPU, but it's far from doing all or even very much of what one would want. Now, as part of the upper-planner-rewrite business that I keep hoping to get to when I'm not riding herd on bad patches, it's likely that we might have enough new infrastructure soon that that particular problem could be solved. But there would just be another problem after that; a likely example is not having adequate statistics, or sufficiently fine-grained function cost estimates, to be able to make valid choices once there's more than one way to do such calculations. (I'm not really impressed by "the GPU is *always* faster" approaches.) Significant improvements of that sort are going to take core-code changes. Even worse, if there do get to be any popular custom-scan extensions, we'll break them anytime we make any nontrivial planner changes, because there is no arms-length API there. A trivial example is that even adding or changing any fields in struct Path will necessarily break custom scan providers, because they build Paths for themselves with no interposed API. In large part this is the same as my core concern about the TABLESAMPLE patch: exposing dubiously-designed APIs is soon going to force us to make choices between breaking those APIs or not being able to make changes we need to make. In the case of custom scans, I will not be particularly sad when (not if) we break custom scan providers; but in other cases such tradeoffs are going to be harder to make. regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Wednesday, July 15, 2015 5:47 AM > To: Robert Haas > Cc: Alvaro Herrera; hlinnaka; Kaigai Kouhei(海外 浩平); Michael Paquier; Jim > Nasby; Kohei KaiGai; PgHacker; Simon Riggs > Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom > Plan API) > > Robert Haas <robertmhaas@gmail.com> writes: > > On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera > > <alvherre@2ndquadrant.com> wrote: > >> As a general principle, I think it's a good idea to have a module that's > >> mostly just a skeleton that guides people into writing something real to > >> use whatever API is being tested. It needs to be simple enough that not > >> much need to be deleted when writing the real thing, and complex enough > >> to cover the parts that need covering. If whatever replaces ctidscan is > >> too complex, it will not serve that purpose. > >> > >> My guess is that something whose only purpose is to test the custom scan > >> interface for coverage purposes can be simpler than this module. > > > See, I actually think the opposite: I think we've been accumulating a > > reasonable amount of test code that actually serves no really useful > > purpose and is just cruft. Stuff like test_shm_mq and test_decoding > > seem like they actually catches bugs, so I like that, but I think > > stuff like worker_spi is actually TOO simple to be useful in building > > anything real, and it provides no useful test coverage, either. But > > this is all a matter of opinion, of course, and I'll defer to whatever > > the consensus is. > > I think this ties into my core unhappiness with the customscan stuff, > which is that I don't believe it's *possible* to do anything of very > great interest with it. I think anything really useful will require > core code modifications and/or hooks that don't exist now. So a finger > exercise like ctidscan, even though it might have some marginal use, > doesn't do much to alleviate that concern. It certainly doesn't seem > like it's a suitable placeholder proving that we aren't breaking any > actual use cases for the feature. > The ctidscan is originally designed to validate the behavior of custom-scan interface, so it is natural this module is valuable in limited cased. However, I don't think that anything valuable usually takes core code enhancement and/or new hooks, because we already have various extensions around core code that utilizes existing infrastructures (even though its specifications may be changed on major version up). At least, custom-scan enables to implement edge-features which are not easy to merge the core because of various reasons; like dependency to proprietary library, too experimental features, too large code to review as minimum valuable portion and so on. > (BTW, if we care about the use cases this has, such as data recovery from > partially-corrupt tables, it would make way more sense and take way less > net new code to just teach TidScan about it.) > What I discussed with Hanada-san before was, a custom-scan provider that replaces a particular relations join by simple scan of materialized-view transparently. It is probably one other use case. Its design is in my brain, but time for development is missing piece for me. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
> Or, taking the example of a GpuScan node, it's essentially impossible > to persuade the planner to delegate any expensive function calculations, > aggregates, etc to such a node; much less teach it that that way is cheaper > than doing such things the usual way. So yeah, KaiGai-san may have a > module that does a few things with a GPU, but it's far from doing all or > even very much of what one would want. > Why do we need to run all the functions on GPU device? PG-Strom simply gives up to inject CustomPath if required qualifier contains unsupported functions or data types, thus, these workloads are executed as usual. I don't intend 100% coverage by GPU. That's no matter. People who use massive numerical/mathematical workload will love GPU. > Now, as part of the upper-planner-rewrite business that I keep hoping to > get to when I'm not riding herd on bad patches, it's likely that we might > have enough new infrastructure soon that that particular problem could > be solved. But there would just be another problem after that; a likely > example is not having adequate statistics, or sufficiently fine-grained > function cost estimates, to be able to make valid choices once there's > more than one way to do such calculations. (I'm not really impressed by > "the GPU is *always* faster" approaches.) Significant improvements of > that sort are going to take core-code changes. > We never guarantee the interface compatibility between major version up. If we add/modify interface on v9.6, it is duty for developer of extensions to follow the new version, even not specific to custom-scan provider. If v9.6 adds support fine-grained function cost estimation, I also have to follow the feature, but it is natural. > Even worse, if there do get to be any popular custom-scan extensions, > we'll break them anytime we make any nontrivial planner changes, because > there is no arms-length API there. A trivial example is that even adding > or changing any fields in struct Path will necessarily break custom scan > providers, because they build Paths for themselves with no interposed API. > I cannot understand... If Path field gets changed, it is duty of extension to follow the core change. We usually add/modify API specifications on major version up. For example, I remember ProcessUtility_hook has been changed a few times in the last several years. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > We never guarantee the interface compatibility between major version up. > If we add/modify interface on v9.6, it is duty for developer of extensions > to follow the new version, even not specific to custom-scan provider. > If v9.6 adds support fine-grained function cost estimation, I also have > to follow the feature, but it is natural. Maintaining compatibility across major versions is a best-effort and even if we sometimes break things across major versions, and sometimes even silently (take the example of 9.3's background worker that do not start with 9.4 as long as bgw_notify_pid is not set to 0), the approach is usually taken to have APIs stable and convenient able to cover a maximum set of cases for a given set of plugins, and this serves well in the long term. Now it is true that we cannot assume either that the version of a plugin API will be perfect forever and will be able to cover all the imagined test cases at first shot, still I'd like to think that things are broken only when necessary and with good reasons. A set of APIs changed at each major release tends to be proof that research lacked in the first version, and would surely demotivate its adoption by extension developers. -- Michael
> -----Original Message----- > From: Michael Paquier [mailto:michael.paquier@gmail.com] > Sent: Wednesday, July 15, 2015 3:29 PM > To: Kaigai Kouhei(海外 浩平) > Cc: Tom Lane; Robert Haas; Alvaro Herrera; hlinnaka; Jim Nasby; Kohei KaiGai; > PgHacker; Simon Riggs > Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom > Plan API) > > On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > We never guarantee the interface compatibility between major version up. > > If we add/modify interface on v9.6, it is duty for developer of extensions > > to follow the new version, even not specific to custom-scan provider. > > If v9.6 adds support fine-grained function cost estimation, I also have > > to follow the feature, but it is natural. > > Maintaining compatibility across major versions is a best-effort and > even if we sometimes break things across major versions, and sometimes > even silently (take the example of 9.3's background worker that do not > start with 9.4 as long as bgw_notify_pid is not set to 0), the > approach is usually taken to have APIs stable and convenient able to > cover a maximum set of cases for a given set of plugins, and this > serves well in the long term. Now it is true that we cannot assume > either that the version of a plugin API will be perfect forever and > will be able to cover all the imagined test cases at first shot, still > I'd like to think that things are broken only when necessary and with > good reasons. A set of APIs changed at each major release tends to be > proof that research lacked in the first version, and would surely > demotivate its adoption by extension developers. > Yep, I also don't want to break the interface compatibility without something reasonable, and also think following-up new core features is a good reason to enhance the interface in the future version. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
On Tue, Jul 14, 2015 at 6:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Both you and Andres have articulated the concern that CustomScan isn't >> actually useful, but I still don't really understand why not. > >> I'm curious, for example, whether CustomScan would have been >> sufficient to build TABLESAMPLE, and if not, why not. Obviously the >> syntax has to be in core, > > ... so you just made the point ... If you're going to set the bar that high, there's no point in anyone ever trying to add extensibility anywhere, because there will always be some other place where there isn't enough extensibility someplace else to do everything that someone might want. The fact that the parser isn't extensible doesn't make custom scans useless any more than the fact that the non-extensibility of WAL-logging makes pg_am useless. >> but why couldn't the syntax just call an >> extension-provided callback that returns a custom scan, instead of >> having a node just for TABLESAMPLE? > > Because that only works for small values of "work". As far as TABLESAMPLE > goes, I intend to hold Simon's feet to the fire until there's a less > cheesy answer to the problem of scan reproducibility. Assuming we're > going to allow sample methods that can't meet the reproducibility > requirement, we need something to prevent them from producing visibly > broken query results. Ideally, the planner would avoid putting such a > scan on the inside of a nestloop. A CustomScan-based implementation could > not possibly arrange such a thing; we'd have to teach the core planner > about the concern. Well, I think it would be better to change the tablesample methods as you previously proposed, so that they are actually stable. But if that's not possible, then when you (or someone) makes the core planner able to consider such matters, you could add a CUSTOM_PATH_UNSTABLE flag that a custom path can use to notify the core system about the problem. Then tablesample methods that are unstable can set the flag and those that are stable are not penalized. Presumably we'll end up with such a flag in the tablesample-path anyway, and a separate flag in every kind of future scan that needs similar consideration. If we put it in some piece of infrastructure (like the custom-path stuff) that is reusable, we could avoid reinventing the wheel every time. > Or, taking the example of a GpuScan node, it's essentially impossible > to persuade the planner to delegate any expensive function calculations, > aggregates, etc to such a node; much less teach it that that way is cheaper > than doing such things the usual way. So yeah, KaiGai-san may have a > module that does a few things with a GPU, but it's far from doing all or > even very much of what one would want. It's true that there are things he can't do this way, but without the custom-scan stuff, he'd be able to do even less. > Now, as part of the upper-planner-rewrite business that I keep hoping to > get to when I'm not riding herd on bad patches, it's likely that we might > have enough new infrastructure soon that that particular problem could > be solved. But there would just be another problem after that; a likely > example is not having adequate statistics, or sufficiently fine-grained > function cost estimates, to be able to make valid choices once there's > more than one way to do such calculations. (I'm not really impressed by > "the GPU is *always* faster" approaches.) Significant improvements of > that sort are going to take core-code changes. Probably, but giving people the ability to experiment outside core, even if it's limited, often leads to good things. And when it doesn't, it doesn't really cost us anything. > Even worse, if there do get to be any popular custom-scan extensions, > we'll break them anytime we make any nontrivial planner changes, because > there is no arms-length API there. A trivial example is that even adding > or changing any fields in struct Path will necessarily break custom scan > providers, because they build Paths for themselves with no interposed API. I agree that will happen, but I don't care. This happens all the time, and every person other than yourself who has commented on this issue has said that they'd rather have an API exposed that will later get whacked around without warning than have no API exposed at all, not just with respect to custom-paths, but with a whole variety of other things as well, like sticking PGDLLIMPORT on all of the variables that back GUCs. It is really far more tedious to try to work around the lack of a PGDLLIMPORT marking (which causes a huge annoyance now) than to adjust the code if and when somebody changes something about the GUC (which causes a small annoyance later, and only if there actually are changes). > In large part this is the same as my core concern about the TABLESAMPLE > patch: exposing dubiously-designed APIs is soon going to force us to make > choices between breaking those APIs or not being able to make changes we > need to make. In the case of custom scans, I will not be particularly > sad when (not if) we break custom scan providers; but in other cases such > tradeoffs are going to be harder to make. I'll vote for meaningful forward progress over refusing to change the API in every single case. I don't think I'll be remotely alone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jul 15, 2015 at 2:28 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: >> We never guarantee the interface compatibility between major version up. >> If we add/modify interface on v9.6, it is duty for developer of extensions >> to follow the new version, even not specific to custom-scan provider. >> If v9.6 adds support fine-grained function cost estimation, I also have >> to follow the feature, but it is natural. > > Maintaining compatibility across major versions is a best-effort and > even if we sometimes break things across major versions, and sometimes > even silently (take the example of 9.3's background worker that do not > start with 9.4 as long as bgw_notify_pid is not set to 0), the > approach is usually taken to have APIs stable and convenient able to > cover a maximum set of cases for a given set of plugins, and this > serves well in the long term. Now it is true that we cannot assume > either that the version of a plugin API will be perfect forever and > will be able to cover all the imagined test cases at first shot, still > I'd like to think that things are broken only when necessary and with > good reasons. A set of APIs changed at each major release tends to be > proof that research lacked in the first version, and would surely > demotivate its adoption by extension developers. Maybe. If those changes are demanded by extension developers who are trying to do useful stuff with the APIs and finding problems, then changing things will probably make those extension developers happy. If the changes are necessitated by core improvements, then we might get some complaints if those core improvements are perceived to have small value compared to the API break. But I think complaints of that kind are very rare. The only thing I can think of that falls into approximately that category is a gripe about the replacement of relistemp by relpersistence. But I think that was more motivated by the way it broke SQL monitoring queries than by the way it broke C code. There may be other more recent examples; but that's the only one I can think of offhand. I just don't see it as a big issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company