Thread: Status of FDW pushdowns
Where are we on the remaining possible pushdowns for foreign data wrappers, particularly the Postgres one? I know we do WHERE restriction pushdowns in 9.3, but what about join and aggregate pushdowns? Is anyone working on those? I know join pushdowns seem insignificant, but it helps to restrict what data must be passed back because you would only pass back joined rows. Do we document these missing features anywhere? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: > Where are we on the remaining possible pushdowns for foreign data > wrappers, particularly the Postgres one? I know we do WHERE restriction > pushdowns in 9.3, but what about join and aggregate pushdowns? Is > anyone working on those? > > I know join pushdowns seem insignificant, but it helps to restrict what > data must be passed back because you would only pass back joined rows. By 'insignificant' you mean 'necessary to do any non-trivial real work'. Personally, I'd prefer it if FDW was extended to allow arbitrary parameterized queries like every other database connectivity API ever made ever. But in lieu of that, I'll take as much push down power as possible :-D. merlin
2013/11/21 Bruce Momjian <bruce@momjian.us>: > Where are we on the remaining possible pushdowns for foreign data > wrappers, particularly the Postgres one? I know we do WHERE restriction > pushdowns in 9.3, but what about join and aggregate pushdowns? Is > anyone working on those? > > I know join pushdowns seem insignificant, but it helps to restrict what > data must be passed back because you would only pass back joined rows. > > Do we document these missing features anywhere? > Probably, custom-scan api will provide more flexible way to push-down aggregate, sort or other stuff performing on regular tables, not only foreign tables. It allows extensions to offer alternative scan/join path on the planning stage, then executor callbacks its custom logic instead of the built-in one, if its cost is cheaper. Right now, it performs on relation scan or join only. However, we will be able to apply same concept on aggregation. For example, an aggregation node on a foreign table scan is a good candidate to push down because it can be replaced with a custom- logic that scans a materialized result of the remote aggregation query, if its cost is enough cheap than local aggregation. Probably, we need to add a hook and some logic to compare the built-in aggregation and alternative paths provided by extensions. It is also helpful for the people who want to implement something like "parallel aggregate" performing on regular tables, not only foreign table. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Kohei KaiGai <kaigai@kaigai.gr.jp> writes: > Right now, it performs on relation scan or join only. However, we will be > able to apply same concept on aggregation. > For example, an aggregation node on a foreign table scan is a good > candidate to push down because it can be replaced with a custom- > logic that scans a materialized result of the remote aggregation query, > if its cost is enough cheap than local aggregation. > Probably, we need to add a hook and some logic to compare the > built-in aggregation and alternative paths provided by extensions. Note that this is another thing that's blocked on Path-ifying the work now done in grouping_planner. We don't currently have a way to represent a local aggregation, much less a remote one, as a Path. We definitely need that before we can open up any of that logic to FDWs. regards, tom lane
Merlin Moncure <mmoncure@gmail.com> writes: > On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: >> I know join pushdowns seem insignificant, but it helps to restrict what >> data must be passed back because you would only pass back joined rows. > By 'insignificant' you mean 'necessary to do any non-trivial real > work'. Personally, I'd prefer it if FDW was extended to allow > arbitrary parameterized queries like every other database connectivity > API ever made ever. [ shrug... ] So use dblink. For better or worse, the FDW stuff is following the SQL standard's SQL/MED design, which does not do it like that. regards, tom lane
2013/11/22 Kohei KaiGai <kaigai@kaigai.gr.jp>: > 2013/11/21 Bruce Momjian <bruce@momjian.us>: >> Where are we on the remaining possible pushdowns for foreign data >> wrappers, particularly the Postgres one? I know we do WHERE restriction >> pushdowns in 9.3, but what about join and aggregate pushdowns? Is >> anyone working on those? >> >> I know join pushdowns seem insignificant, but it helps to restrict what >> data must be passed back because you would only pass back joined rows. >> >> Do we document these missing features anywhere? >> > Probably, custom-scan api will provide more flexible way to push-down > aggregate, sort or other stuff performing on regular tables, not only > foreign tables. > It allows extensions to offer alternative scan/join path on the planning > stage, then executor callbacks its custom logic instead of the built-in > one, if its cost is cheaper. IIRC, sort push-down is already supported. We can provide sorted pathes by setting Pathkeys to additional ForeignPath. postgres_fdw doesn't support this feature because we couldn't get consensus about how to limit sort variation. One idea was to allow to define "foreign index" on foreign tables to indicate which column combination is reasonably sortable. -- Shigeru HANADA
2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: >>> I know join pushdowns seem insignificant, but it helps to restrict what >>> data must be passed back because you would only pass back joined rows. > >> By 'insignificant' you mean 'necessary to do any non-trivial real >> work'. Personally, I'd prefer it if FDW was extended to allow >> arbitrary parameterized queries like every other database connectivity >> API ever made ever. > > [ shrug... ] So use dblink. For better or worse, the FDW stuff is > following the SQL standard's SQL/MED design, which does not do it > like that. Pass-through mode mentioned in SQL/MED standard might be what he wants. -- Shigeru HANADA
On Thu, Nov 21, 2013 at 10:46:14AM -0500, Tom Lane wrote: > Merlin Moncure <mmoncure@gmail.com> writes: > > On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: > >> I know join pushdowns seem insignificant, but it helps to restrict what > >> data must be passed back because you would only pass back joined rows. > > > By 'insignificant' you mean 'necessary to do any non-trivial real > > work'. Personally, I'd prefer it if FDW was extended to allow > > arbitrary parameterized queries like every other database connectivity > > API ever made ever. > > [ shrug... ] So use dblink. Not with a non-PostgreSQL data source. > For better or worse, the FDW stuff is following the SQL standard's > SQL/MED design, which does not do it like that. What SQL/MED specifies along this line is purely a caution against making a specification without a reference implementation. If I'm reading it correctly, it's literally impossible to make what they suggest safe. Given those givens, we're free to do this in a way that's not barking-at-the-moon crazy. At least two inter-database communication links which work with PostgreSQL do this.. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote: > 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>: >> Merlin Moncure <mmoncure@gmail.com> writes: >>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: >>>> I know join pushdowns seem insignificant, but it helps to restrict what >>>> data must be passed back because you would only pass back joined rows. >> >>> By 'insignificant' you mean 'necessary to do any non-trivial real >>> work'. Personally, I'd prefer it if FDW was extended to allow >>> arbitrary parameterized queries like every other database connectivity >>> API ever made ever. >> >> [ shrug... ] So use dblink. For better or worse, the FDW stuff is >> following the SQL standard's SQL/MED design, which does not do it >> like that. > > Pass-through mode mentioned in SQL/MED standard might be what he wants. happen to have a link handy? merlin
On Fri, Nov 22, 2013 at 08:25:05AM -0600, Merlin Moncure wrote: > On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada > <shigeru.hanada@gmail.com> wrote: > > 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>: > >> Merlin Moncure <mmoncure@gmail.com> writes: > >>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: > >>>> I know join pushdowns seem insignificant, but it helps to restrict what > >>>> data must be passed back because you would only pass back joined rows. > >> > >>> By 'insignificant' you mean 'necessary to do any non-trivial real > >>> work'. Personally, I'd prefer it if FDW was extended to allow > >>> arbitrary parameterized queries like every other database connectivity > >>> API ever made ever. > >> > >> [ shrug... ] So use dblink. For better or worse, the FDW stuff is > >> following the SQL standard's SQL/MED design, which does not do it > >> like that. > > > > Pass-through mode mentioned in SQL/MED standard might be what he wants. > > happen to have a link handy? http://www.wiscorp.com/sql20nn.zip You'll want to look at the PDF with MED in its title. Passthrough mode, which is how the standard "handles" this problem is basically a thing where you set it to be on, then everything your send until setting it to off is passed through to the remote side. The people writing the standard didn't think too much about the possibility that the remote side might speak a broader or different dialect of SQL from the local server. They also didn't imagine cases where what's being passed isn't SQL at all. In addition to breaking any possible parser, the "feature" as described in the standard is just ripe for un-patchable exploits *in its design*. Of all the misdesign-by-committee contained in the standard, this piece is far and away the stupidest I've encountered to date. We should not even vaguely attempt to implement it. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Merlin Moncure <mmoncure@gmail.com> writes: > By 'insignificant' you mean 'necessary to do any non-trivial real > work'. Personally, I'd prefer it if FDW was extended to allow > arbitrary parameterized queries like every other database connectivity > API ever made ever. But in lieu of that, I'll take as much push down > power as possible :-D. That sounds more like FOREIGN VIEW and FOREIGN FUNCTION to me, where you could have the whole control of the local/remote boundaries. I mean that when planning a query using a FOREIGN VIEW it would probably make sense to consider it as a CTE as far as the optimizer is concerned. About FOREIGN FUNCTION, that would allow to inject arbitrary parameters anywhere in the remote query when doing SQL functions. We have a very nice version of FOREIGN FUNCTION already, that's plproxy. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Hi Merlin, 2013/11/22 Merlin Moncure <mmoncure@gmail.com>: > On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada > <shigeru.hanada@gmail.com> wrote: >> 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>: >>> Merlin Moncure <mmoncure@gmail.com> writes: >>>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: >>>>> I know join pushdowns seem insignificant, but it helps to restrict what >>>>> data must be passed back because you would only pass back joined rows. >>> >>>> By 'insignificant' you mean 'necessary to do any non-trivial real >>>> work'. Personally, I'd prefer it if FDW was extended to allow >>>> arbitrary parameterized queries like every other database connectivity >>>> API ever made ever. >>> >>> [ shrug... ] So use dblink. For better or worse, the FDW stuff is >>> following the SQL standard's SQL/MED design, which does not do it >>> like that. >> >> Pass-through mode mentioned in SQL/MED standard might be what he wants. > > happen to have a link handy? SQL/MED standard doesn't say much about PASS THROUGH mode, especially about interaction between client. Besides it, I think it would be nice to allow arbitrary FDW as backend of dblink interface like this: postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); postgres=> SELECT * FROM dblink('con1', 'some query written in remote syntax') as t(/* record type definition */...); This provides a way to execute query without defining foreign table. -- Shigeru HANADA
Shigeru Hanada escribió: > SQL/MED standard doesn't say much about PASS THROUGH mode, especially > about interaction between client. Besides it, I think it would be > nice to allow arbitrary FDW as backend of dblink interface like this: > > postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); > postgres=> SELECT * FROM dblink('con1', 'some query written in remote > syntax') as t(/* record type definition */...); > > This provides a way to execute query without defining foreign table. Seems to me that if you want to read remote tables without creating a foreign table, you could define them locally using something like the WITH syntax and then use them normally in the rest of the query. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Seems to me that if you want to read remote tables without creating a > foreign table, you could define them locally using something like the > WITH syntax and then use them normally in the rest of the query. I guess what's needed here is a kind of barrier that allows pushing a whole arbitrary subquery (with joins and quals and whatnot) down to the remote side. My current thinking about how to solve that would be to add a notion of FOREIGN VIEW in our system, which would basically implement that barrier and send the view definition on the remote, with known quals values as constants, or something like that. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Nov 27, 2013 at 4:20 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote: > Hi Merlin, > > 2013/11/22 Merlin Moncure <mmoncure@gmail.com>: >> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada >> <shigeru.hanada@gmail.com> wrote: >>> 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>: >>>> Merlin Moncure <mmoncure@gmail.com> writes: >>>>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote: >>>>>> I know join pushdowns seem insignificant, but it helps to restrict what >>>>>> data must be passed back because you would only pass back joined rows. >>>> >>>>> By 'insignificant' you mean 'necessary to do any non-trivial real >>>>> work'. Personally, I'd prefer it if FDW was extended to allow >>>>> arbitrary parameterized queries like every other database connectivity >>>>> API ever made ever. >>>> >>>> [ shrug... ] So use dblink. For better or worse, the FDW stuff is >>>> following the SQL standard's SQL/MED design, which does not do it >>>> like that. >>> >>> Pass-through mode mentioned in SQL/MED standard might be what he wants. >> >> happen to have a link handy? > > SQL/MED standard doesn't say much about PASS THROUGH mode, especially > about interaction between client. Besides it, I think it would be > nice to allow arbitrary FDW as backend of dblink interface like this: > > postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); > postgres=> SELECT * FROM dblink('con1', 'some query written in remote > syntax') as t(/* record type definition */...); > > This provides a way to execute query without defining foreign table. yeah. (thanks for indulging -- this is barely on topic I guess). if it were possible to create a supporting function (say, fdw_link) that could somehow interface with a previously established server, it could probably be worked out. Then all FDW could leverage parameterization without having to copy and paste the pgsql-fdw qual push code. But that would be a fairly large break from the rest of the FDW syntax and having to define the record at each call site is admittedly a bit of a headache. Hm, another way to think about this would be to somehow abstract the qual push into a library so that it could be accessed by other FDWs if they opted in. This would address my chief complaint that only the pgsql-fdw (the only database for which we already have an in-core high quality connection api) driver could tap the excellent work you've done. If this were even possible, it would probably result in more fdw API changes. If my: SELECT * FROM big_sql_server_foreign_table WHERE id = x; was fast, that'd be pretty nice. merlin
2013/11/27 Dimitri Fontaine <dimitri@2ndquadrant.fr>: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Seems to me that if you want to read remote tables without creating a >> foreign table, you could define them locally using something like the >> WITH syntax and then use them normally in the rest of the query. > > I guess what's needed here is a kind of barrier that allows pushing a > whole arbitrary subquery (with joins and quals and whatnot) down to the > remote side. Yes, a big problem is how to skip parsing remote query in PG context. Bare query string (other than string literal) always parsed by PG parser, but remote side would have different syntax and semantics, as Dimitri says we need to pass whole of arbitrary query string to remote side as-is. > My current thinking about how to solve that would be to add a notion of > FOREIGN VIEW in our system, which would basically implement that barrier > and send the view definition on the remote, with known quals values as > constants, or something like that. I'm sorry but I don't see the point here. Do you mean that user executes CREATE FOREIGN VIEW in advance and uses the view in a subsequent query? Or, allow new syntax like WITH alias AS FOREIGN VIEW (remote query)? I think it's nice to support executing ad-hoc remote query written in the syntax which is valid only on remote data source through FDW, and at the moment dblink interface seems feasible for that purpose. -- Shigeru HANADA
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > I'm sorry but I don't see the point here. Do you mean that user > executes CREATE FOREIGN VIEW in advance and uses the view in a Yes that's what I mean. > I think it's nice to support executing ad-hoc remote query written in > the syntax which is valid only on remote data source through FDW, and > at the moment dblink interface seems feasible for that purpose. I guess the view query would have to be validated by the FDW, which would just receive a text. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
> I guess the view query would have to be validated by the FDW, which > would just receive a text. +1 This is not exactly in context, but I and David Fetter discussed recently how we could do similar thing. This would work,but how can we do it for FDWs which do not parse SQL? Am I missing something here? Regards, Atri -- Regards, Atri l'apprenant
Atri Sharma <atri.jiit@gmail.com> writes: > This would work,but how can we do it for FDWs which do not parse SQL? > Am I missing something here? Worst case: CREATE FOREIGN VIEW foo AS $$ whatever syntax is accepted on the other side $$; Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Nov 27, 2013 at 11:08 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Atri Sharma <atri.jiit@gmail.com> writes: >> This would work,but how can we do it for FDWs which do not parse SQL? >> Am I missing something here? > > Worst case: > > CREATE FOREIGN VIEW foo > AS $$ > whatever syntax is accepted on the other side That doesnt sound like a very good idea. Can we add a function to the FDW API to define a SQL to foreign server side conversion? I am just musing though. Regards, Atri -- Regards, Atri l'apprenant
On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote: > Shigeru Hanada escribió: > > > SQL/MED standard doesn't say much about PASS THROUGH mode, especially > > about interaction between client. Besides it, I think it would be > > nice to allow arbitrary FDW as backend of dblink interface like this: > > > > postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); > > postgres=> SELECT * FROM dblink('con1', 'some query written in remote > > syntax') as t(/* record type definition */...); > > > > This provides a way to execute query without defining foreign table. > > Seems to me that if you want to read remote tables without creating a > foreign table, you could define them locally using something like the > WITH syntax and then use them normally in the rest of the query. WITH, or SRF, or whatever, the point is that we need to be able to specify what we're sending--probably single opaque strings delimited just as we do other strings--and what we might get back--errors only, rows, [sets of] refcursors are the ones I can think of offhand. What we can't do is assume that our parser needs to, or even could, in principle, understand these things in more detail than that. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Nov 28, 2013 at 12:54 AM, David Fetter <david@fetter.org> wrote: > On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote: >> Shigeru Hanada escribió: >> >> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially >> > about interaction between client. Besides it, I think it would be >> > nice to allow arbitrary FDW as backend of dblink interface like this: >> > >> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); >> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote >> > syntax') as t(/* record type definition */...); >> > >> > This provides a way to execute query without defining foreign table. >> >> Seems to me that if you want to read remote tables without creating a >> foreign table, you could define them locally using something like the >> WITH syntax and then use them normally in the rest of the query. > > WITH, or SRF, or whatever, the point is that we need to be able to > specify what we're sending--probably single opaque strings delimited > just as we do other strings--and what we might get back--errors only, > rows, [sets of] refcursors are the ones I can think of offhand. +1 The input-output formats need to be defined clearly. How about sending parse trees? Is it even possible? > What we can't do is assume that our parser needs to, or even could, in > principle, understand these things in more detail than that. Agreed. I wonder if its possible to give this task to the FDW implementing authority instead, and get FDW to translate to the required format. -- Regards, Atri l'apprenant
On Thu, Nov 28, 2013 at 01:29:46AM +0530, Atri Sharma wrote: > On Thu, Nov 28, 2013 at 12:54 AM, David Fetter <david@fetter.org> wrote: > > On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote: > >> Shigeru Hanada escribió: > >> > >> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially > >> > about interaction between client. Besides it, I think it would be > >> > nice to allow arbitrary FDW as backend of dblink interface like this: > >> > > >> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW'); > >> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote > >> > syntax') as t(/* record type definition */...); > >> > > >> > This provides a way to execute query without defining foreign table. > >> > >> Seems to me that if you want to read remote tables without creating a > >> foreign table, you could define them locally using something like the > >> WITH syntax and then use them normally in the rest of the query. > > > > WITH, or SRF, or whatever, the point is that we need to be able to > > specify what we're sending--probably single opaque strings delimited > > just as we do other strings--and what we might get back--errors only, > > rows, [sets of] refcursors are the ones I can think of offhand. > > +1 > > The input-output formats need to be defined clearly. > > How about sending parse trees? Is it even possible? I don't see why parse trees wouldn't be something that could eventually be sent to other PostgreSQL servers, but I see that whole discussion as orthogonal to this one. My point here is that there needs to be an "escape to native" system available in SQL so people can communicate directly with the remote systems in the systems' own languages. It's a little bit analogous to making assembler available from C, or C from HLLs. > > What we can't do is assume that our parser needs to, or even could, in > > principle, understand these things in more detail than that. > > Agreed. > > I wonder if its possible to give this task to the FDW implementing > authority instead, and get FDW to translate to the required format. I don't know that the FDW would necessarily need to get involved except in the sense of "full recognition before processing." http://langsec.org/occupy/ Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Atri Sharma <atri.jiit@gmail.com> writes: > Can we add a function to the FDW API to define a SQL to foreign server > side conversion? I think that's not tenable. Even if you limit the discussion to the postgres_fdw, some queries against past version will stop working against new version (keywords changes, catalogs, default settings, etc). I don't think you want to embed a full parser of every supported FOREIGN version of PostgreSQL inside the postgres_fdw code, so I think the text of the view needs to be an opaque string. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndquadrant.fr> writes: > Atri Sharma <atri.jiit@gmail.com> writes: >> Can we add a function to the FDW API to define a SQL to foreign server >> side conversion? > I think that's not tenable. Even if you limit the discussion to the > postgres_fdw, some queries against past version will stop working > against new version (keywords changes, catalogs, default settings, etc). > I don't think you want to embed a full parser of every supported FOREIGN > version of PostgreSQL inside the postgres_fdw code, so I think the text > of the view needs to be an opaque string. I'm not real sure what this'd buy us that wouldn't be done as well or better by creating a view on the remote side. (IOW, there's nothing that says that the remote object backing a foreign table can't be a view.) regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > I'm not real sure what this'd buy us that wouldn't be done as well or > better by creating a view on the remote side. (IOW, there's nothing > that says that the remote object backing a foreign table can't be a > view.) Agreed, for those remote sides that know what a view is. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Sent from my iPad > On 28-Nov-2013, at 16:13, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote: > > Tom Lane <tgl@sss.pgh.pa.us> writes: >> I'm not real sure what this'd buy us that wouldn't be done as well or >> better by creating a view on the remote side. (IOW, there's nothing >> that says that the remote object backing a foreign table can't be a >> view.) > > Agreed, for those remote sides that know what a view is. I agree. I agree with the overall model here, but I am not sure how it would work out for non SQL supporting remote sides. Regards, Atri
On 11/28/2013 03:24 AM, David Fetter wrote: > WITH, or SRF, or whatever, the point is that we need to be able to > specify what we're sending--probably single opaque strings delimited > just as we do other strings--and what we might get back--errors only, > rows, [sets of] refcursors are the ones I can think of offhand. So, you're thinking of something like: WITH FOREIGN somecte AS $$... foreign query ...$$ SELECT ... FROM somecte; ? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: > On 11/28/2013 03:24 AM, David Fetter wrote: > > WITH, or SRF, or whatever, the point is that we need to be able to > > specify what we're sending--probably single opaque strings delimited > > just as we do other strings--and what we might get back--errors only, > > rows, [sets of] refcursors are the ones I can think of offhand. > > So, you're thinking of something like: > > WITH FOREIGN somecte AS $$... foreign query ...$$ > SELECT ... > FROM somecte; I was picturing something a little more like an SRF which would take one opaque string, the remote command, some descriptor, perhaps an enum, of what if anything might come back. Long ago, I implemented a similar thing in DBI-Link. It was called remote_exec_dbh(data_source_id integer, query text, returns_rows bool) That covered only some of the cases I've come up with since. My current idea is something more like remote_execute( data_source_id integer, /* This corresponds to a "handle" in SQL/MED-speak */ query text, returns enum( 'void', 'some_type', 'setof some_type', 'setof record', 'refcursor', 'setof refcursor' ) ) This could appear in a FROM or WITH clause. There might also be some aliasing and/or syntactic sugar along the lines of remote_execute_void(), remote_execute_rows(), etc. Given the possibility of cooperation with the planner and executor, we might want to extend some attributes like sortedness where applicable. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote: > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> On 11/28/2013 03:24 AM, David Fetter wrote: >> > WITH, or SRF, or whatever, the point is that we need to be able to >> > specify what we're sending--probably single opaque strings delimited >> > just as we do other strings--and what we might get back--errors only, >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> So, you're thinking of something like: >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> SELECT ... >> FROM somecte; > > I was picturing something a little more like an SRF which would take > one opaque string, the remote command, some descriptor, perhaps an > enum, of what if anything might come back. Long ago, I implemented a > similar thing in DBI-Link. It was called > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) Couple thoughts: *) Any 'pass through' API should support parameterization (the FDW may not support that, but many will and API should allow for it). Lack of parameterization is a major downside of dblink. The function could be set up to be variadic for the parameters. *) For a connectivity APIs of this style, Dblink-ish mechanic of separating command execution from data returning commands is likely the right way to go. Also, probably better to stick with SRF mechanics if we go the 'function route'. So basically we are making dblink for FDW, adding parameterization and some concept of utilizing the foreign server. All this is assuming we are adding a special remote execution function ('fdwlink'). While that would be great, it's a significant deviation from the standard into postgresql specific SRF syntax. If some of the qual pushdown deparsing functionality could be put inside the internal FDW API, then you'd get the best of both worlds. Maybe you'd still want a dblink style extension anyways, but it wouldn't be as critical. merlin
On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: > On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote: > > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: > >> On 11/28/2013 03:24 AM, David Fetter wrote: > >> > WITH, or SRF, or whatever, the point is that we need to be able to > >> > specify what we're sending--probably single opaque strings delimited > >> > just as we do other strings--and what we might get back--errors only, > >> > rows, [sets of] refcursors are the ones I can think of offhand. > >> > >> So, you're thinking of something like: > >> > >> WITH FOREIGN somecte AS $$... foreign query ...$$ > >> SELECT ... > >> FROM somecte; > > > > I was picturing something a little more like an SRF which would take > > one opaque string, the remote command, some descriptor, perhaps an > > enum, of what if anything might come back. Long ago, I implemented a > > similar thing in DBI-Link. It was called > > > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) > > Couple thoughts: > *) Any 'pass through' API should support parameterization (the FDW may > not support that, but many will and API should allow for it). Lack > of parameterization is a major downside of dblink. The function could > be set up to be variadic for the parameters. I don't know for sure that that needs to be in version 1 of this. It definitely shouldn't block implementing the non-parameterized one. > *) For a connectivity APIs of this style, Dblink-ish mechanic of > separating command execution from data returning commands is likely > the right way to go. Also, probably better to stick with SRF > mechanics if we go the 'function route'. So basically we are making > dblink for FDW, adding parameterization and some concept of utilizing > the foreign server. Yes, modulo the above. > All this is assuming we are adding a special remote execution function > ('fdwlink'). While that would be great, it's a significant deviation > from the standard into postgresql specific SRF syntax. What the standard has is literally insane. > If some of the qual pushdown deparsing functionality could be put > inside the internal FDW API, then you'd get the best of both worlds. If this were flawless on the PostgreSQL side (i.e. our path generator understood everything perfectly including aggregates) and trivial to implement correctly in FDWs, certainly. The idea here is that such a happy situation will not obtain until much later, if ever, and meanwhile, we need a way to get things accomplished even if it's inelegant, inefficient, etc. The alternative is that those things simply will not get accomplished at all. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > The idea here is that such a happy situation will not obtain until > much later, if ever, and meanwhile, we need a way to get things > accomplished even if it's inelegant, inefficient, etc. The > alternative is that those things simply will not get accomplished at > all. If that's the argument, why not just use dblink or dbilink, and be happy? This discussion sounds a whole lot like it's trending to a conclusion of wanting one of those in core, which is not where I'd like to end up. regards, tom lane
On Wed, Dec 4, 2013 at 1:39 PM, David Fetter <david@fetter.org> wrote: > On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: >> On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote: >> > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> >> On 11/28/2013 03:24 AM, David Fetter wrote: >> >> > WITH, or SRF, or whatever, the point is that we need to be able to >> >> > specify what we're sending--probably single opaque strings delimited >> >> > just as we do other strings--and what we might get back--errors only, >> >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> >> >> So, you're thinking of something like: >> >> >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> >> SELECT ... >> >> FROM somecte; >> > >> > I was picturing something a little more like an SRF which would take >> > one opaque string, the remote command, some descriptor, perhaps an >> > enum, of what if anything might come back. Long ago, I implemented a >> > similar thing in DBI-Link. It was called >> > >> > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) >> >> Couple thoughts: >> *) Any 'pass through' API should support parameterization (the FDW may >> not support that, but many will and API should allow for it). Lack >> of parameterization is a major downside of dblink. The function could >> be set up to be variadic for the parameters. > > I don't know for sure that that needs to be in version 1 of this. It > definitely shouldn't block implementing the non-parameterized one. I'm not making the case it should be version anything. But, if you went dblink style, you'd want to go variadic. It's not really any extra work and you can always embed the string if the FDW driver doesn't support parameterization. > What the standard has is literally insane. Not sure I agree. The guiding principle of the standard implementation AIUI is that it wants to connectivity management via syntax and keep the DML abstractions clean (minus some un-implementable things like RI triggers). In other words, you write exactly the same queries for native and foreign tables. This makes things much easier for people who just want to write SQL the classical way and not get into funky vendor specific APIs. The downside of SQL-MED, particularly the way postgres implemented the driver API, is that each driver is responsible for for all optimization efforts and I think this is bad. So I'm openly wondering if the FDW API should expose optional query rewriting hooks. The odbc-fdw and jdbc-fdw drivers for example could then benefit from those hooks so that qual pushdown could be implemented with far less code duplication and effort and a *much* broader set of problems could be addressed by FDW. For non- or exotic- SQL implementations those hooks could be implemented locally by the driver or disabled if doesn't make sense to use them. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > The downside of SQL-MED, particularly the way postgres implemented the > driver API, is that each driver is responsible for for all > optimization efforts and I think this is bad. There was never any intention that that would be the final state of things. All the FDW APIs are quite experimental at this point, and subject to change, and one of the reasons for change is going to be to improve the optimization situation. At the same time, it's hard to say what might constitute optimization for FDWs that aren't backed by a remote SQL database. There are always going to be reasons why an FDW will have to do some of that work for itself. regards, tom lane
On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > The idea here is that such a happy situation will not obtain until > > much later, if ever, and meanwhile, we need a way to get things > > accomplished even if it's inelegant, inefficient, etc. The > > alternative is that those things simply will not get accomplished > > at all. > > If that's the argument, why not just use dblink or dbilink, and be > happy? This discussion sounds a whole lot like it's trending to a > conclusion of wanting one of those in core, which is not where I'd > like to end up. Telling people who've already installed and configured an FDW that for perfectly ordinary expected functionality they'll need to install yet another piece of software, configure it, keep its configuration in sync with the FDW configuration, etc., is just a ridiculous. So yes, we do need this functionality and it does need to be part of our FDW implementation. Just exactly where we draw the line between built-ins and APIs is the conversation I thought we were having. The minimal thing would be providing database handles per SQL/MED and a few tools to manipulate same. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote: >> If that's the argument, why not just use dblink or dbilink, and be >> happy? This discussion sounds a whole lot like it's trending to a >> conclusion of wanting one of those in core, which is not where I'd >> like to end up. > Telling people who've already installed and configured an FDW that for > perfectly ordinary expected functionality they'll need to install yet > another piece of software, configure it, keep its configuration in > sync with the FDW configuration, etc., is just a ridiculous. Perfectly ordinary expected functionality according to who? Not the SQL standard, for sure. regards, tom lane