Thread: Status of FDW pushdowns

Status of FDW pushdowns

From

Bruce Momjian

Date:

21 November 2013, 15:05:25

Where are we on the remaining possible pushdowns for foreign data
wrappers, particularly the Postgres one?  I know we do WHERE restriction
pushdowns in 9.3, but what about join and aggregate pushdowns?  Is
anyone working on those?

I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.

Do we document these missing features anywhere?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: Status of FDW pushdowns

From

Merlin Moncure

Date:

21 November 2013, 15:17:54

On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Where are we on the remaining possible pushdowns for foreign data
> wrappers, particularly the Postgres one?  I know we do WHERE restriction
> pushdowns in 9.3, but what about join and aggregate pushdowns?  Is
> anyone working on those?
>
> I know join pushdowns seem insignificant, but it helps to restrict what
> data must be passed back because you would only pass back joined rows.

By 'insignificant' you mean 'necessary to do any non-trivial real
work'.   Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.  But in lieu of that, I'll take as much push down
power as possible :-D.

merlin

Re: Status of FDW pushdowns

From

Kohei KaiGai

Date:

21 November 2013, 15:35:57

2013/11/21 Bruce Momjian <bruce@momjian.us>:
> Where are we on the remaining possible pushdowns for foreign data
> wrappers, particularly the Postgres one?  I know we do WHERE restriction
> pushdowns in 9.3, but what about join and aggregate pushdowns?  Is
> anyone working on those?
>
> I know join pushdowns seem insignificant, but it helps to restrict what
> data must be passed back because you would only pass back joined rows.
>
> Do we document these missing features anywhere?
>
Probably, custom-scan api will provide more flexible way to push-down
aggregate, sort or other stuff performing on regular tables, not only
foreign tables.
It allows extensions to offer alternative scan/join path on the planning
stage, then executor callbacks its custom logic instead of the built-in
one, if its cost is cheaper.

Right now, it performs on relation scan or join only. However, we will be
able to apply same concept on aggregation.
For example, an aggregation node on a foreign table scan is a good
candidate to push down because it can be replaced with a custom-
logic that scans a materialized result of the remote aggregation query,
if its cost is enough cheap than local aggregation.
Probably, we need to add a hook and some logic to compare the
built-in aggregation and alternative paths provided by extensions.
It is also helpful for the people who want to implement something like
"parallel aggregate" performing on regular tables, not only foreign table.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: Status of FDW pushdowns

From

Tom Lane

Date:

21 November 2013, 15:44:46

Kohei KaiGai <kaigai@kaigai.gr.jp> writes:
> Right now, it performs on relation scan or join only. However, we will be
> able to apply same concept on aggregation.
> For example, an aggregation node on a foreign table scan is a good
> candidate to push down because it can be replaced with a custom-
> logic that scans a materialized result of the remote aggregation query,
> if its cost is enough cheap than local aggregation.
> Probably, we need to add a hook and some logic to compare the
> built-in aggregation and alternative paths provided by extensions.

Note that this is another thing that's blocked on Path-ifying the work
now done in grouping_planner.  We don't currently have a way to represent
a local aggregation, much less a remote one, as a Path.  We definitely
need that before we can open up any of that logic to FDWs.
        regards, tom lane

Re: Status of FDW pushdowns

From

Tom Lane

Date:

21 November 2013, 15:46:38

Merlin Moncure <mmoncure@gmail.com> writes:
> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> I know join pushdowns seem insignificant, but it helps to restrict what
>> data must be passed back because you would only pass back joined rows.

> By 'insignificant' you mean 'necessary to do any non-trivial real
> work'.   Personally, I'd prefer it if FDW was extended to allow
> arbitrary parameterized queries like every other database connectivity
> API ever made ever.

[ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.
        regards, tom lane

Re: Status of FDW pushdowns

From

Shigeru Hanada

Date:

22 November 2013, 00:41:18

2013/11/22 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2013/11/21 Bruce Momjian <bruce@momjian.us>:
>> Where are we on the remaining possible pushdowns for foreign data
>> wrappers, particularly the Postgres one?  I know we do WHERE restriction
>> pushdowns in 9.3, but what about join and aggregate pushdowns?  Is
>> anyone working on those?
>>
>> I know join pushdowns seem insignificant, but it helps to restrict what
>> data must be passed back because you would only pass back joined rows.
>>
>> Do we document these missing features anywhere?
>>
> Probably, custom-scan api will provide more flexible way to push-down
> aggregate, sort or other stuff performing on regular tables, not only
> foreign tables.
> It allows extensions to offer alternative scan/join path on the planning
> stage, then executor callbacks its custom logic instead of the built-in
> one, if its cost is cheaper.

IIRC, sort push-down is already supported.  We can provide sorted
pathes by setting Pathkeys to additional ForeignPath.  postgres_fdw
doesn't support this feature because we couldn't get consensus about
how to limit sort variation. One idea was to allow to define "foreign
index" on foreign tables to indicate which column combination is
reasonably sortable.
-- 
Shigeru HANADA

Re: Status of FDW pushdowns

From

Shigeru Hanada

Date:

22 November 2013, 00:43:22

2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
>>> I know join pushdowns seem insignificant, but it helps to restrict what
>>> data must be passed back because you would only pass back joined rows.
>
>> By 'insignificant' you mean 'necessary to do any non-trivial real
>> work'.   Personally, I'd prefer it if FDW was extended to allow
>> arbitrary parameterized queries like every other database connectivity
>> API ever made ever.
>
> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
> following the SQL standard's SQL/MED design, which does not do it
> like that.

Pass-through mode mentioned in SQL/MED standard might be what he wants.

-- 
Shigeru HANADA

Re: Status of FDW pushdowns

From

David Fetter

Date:

22 November 2013, 01:39:12

On Thu, Nov 21, 2013 at 10:46:14AM -0500, Tom Lane wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
> > On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >> I know join pushdowns seem insignificant, but it helps to restrict what
> >> data must be passed back because you would only pass back joined rows.
> 
> > By 'insignificant' you mean 'necessary to do any non-trivial real
> > work'.   Personally, I'd prefer it if FDW was extended to allow
> > arbitrary parameterized queries like every other database connectivity
> > API ever made ever.
> 
> [ shrug... ]  So use dblink.

Not with a non-PostgreSQL data source.

> For better or worse, the FDW stuff is following the SQL standard's
> SQL/MED design, which does not do it like that.

What SQL/MED specifies along this line is purely a caution against
making a specification without a reference implementation.  If I'm
reading it correctly, it's literally impossible to make what they
suggest safe.

Given those givens, we're free to do this in a way that's not
barking-at-the-moon crazy.  At least two inter-database communication
links which work with PostgreSQL do this..

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Merlin Moncure

Date:

22 November 2013, 14:25:14

On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
>>>> I know join pushdowns seem insignificant, but it helps to restrict what
>>>> data must be passed back because you would only pass back joined rows.
>>
>>> By 'insignificant' you mean 'necessary to do any non-trivial real
>>> work'.   Personally, I'd prefer it if FDW was extended to allow
>>> arbitrary parameterized queries like every other database connectivity
>>> API ever made ever.
>>
>> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
>> following the SQL standard's SQL/MED design, which does not do it
>> like that.
>
> Pass-through mode mentioned in SQL/MED standard might be what he wants.

happen to have a link handy?

merlin

Re: Status of FDW pushdowns

From

David Fetter

Date:

22 November 2013, 18:46:15

On Fri, Nov 22, 2013 at 08:25:05AM -0600, Merlin Moncure wrote:
> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
> > 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
> >> Merlin Moncure <mmoncure@gmail.com> writes:
> >>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >>>> I know join pushdowns seem insignificant, but it helps to restrict what
> >>>> data must be passed back because you would only pass back joined rows.
> >>
> >>> By 'insignificant' you mean 'necessary to do any non-trivial real
> >>> work'.   Personally, I'd prefer it if FDW was extended to allow
> >>> arbitrary parameterized queries like every other database connectivity
> >>> API ever made ever.
> >>
> >> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
> >> following the SQL standard's SQL/MED design, which does not do it
> >> like that.
> >
> > Pass-through mode mentioned in SQL/MED standard might be what he wants.
> 
> happen to have a link handy?

http://www.wiscorp.com/sql20nn.zip

You'll want to look at the PDF with MED in its title.

Passthrough mode, which is how the standard "handles" this problem is
basically a thing where you set it to be on, then everything your send
until setting it to off is passed through to the remote side.  The
people writing the standard didn't think too much about the
possibility that the remote side might speak a broader or different
dialect of SQL from the local server.  They also didn't imagine cases
where what's being passed isn't SQL at all.

In addition to breaking any possible parser, the "feature" as
described in the standard is just ripe for un-patchable exploits *in
its design*.

Of all the misdesign-by-committee contained in the standard, this
piece is far and away the stupidest I've encountered to date.  We
should not even vaguely attempt to implement it.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

25 November 2013, 09:17:33

Merlin Moncure <mmoncure@gmail.com> writes:
> By 'insignificant' you mean 'necessary to do any non-trivial real
> work'.   Personally, I'd prefer it if FDW was extended to allow
> arbitrary parameterized queries like every other database connectivity
> API ever made ever.  But in lieu of that, I'll take as much push down
> power as possible :-D.

That sounds more like FOREIGN VIEW and FOREIGN FUNCTION to me, where you
could have the whole control of the local/remote boundaries.

I mean that when planning a query using a FOREIGN VIEW it would probably
make sense to consider it as a CTE as far as the optimizer is concerned.

About FOREIGN FUNCTION, that would allow to inject arbitrary parameters
anywhere in the remote query when doing SQL functions. We have a very
nice version of FOREIGN FUNCTION already, that's plproxy.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Shigeru Hanada

Date:

27 November 2013, 10:20:22

Hi Merlin,

2013/11/22 Merlin Moncure <mmoncure@gmail.com>:
> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
>> 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
>>> Merlin Moncure <mmoncure@gmail.com> writes:
>>>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
>>>>> I know join pushdowns seem insignificant, but it helps to restrict what
>>>>> data must be passed back because you would only pass back joined rows.
>>>
>>>> By 'insignificant' you mean 'necessary to do any non-trivial real
>>>> work'.   Personally, I'd prefer it if FDW was extended to allow
>>>> arbitrary parameterized queries like every other database connectivity
>>>> API ever made ever.
>>>
>>> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
>>> following the SQL standard's SQL/MED design, which does not do it
>>> like that.
>>
>> Pass-through mode mentioned in SQL/MED standard might be what he wants.
>
> happen to have a link handy?

SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction between client.  Besides it, I think it would be
nice to allow arbitrary FDW as backend of dblink interface like this:

postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);

This provides a way to execute query without defining foreign table.
-- 
Shigeru HANADA

Re: Status of FDW pushdowns

From

Alvaro Herrera

Date:

27 November 2013, 13:29:42

Shigeru Hanada escribió:

> SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> about interaction between client.  Besides it, I think it would be
> nice to allow arbitrary FDW as backend of dblink interface like this:
> 
> postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> syntax') as t(/* record type definition */...);
> 
> This provides a way to execute query without defining foreign table.

Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

27 November 2013, 13:59:35

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Seems to me that if you want to read remote tables without creating a
> foreign table, you could define them locally using something like the
> WITH syntax and then use them normally in the rest of the query.

I guess what's needed here is a kind of barrier that allows pushing a
whole arbitrary subquery (with joins and quals and whatnot) down to the
remote side.

My current thinking about how to solve that would be to add a notion of
FOREIGN VIEW in our system, which would basically implement that barrier
and send the view definition on the remote, with known quals values as
constants, or something like that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Merlin Moncure

Date:

27 November 2013, 16:05:39

On Wed, Nov 27, 2013 at 4:20 AM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> Hi Merlin,
>
> 2013/11/22 Merlin Moncure <mmoncure@gmail.com>:
>> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> 2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> Merlin Moncure <mmoncure@gmail.com> writes:
>>>>> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
>>>>>> I know join pushdowns seem insignificant, but it helps to restrict what
>>>>>> data must be passed back because you would only pass back joined rows.
>>>>
>>>>> By 'insignificant' you mean 'necessary to do any non-trivial real
>>>>> work'.   Personally, I'd prefer it if FDW was extended to allow
>>>>> arbitrary parameterized queries like every other database connectivity
>>>>> API ever made ever.
>>>>
>>>> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
>>>> following the SQL standard's SQL/MED design, which does not do it
>>>> like that.
>>>
>>> Pass-through mode mentioned in SQL/MED standard might be what he wants.
>>
>> happen to have a link handy?
>
> SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> about interaction between client.  Besides it, I think it would be
> nice to allow arbitrary FDW as backend of dblink interface like this:
>
> postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> syntax') as t(/* record type definition */...);
>
> This provides a way to execute query without defining foreign table.

yeah.  (thanks for indulging -- this is barely on topic I guess).

if it were possible to create a supporting function (say, fdw_link)
that could somehow interface with a previously established server, it
could probably be worked out.   Then all FDW could leverage
parameterization without having to copy and paste the pgsql-fdw qual
push code.  But that would be a fairly large break from the rest of
the FDW syntax and having to define the record at each call site is
admittedly a bit of a headache.

Hm, another way to think about this would be to somehow abstract the
qual push into a library so that it could be accessed by other FDWs if
they opted in.  This would address my chief complaint that only the
pgsql-fdw (the only database for which we already have an in-core high
quality connection api) driver could tap the excellent work you've
done.  If this were even possible, it would probably result in more
fdw API changes.

If my:

SELECT * FROM big_sql_server_foreign_table WHERE id = x;

was fast, that'd be pretty nice.

merlin

Re: Status of FDW pushdowns

From

Shigeru Hanada

Date:

27 November 2013, 16:53:05

2013/11/27 Dimitri Fontaine <dimitri@2ndquadrant.fr>:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Seems to me that if you want to read remote tables without creating a
>> foreign table, you could define them locally using something like the
>> WITH syntax and then use them normally in the rest of the query.
>
> I guess what's needed here is a kind of barrier that allows pushing a
> whole arbitrary subquery (with joins and quals and whatnot) down to the
> remote side.

Yes, a big problem is how to skip parsing remote query in PG context.
Bare query string (other than string literal) always parsed by PG
parser, but remote side would have different syntax and semantics, as
Dimitri says we need to pass whole of arbitrary query string to remote
side as-is.

> My current thinking about how to solve that would be to add a notion of
> FOREIGN VIEW in our system, which would basically implement that barrier
> and send the view definition on the remote, with known quals values as
> constants, or something like that.

I'm sorry but I don't see the point here.  Do you mean that user
executes CREATE FOREIGN VIEW in advance and uses the view in a
subsequent query? Or, allow new syntax like WITH alias AS FOREIGN VIEW
(remote query)?

I think it's nice to support executing ad-hoc remote query written in
the syntax which is valid only on remote data source through FDW, and
at the moment dblink interface seems feasible for that purpose.

-- 
Shigeru HANADA

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

27 November 2013, 17:22:13

Shigeru Hanada <shigeru.hanada@gmail.com> writes:
> I'm sorry but I don't see the point here.  Do you mean that user
> executes CREATE FOREIGN VIEW in advance and uses the view in a

Yes that's what I mean.

> I think it's nice to support executing ad-hoc remote query written in
> the syntax which is valid only on remote data source through FDW, and
> at the moment dblink interface seems feasible for that purpose.

I guess the view query would have to be validated by the FDW, which
would just receive a text.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Atri Sharma

Date:

27 November 2013, 17:35:27

> I guess the view query would have to be validated by the FDW, which
> would just receive a text.

+1

This is not exactly in context, but I and David Fetter discussed
recently how we could do similar thing.

This would work,but how can we do it for FDWs which do not parse SQL?

Am I missing something here?

Regards,

Atri


-- 
Regards,

Atri
l'apprenant

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

27 November 2013, 17:38:21

Atri Sharma <atri.jiit@gmail.com> writes:
> This would work,but how can we do it for FDWs which do not parse SQL?
> Am I missing something here?

Worst case:
 CREATE FOREIGN VIEW foo AS $$   whatever syntax is accepted on the other side $$;

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Atri Sharma

Date:

27 November 2013, 17:40:13

On Wed, Nov 27, 2013 at 11:08 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Atri Sharma <atri.jiit@gmail.com> writes:
>> This would work,but how can we do it for FDWs which do not parse SQL?
>> Am I missing something here?
>
> Worst case:
>
>   CREATE FOREIGN VIEW foo
>   AS $$
>     whatever syntax is accepted on the other side


That doesnt sound like a very good idea.

Can we add a function to the FDW API to define a SQL to foreign server
side conversion?

I am just musing though.

Regards,

Atri



-- 
Regards,

Atri
l'apprenant

Re: Status of FDW pushdowns

From

David Fetter

Date:

27 November 2013, 19:24:14

On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
> Shigeru Hanada escribió:
> 
> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> > about interaction between client.  Besides it, I think it would be
> > nice to allow arbitrary FDW as backend of dblink interface like this:
> > 
> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> > syntax') as t(/* record type definition */...);
> > 
> > This provides a way to execute query without defining foreign table.
> 
> Seems to me that if you want to read remote tables without creating a
> foreign table, you could define them locally using something like the
> WITH syntax and then use them normally in the rest of the query.

WITH, or SRF, or whatever, the point is that we need to be able to
specify what we're sending--probably single opaque strings delimited
just as we do other strings--and what we might get back--errors only,
rows, [sets of] refcursors are the ones I can think of offhand.

What we can't do is assume that our parser needs to, or even could, in
principle, understand these things in more detail than that.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Atri Sharma

Date:

27 November 2013, 19:59:52

On Thu, Nov 28, 2013 at 12:54 AM, David Fetter <david@fetter.org> wrote:
> On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
>> Shigeru Hanada escribió:
>>
>> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
>> > about interaction between client.  Besides it, I think it would be
>> > nice to allow arbitrary FDW as backend of dblink interface like this:
>> >
>> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
>> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
>> > syntax') as t(/* record type definition */...);
>> >
>> > This provides a way to execute query without defining foreign table.
>>
>> Seems to me that if you want to read remote tables without creating a
>> foreign table, you could define them locally using something like the
>> WITH syntax and then use them normally in the rest of the query.
>
> WITH, or SRF, or whatever, the point is that we need to be able to
> specify what we're sending--probably single opaque strings delimited
> just as we do other strings--and what we might get back--errors only,
> rows, [sets of] refcursors are the ones I can think of offhand.

+1

The input-output formats need to be defined clearly.

How about sending parse trees? Is it even possible?

> What we can't do is assume that our parser needs to, or even could, in
> principle, understand these things in more detail than that.

Agreed.

I wonder if its possible to give this task to the FDW implementing
authority instead, and get FDW to translate to the required format.




--
Regards,

Atri
l'apprenant

Re: Status of FDW pushdowns

From

David Fetter

Date:

27 November 2013, 21:05:32

On Thu, Nov 28, 2013 at 01:29:46AM +0530, Atri Sharma wrote:
> On Thu, Nov 28, 2013 at 12:54 AM, David Fetter <david@fetter.org> wrote:
> > On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
> >> Shigeru Hanada escribió:
> >>
> >> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> >> > about interaction between client.  Besides it, I think it would be
> >> > nice to allow arbitrary FDW as backend of dblink interface like this:
> >> >
> >> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> >> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> >> > syntax') as t(/* record type definition */...);
> >> >
> >> > This provides a way to execute query without defining foreign table.
> >>
> >> Seems to me that if you want to read remote tables without creating a
> >> foreign table, you could define them locally using something like the
> >> WITH syntax and then use them normally in the rest of the query.
> >
> > WITH, or SRF, or whatever, the point is that we need to be able to
> > specify what we're sending--probably single opaque strings delimited
> > just as we do other strings--and what we might get back--errors only,
> > rows, [sets of] refcursors are the ones I can think of offhand.
> 
> +1
> 
> The input-output formats need to be defined clearly.
> 
> How about sending parse trees? Is it even possible?

I don't see why parse trees wouldn't be something that could
eventually be sent to other PostgreSQL servers, but I see that whole
discussion as orthogonal to this one.

My point here is that there needs to be an "escape to native" system
available in SQL so people can communicate directly with the remote
systems in the systems' own languages.  It's a little bit analogous to
making assembler available from C, or C from HLLs.

> > What we can't do is assume that our parser needs to, or even could, in
> > principle, understand these things in more detail than that.
> 
> Agreed.
> 
> I wonder if its possible to give this task to the FDW implementing
> authority instead, and get FDW to translate to the required format.

I don't know that the FDW would necessarily need to get involved
except in the sense of "full recognition before processing."
http://langsec.org/occupy/

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

27 November 2013, 23:49:31

Atri Sharma <atri.jiit@gmail.com> writes:
> Can we add a function to the FDW API to define a SQL to foreign server
> side conversion?

I think that's not tenable. Even if you limit the discussion to the
postgres_fdw, some queries against past version will stop working
against new version (keywords changes, catalogs, default settings, etc).

I don't think you want to embed a full parser of every supported FOREIGN
version of PostgreSQL inside the postgres_fdw code, so I think the text
of the view needs to be an opaque string.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Tom Lane

Date:

28 November 2013, 00:17:32

Dimitri Fontaine <dimitri@2ndquadrant.fr> writes:
> Atri Sharma <atri.jiit@gmail.com> writes:
>> Can we add a function to the FDW API to define a SQL to foreign server
>> side conversion?

> I think that's not tenable. Even if you limit the discussion to the
> postgres_fdw, some queries against past version will stop working
> against new version (keywords changes, catalogs, default settings, etc).

> I don't think you want to embed a full parser of every supported FOREIGN
> version of PostgreSQL inside the postgres_fdw code, so I think the text
> of the view needs to be an opaque string.

I'm not real sure what this'd buy us that wouldn't be done as well or
better by creating a view on the remote side.  (IOW, there's nothing
that says that the remote object backing a foreign table can't be a
view.)
        regards, tom lane

Re: Status of FDW pushdowns

From

Dimitri Fontaine

Date:

28 November 2013, 10:43:54

Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm not real sure what this'd buy us that wouldn't be done as well or
> better by creating a view on the remote side.  (IOW, there's nothing
> that says that the remote object backing a foreign table can't be a
> view.)

Agreed, for those remote sides that know what a view is.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Status of FDW pushdowns

From

Atri Sharma

Date:

29 November 2013, 14:57:57


Sent from my iPad

> On 28-Nov-2013, at 16:13, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote:
>
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I'm not real sure what this'd buy us that wouldn't be done as well or
>> better by creating a view on the remote side.  (IOW, there's nothing
>> that says that the remote object backing a foreign table can't be a
>> view.)
>
> Agreed, for those remote sides that know what a view is.

I agree.

I agree with the overall model here, but I am not sure how it would work out for non SQL supporting remote sides.

Regards,
Atri

Re: Status of FDW pushdowns

From

Craig Ringer

Date:

03 December 2013, 03:16:08

On 11/28/2013 03:24 AM, David Fetter wrote:
> WITH, or SRF, or whatever, the point is that we need to be able to
> specify what we're sending--probably single opaque strings delimited
> just as we do other strings--and what we might get back--errors only,
> rows, [sets of] refcursors are the ones I can think of offhand.

So, you're thinking of something like:

WITH FOREIGN somecte AS $$... foreign query ...$$
SELECT ...
FROM somecte;

?


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Status of FDW pushdowns

From

David Fetter

Date:

03 December 2013, 04:26:40

On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote:
> On 11/28/2013 03:24 AM, David Fetter wrote:
> > WITH, or SRF, or whatever, the point is that we need to be able to
> > specify what we're sending--probably single opaque strings delimited
> > just as we do other strings--and what we might get back--errors only,
> > rows, [sets of] refcursors are the ones I can think of offhand.
> 
> So, you're thinking of something like:
> 
> WITH FOREIGN somecte AS $$... foreign query ...$$
> SELECT ...
> FROM somecte;

I was picturing something a little more like an SRF which would take
one opaque string, the remote command, some descriptor, perhaps an
enum, of what if anything might come back.  Long ago, I implemented a
similar thing in DBI-Link.  It was called

remote_exec_dbh(data_source_id integer, query text, returns_rows bool)

That covered only some of the cases I've come up with since.  My
current idea is something more like

remote_execute(   data_source_id integer, /* This corresponds to a "handle" in SQL/MED-speak */   query text,   returns
enum(      'void',       'some_type',       'setof some_type',       'setof record',       'refcursor',       'setof
refcursor'  )

)

This could appear in a FROM or WITH clause.  There might also be some
aliasing and/or syntactic sugar along the lines of remote_execute_void(),
remote_execute_rows(), etc.  Given the possibility of cooperation with 
the planner and executor, we might want to extend some attributes like
sortedness where applicable.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Merlin Moncure

Date:

04 December 2013, 18:43:50

On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote:
> On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote:
>> On 11/28/2013 03:24 AM, David Fetter wrote:
>> > WITH, or SRF, or whatever, the point is that we need to be able to
>> > specify what we're sending--probably single opaque strings delimited
>> > just as we do other strings--and what we might get back--errors only,
>> > rows, [sets of] refcursors are the ones I can think of offhand.
>>
>> So, you're thinking of something like:
>>
>> WITH FOREIGN somecte AS $$... foreign query ...$$
>> SELECT ...
>> FROM somecte;
>
> I was picturing something a little more like an SRF which would take
> one opaque string, the remote command, some descriptor, perhaps an
> enum, of what if anything might come back.  Long ago, I implemented a
> similar thing in DBI-Link.  It was called
>
> remote_exec_dbh(data_source_id integer, query text, returns_rows bool)

Couple thoughts:
*) Any 'pass through' API should support parameterization (the FDW may
not support that, but many will and API should allow for it).   Lack
of parameterization is a major downside of dblink.  The function could
be set up to be variadic for the parameters.

*) For a connectivity APIs of this style, Dblink-ish mechanic of
separating command execution from data returning commands is likely
the right way to go.  Also, probably better to stick with SRF
mechanics if we go the 'function route'.  So basically we are making
dblink for FDW, adding parameterization and some concept of utilizing
the foreign server.

All this is assuming we are adding a special remote execution function
('fdwlink').  While that would be great, it's a significant deviation
from the standard into postgresql specific SRF syntax.   If some of
the qual pushdown deparsing functionality could be put inside the
internal FDW API, then you'd get the best of both worlds.  Maybe you'd
still want a dblink style extension anyways, but it wouldn't be as
critical.

merlin

Re: Status of FDW pushdowns

From

David Fetter

Date:

04 December 2013, 19:39:33

On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote:
> On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote:
> > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote:
> >> On 11/28/2013 03:24 AM, David Fetter wrote:
> >> > WITH, or SRF, or whatever, the point is that we need to be able to
> >> > specify what we're sending--probably single opaque strings delimited
> >> > just as we do other strings--and what we might get back--errors only,
> >> > rows, [sets of] refcursors are the ones I can think of offhand.
> >>
> >> So, you're thinking of something like:
> >>
> >> WITH FOREIGN somecte AS $$... foreign query ...$$
> >> SELECT ...
> >> FROM somecte;
> >
> > I was picturing something a little more like an SRF which would take
> > one opaque string, the remote command, some descriptor, perhaps an
> > enum, of what if anything might come back.  Long ago, I implemented a
> > similar thing in DBI-Link.  It was called
> >
> > remote_exec_dbh(data_source_id integer, query text, returns_rows bool)
> 
> Couple thoughts:
> *) Any 'pass through' API should support parameterization (the FDW may
> not support that, but many will and API should allow for it).   Lack
> of parameterization is a major downside of dblink.  The function could
> be set up to be variadic for the parameters.

I don't know for sure that that needs to be in version 1 of this.  It
definitely shouldn't block implementing the non-parameterized one.

> *) For a connectivity APIs of this style, Dblink-ish mechanic of
> separating command execution from data returning commands is likely
> the right way to go.  Also, probably better to stick with SRF
> mechanics if we go the 'function route'.  So basically we are making
> dblink for FDW, adding parameterization and some concept of utilizing
> the foreign server.

Yes, modulo the above.

> All this is assuming we are adding a special remote execution function
> ('fdwlink').  While that would be great, it's a significant deviation
> from the standard into postgresql specific SRF syntax.

What the standard has is literally insane.

> If some of the qual pushdown deparsing functionality could be put
> inside the internal FDW API, then you'd get the best of both worlds.

If this were flawless on the PostgreSQL side (i.e. our path generator
understood everything perfectly including aggregates) and trivial to
implement correctly in FDWs, certainly.

The idea here is that such a happy situation will not obtain until
much later, if ever, and meanwhile, we need a way to get things
accomplished even if it's inelegant, inefficient, etc.  The
alternative is that those things simply will not get accomplished at
all.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Tom Lane

Date:

04 December 2013, 20:04:55

David Fetter <david@fetter.org> writes:
> The idea here is that such a happy situation will not obtain until
> much later, if ever, and meanwhile, we need a way to get things
> accomplished even if it's inelegant, inefficient, etc.  The
> alternative is that those things simply will not get accomplished at
> all.

If that's the argument, why not just use dblink or dbilink, and be
happy?  This discussion sounds a whole lot like it's trending to a
conclusion of wanting one of those in core, which is not where
I'd like to end up.
        regards, tom lane

Re: Status of FDW pushdowns

From

Merlin Moncure

Date:

04 December 2013, 20:18:28

On Wed, Dec 4, 2013 at 1:39 PM, David Fetter <david@fetter.org> wrote:
> On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote:
>> On Mon, Dec 2, 2013 at 10:26 PM, David Fetter <david@fetter.org> wrote:
>> > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote:
>> >> On 11/28/2013 03:24 AM, David Fetter wrote:
>> >> > WITH, or SRF, or whatever, the point is that we need to be able to
>> >> > specify what we're sending--probably single opaque strings delimited
>> >> > just as we do other strings--and what we might get back--errors only,
>> >> > rows, [sets of] refcursors are the ones I can think of offhand.
>> >>
>> >> So, you're thinking of something like:
>> >>
>> >> WITH FOREIGN somecte AS $$... foreign query ...$$
>> >> SELECT ...
>> >> FROM somecte;
>> >
>> > I was picturing something a little more like an SRF which would take
>> > one opaque string, the remote command, some descriptor, perhaps an
>> > enum, of what if anything might come back.  Long ago, I implemented a
>> > similar thing in DBI-Link.  It was called
>> >
>> > remote_exec_dbh(data_source_id integer, query text, returns_rows bool)
>>
>> Couple thoughts:
>> *) Any 'pass through' API should support parameterization (the FDW may
>> not support that, but many will and API should allow for it).   Lack
>> of parameterization is a major downside of dblink.  The function could
>> be set up to be variadic for the parameters.
>
> I don't know for sure that that needs to be in version 1 of this.  It
> definitely shouldn't block implementing the non-parameterized one.

I'm not making the case it should be version anything.  But, if you
went dblink style, you'd want to go variadic.  It's not really any
extra work and you can always embed the string if the FDW driver
doesn't support parameterization.

> What the standard has is literally insane.

Not sure I agree.  The guiding principle of the standard
implementation AIUI is that it wants to connectivity management via
syntax and keep the DML abstractions clean (minus some
un-implementable things like RI triggers).  In other words, you write
exactly the same queries for native and foreign tables.  This makes
things much easier for people who just want to write SQL the classical
way and not get into funky vendor specific APIs.

The downside of SQL-MED, particularly the way postgres implemented the
driver API, is that each driver is responsible for for all
optimization efforts and I think this is bad.  So I'm openly wondering
if the FDW API should expose optional query rewriting hooks.  The
odbc-fdw and jdbc-fdw drivers for example could then benefit from
those hooks so that qual pushdown could be implemented with far less
code duplication and effort and a *much* broader set of problems could
be addressed by FDW.  For non- or exotic- SQL implementations those
hooks could be implemented locally by the driver or disabled if
doesn't make sense to use them.

merlin

Re: Status of FDW pushdowns

From

Tom Lane

Date:

04 December 2013, 20:23:46

Merlin Moncure <mmoncure@gmail.com> writes:
> The downside of SQL-MED, particularly the way postgres implemented the
> driver API, is that each driver is responsible for for all
> optimization efforts and I think this is bad.

There was never any intention that that would be the final state of
things.  All the FDW APIs are quite experimental at this point, and
subject to change, and one of the reasons for change is going to be
to improve the optimization situation.

At the same time, it's hard to say what might constitute optimization
for FDWs that aren't backed by a remote SQL database.  There are always
going to be reasons why an FDW will have to do some of that work for
itself.
        regards, tom lane

Re: Status of FDW pushdowns

From

David Fetter

Date:

05 December 2013, 00:28:42

On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > The idea here is that such a happy situation will not obtain until
> > much later, if ever, and meanwhile, we need a way to get things
> > accomplished even if it's inelegant, inefficient, etc.  The
> > alternative is that those things simply will not get accomplished
> > at all.
> 
> If that's the argument, why not just use dblink or dbilink, and be
> happy?  This discussion sounds a whole lot like it's trending to a
> conclusion of wanting one of those in core, which is not where I'd
> like to end up.

Telling people who've already installed and configured an FDW that for
perfectly ordinary expected functionality they'll need to install yet
another piece of software, configure it, keep its configuration in
sync with the FDW configuration, etc., is just a ridiculous.  So yes,
we do need this functionality and it does need to be part of our FDW
implementation.

Just exactly where we draw the line between built-ins and APIs is the
conversation I thought we were having.  The minimal thing would be
providing database handles per SQL/MED and a few tools to manipulate
same.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Status of FDW pushdowns

From

Tom Lane

Date:

05 December 2013, 01:02:55

David Fetter <david@fetter.org> writes:
> On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote:
>> If that's the argument, why not just use dblink or dbilink, and be
>> happy?  This discussion sounds a whole lot like it's trending to a
>> conclusion of wanting one of those in core, which is not where I'd
>> like to end up.

> Telling people who've already installed and configured an FDW that for
> perfectly ordinary expected functionality they'll need to install yet
> another piece of software, configure it, keep its configuration in
> sync with the FDW configuration, etc., is just a ridiculous.

Perfectly ordinary expected functionality according to who?  Not the
SQL standard, for sure.
        regards, tom lane