Re: Federated Postgresql architecture ? - Mailing list pgsql-performance

From Chris Browne
Subject Re: Federated Postgresql architecture ?
Date
Msg-id 60zlp6zqsm.fsf@dba2.int.libertyrms.com
Whole thread Raw
In response to Federated Postgresql architecture ?  (kevin kempter <kevin@kevinkempterllc.com>)
Responses Re: Federated Postgresql architecture ?
List pgsql-performance
josh@agliodbs.com (Josh Berkus) writes:
> Jonah,
>
>> Hmm, I didn't think the Skype tools could really provide federated
>> database functionality without a good amount of custom work.  Or, am I
>> mistaken?
>
> Sure, what do you think pl/proxy is for?

Ah, but the thing is, it changes the model from a relational one,
where you can have fairly arbitrary "where clauses," to one where
parameterization of queries must be predetermined.

The "hard part" of federated database functionality at this point is
the [parenthesized portion] of...

  select * from table@node [where criterion = x];

What we'd like to be able to do is to ascertain that [where criterion
= x] portion, and run it on the remote DBMS, so that only the relevant
tuples would come back.

Consider...

What if table@node is a remote table with 200 million tuples, and
[where criterion = x] restricts the result set to 200 of those.

If you *cannot* push the "where clause" down to the remote node, then
you're stuck with pulling all 200 million tuples, and filtering out,
on the "local" node, the 200 tuples that need to be kept.

To do better, with pl/proxy, requires having a predetermined function
that would do that filtering, and if it's missing, you're stuck
pulling 200M tuples, and throwing out nearly all of them.

In contrast, with the work David Fetter's looking at, the [where
criterion = x] clause would get pushed to the node which the data is
being drawn from, and so the query, when running on "table@node,"
could use indices, and return only the 200 tuples that are of
interest.

It's a really big win, if it works.
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/lisp.html
"The avalanche has started, it is too late for the pebbles to vote"
-- Kosh, Vorlon Ambassador to Babylon 5

pgsql-performance by date:

Previous
From: "Sérgio R F Oliveira"
Date:
Subject: Sources of information about sizing of hardwares to run PostgreSQL
Next
From: Ulrich
Date:
Subject: Subquery WHERE IN or WHERE EXISTS faster?