Re: FDW for PostgreSQL - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: FDW for PostgreSQL
Msg-id: 22766.1360859657@sss.pgh.pa.us
In response to: Re: FDW for PostgreSQL (Shigeru Hanada <shigeru.hanada@gmail.com>)
Responses: Re: FDW for PostgreSQL
List: pgsql-hackers
Shigeru Hanada <shigeru.hanada@gmail.com> writes:
> On Thu, Feb 14, 2013 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * AFAICT, the patch expects to use a single connection for all
>> operations initiated under one foreign server + user mapping pair.
>> I don't think this can possibly be workable.  For instance, we don't
>> really want postgresIterateForeignScan executing the entire remote query
>> to completion and stashing the results locally -- what if that's many
>> megabytes?

> It uses single-row-mode of libpq and TuplestoreState to keep result
> locally, so it uses limited memory at a time.  If the result is larger
> than work_mem, overflowed tuples are written to temp file.  I think
> this is similar to materializing query results.

Well, yeah, but that doesn't make it an acceptable solution.  Consider
for instance "SELECT * FROM huge_foreign_table LIMIT 10".  People are
not going to be satisfied if that pulls back the entire foreign table
before handing them the 10 rows.  Comparable performance problems can
arise even without LIMIT, for instance in handling of nestloop inner
scans.

>> I think we'd better be prepared to allow multiple similar connections.

> Main reason to use single connection is to make multiple results
> retrieved from same server in a local query consistent.

Hmm.  That could be a good argument, although the current patch pretty
much destroys any such advantage by being willing to use READ COMMITTED
mode on the far end --- with that, you lose any right to expect
snapshot-consistent data anyway.  I'm inclined to think that maybe we
should always use at least REPEATABLE READ mode, rather than blindly
copying the local transaction mode.  Or maybe this should be driven by
a foreign-server option instead of looking at the local mode at all?

Anyway, it does seem like maybe we need to use cursors so that we can
have several active scans that we are pulling back just a few rows at
a time from.
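[Editor's illustration: the cursor approach described above would amount to the FDW issuing remote commands along the following lines. This is a hedged sketch, not code from the patch; the cursor names c1/c2 and the second table name are made up. Two scans stay active on one connection, each pulling only as many rows as the local plan actually needs, and REPEATABLE READ gives both a single consistent snapshot.]

```sql
-- Open the remote transaction at REPEATABLE READ so that all scans
-- on this connection see one consistent snapshot.
BEGIN ISOLATION LEVEL REPEATABLE READ;

DECLARE c1 CURSOR FOR SELECT * FROM huge_foreign_table;
FETCH 10 FROM c1;   -- only the rows the local LIMIT actually needs

-- A second active scan on the same connection, e.g. driven by a
-- nestloop inner scan (table name is illustrative).
DECLARE c2 CURSOR FOR SELECT * FROM another_foreign_table;
FETCH 10 FROM c2;

CLOSE c1;
CLOSE c2;
COMMIT;
```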
I'm not convinced that that gets us out of the woods though WRT needing
only one connection.  Consider a query that is scanning some foreign
table, and it calls a plpgsql function, and that function (inside an
EXCEPTION block) does a query that scans another foreign table on the
same server.  This second query gets an error on the remote side.  If
the error is caught via the exception block, and the outer query
continues, what then?

We could imagine adding enough infrastructure to establish a remote
savepoint for each local subtransaction and clean things up on failure,
but no such logic is in the patch now, and I think it wouldn't be too
simple either.  The least painful way to make this scenario work, given
what's in the patch, is to allow such a subtransaction to use a
separate connection.

In any case, I'm pretty well convinced that the connection-bookkeeping
logic needs a major rewrite to have any hope of working in
subtransactions.  I'm going to work on that first and see where it
leads.

>> * I find postgres_fdw_get_connections() and postgres_fdw_disconnect()
>> to be a bad idea altogether.

> I agree that separate the issue from FDW core.

OK, so we'll drop these from the current version of the patch and
revisit the problem of closing connections later.

regards, tom lane