Re: FDW for PostgreSQL - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | Re: FDW for PostgreSQL |
Date | |
Msg-id | CAEZqfEeFmamNB03enZbhuJEp3j4TpTtbHdgBDmeS2rJ+h5dAHQ@mail.gmail.com Whole thread Raw |
In response to | Re: FDW for PostgreSQL (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: FDW for PostgreSQL
Re: FDW for PostgreSQL |
List | pgsql-hackers |
On Thu, Feb 14, 2013 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > * The code seems to always use GetOuterUserId() to select the foreign > user mapping to use. This seems entirely wrong. For instance it will > do the wrong thing inside a SECURITY DEFINER function, where surely the > relevant privileges should be those of the function owner, not the > session user. I would also argue that if Alice has access to a foreign > table owned by Bob, and Alice creates a view that selects from that > table and grants select privilege on the view to Charlie, then when > Charlie selects from the view the user mapping to use ought to be > Alice's. (If anyone thinks differently about that, speak up!) Agreed that OuterUserId is wrong for user mapping. Also agreed that Charlie doesn't need his own mapping for the server, if he is accessing via a valid view. > To implement that for queries, we need code similar to what > ExecCheckRTEPerms does, ie "rte->checkAsUser ? rte->checkAsUser : > GetUserId()". It's a bit of a pain to get hold of the RTE from > postgresGetForeignRelSize or postgresBeginForeignScan, but it's doable. > (Should we modify the APIs for these functions to make that easier?) This issue seems not specific to postgres_fdw. Currently GetUserMapping takes userid and server's oid as parameters, but we would able to hide the complex rule by replacing userid with RTE or something. > I think possibly postgresAcquireSampleRowsFunc should use the foreign > table's owner regardless of the current user ID - if the user has > permission to run ANALYZE then we don't really want the command to > succeed or fail depending on exactly who the user is. That's perhaps > debatable, anybody have another theory? +1. This allows non-owner to ANALYZE foreign tables without having per-user mapping, though public mapping also solves this issue. In implementation level, postgresAcquireSampleRowsFunc has Relation of the target table, so we can get owner's oid by reading rd_rel->relowner. > * AFAICT, the patch expects to use a single connection for all > operations initiated under one foreign server + user mapping pair. > I don't think this can possibly be workable. For instance, we don't > really want postgresIterateForeignScan executing the entire remote query > to completion and stashing the results locally -- what if that's many > megabytes? It uses single-row-mode of libpq and TuplestoreState to keep result locally, so it uses limited memory at a time. If the result is larger than work_mem, overflowed tuples are written to temp file. I think this is similar to materializing query results. > It ought to be pulling the rows back a few at a time, and > that's not going to work well if multiple scans are sharing the same > connection. (We might be able to dodge that by declaring a cursor > for each scan, but I'm not convinced that such a solution will scale up > to writable foreign tables, nested queries, subtransactions, etc.) Indeed the FDW used CURSOR in older versions. Sorry for that I have not looked writable foreign table patch closely yet, but it would require (may be multiple) remote update query executions during scanning? > I think we'd better be prepared to allow multiple similar connections. > The main reason I'm bringing this up now is that it breaks the > assumption embodied in postgres_fdw_get_connections() and > postgres_fdw_disconnect() that foreign server + user mapping can > constitute a unique key for identifying connections. However ... Main reason to use single connection is to make multiple results retrieved from same server in a local query consistent. Shared snapshot might be helpful for this consistency issue, but I've not tried that with FDW. > * I find postgres_fdw_get_connections() and postgres_fdw_disconnect() > to be a bad idea altogether. These connections ought to be a hidden > implementation matter, not something that the user has a view of, much > less control over. Aside from the previous issue, I believe it's a > trivial matter to crash the patch as it now stands by applying > postgres_fdw_disconnect() to a connection that's in active use. I can > see the potential value in being able to shut down connections when a > session has stopped using them, but this is a pretty badly-designed > approach to that. I suggest that we just drop these functions for now > and revisit that problem later. (One idea is some sort of GUC setting > to control how many connections can be held open speculatively for > future use.) Actually these functions follows dblink's similar functions, but having them is a bad decision because FDW can't connect explicitly. As you mentioned, postgres_fdw_disconnect is provided for clean shutdown on remote side (I needed it in my testing). I agree that separate the issue from FDW core. > * deparse.c contains a depressingly large amount of duplication of logic > from ruleutils.c, and can only need more as we expand the set of > constructs that can be pushed to the remote end. This doesn't seem like > a maintainable approach. Was there a specific reason not to try to use > ruleutils.c for this? I'd much rather tweak ruleutils to expose some > additional APIs, if that's what it takes, than have all this redundant > logic. I got a comment about that issue from you, but I might have misunderstood. (2012/03/09 1:18), Tom Lane wrote: > I've been looking at this patch a little bit over the past day or so. > I'm pretty unhappy with deparse.c --- it seems like a real kluge, > inefficient and full of corner-case bugs. After some thought I believe > that you're ultimately going to have to abandon depending on ruleutils.c > for reverse-listing services, and it would be best to bite that bullet > now and rewrite this code from scratch. I thought that writing ruleutils-free SQL constructor in postgres_fdw is better approach because ruleutils might be changed from its own purpose in future. Besides that, as Kaigai-san mentioned in upthread, ruleutils seems to have insufficient capability for building remote PostgreSQL query. -- Shigeru HANADA
pgsql-hackers by date: