Re: IMPORT FOREIGN SCHEMA statement - Mailing list pgsql-hackers
From | David Fetter |
---|---|
Subject | Re: IMPORT FOREIGN SCHEMA statement |
Date | |
Msg-id | 20140525194118.GB32355@fetter.org Whole thread Raw |
In response to | IMPORT FOREIGN SCHEMA statement (Ronan Dunklau <ronan.dunklau@dalibo.com>) |
Responses |
Re: IMPORT FOREIGN SCHEMA statement
Re: IMPORT FOREIGN SCHEMA statement |
List | pgsql-hackers |
On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote: > Hello, > > Since my last proposal didn't get any strong rebuttal, please find attached a > more complete version of the IMPORT FOREIGN SCHEMA statement. Thanks! Please to send future patches to this thread so people can track them in their mail. > I tried to follow the SQL-MED specification as closely as possible. > > This adds discoverability to foreign servers. The structure of the statement > as I understand it is simple enough: > > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO | > EXCEPT) table_list ] INTO local_schema. > > The import_foreign_schema patch adds the infrastructure, and a new FDW > routine: > > typedef List *(*ImportForeignSchema_function) (ForeignServer *server, > ImportForeignSchemaStmt * parsetree); > > This routine must return a list of CreateForeignTableStmt mirroring whatever > tables were found on the remote side, which will then be executed. > > The import_foreign_schema_postgres_fdw patch proposes an implementation of > this API for postgres_fdw. It will import a foreign schema using the right > types as well as nullable information. In the case of PostgreSQL, "the right types" are obvious until there's a user-defined one. What do you plan to do in that case? > Regarding documentation, I don't really know where it should have been put. If > I missed something, let me know and I'll try to correct it. It's not exactly something you missed, but I need to bring it up anyway before we go too far. The standard missed two crucial concepts when this part of it was written: 1. No single per-database-type universal type mapping can be correct. People will have differing goals for type mapping, and writing a whole new FDW for each of those goals is, to put it mildly, wasteful. I will illustrate with a concrete and common example. MySQL's datetime type encourages usages which PostgreSQL's corresponding type, timestamptz, simply disallows, namely '0000-00-00 00:00:00' as its idea of UNKNOWN or NULL. One way PostgreSQL's mapping could work is to map it to TEXT, which would preserve the strings exactly and be in some sense an identity map. It would also make the type somewhat useless in its original intended form. Another one would map the type is to a composite data type mysql_datetime(tstz timestamptz, is_wacky boolean) which would capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC start of April Fools' Day this year, and (NULL, true) for '0000-00-00 00:00:00'. There are doubtless others, and there is no principled way to assign any one of them as universally correct. This brings me to the next crucial concept the standard missed: 2. The correct mapping may not be the identity, and furthermore, the inbound and outbound mappings might in general not be mere inversions of each other. MySQL (no aspersions intended) again provides a concrete example with its unsigned integer types. Yes, it's possible to create a domain over INT8 which simulates UINT4, a domain over NUMERIC which simulates UINT8, etc., but again this process's correctness depends on circumstances. To address these problems, I propose the following: - We make type mappings settable at the level of: - FDW - Instance (a.k.a. cluster) - Database - Schema - Table - Column using the existing ALTER command and some way of spelling out how a remote type maps to a local type. This wouldconsist of: - The remote type - The local type to which it maps - The inbound transformation (default identity) - The outbound transformation (default identity) At any given level, the remote type would need to be unique. To communicate this to the system, we either invent newsyntax, with all the hazards attendant thereto, or we could use JSON or similar serialization. ALTER FOREIGN TABLE foo ADD TYPE MAPPING FROM "datetime" TO TEXT WITH ( INBOUND TRANSFORMATIONIDENTITY, OUTBOUND TRANSFORMATION IDENTITY ) /* Ugh!!! */ vs. ALTER FOREIGN TABLE foo ADD (mapping '{ "datetime": "text", "inbound": "IDENTITY", outbound: "IDENTITY" }') Each FDW would have some set of default mappings and some way to override them as above. What say? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
pgsql-hackers by date: