FDW-based dblink - Mailing list pgsql-hackers
From | Itagaki Takahiro |
---|---|
Subject | FDW-based dblink |
Date | |
Msg-id | 20090813184143.7714.52131E4D@oss.ntt.co.jp Whole thread Raw |
Responses |
Re: FDW-based dblink
Re: FDW-based dblink |
List | pgsql-hackers |
Here is a proposal to integrate contrib/dblink and SQL/MED (foreign data wrapper). Dblink manages connections and transactions by itself at the moment, but there are some benefits to split database connectors into FDW. Dblink will uses those multiple connectors. For example, we will be able to retrieve data from Oracle into PostgreSQL directly if we had Oracle-connector. New syntax in SQL ------------------- CREATE FOREIGN DATA WRAPPER postgres CONNECTOR pg_catalog.dblink_postgres; or CREATE FOREIGN DATA WRAPPER postgres OPTIONS (connector 'pg_catalog.dblink_postgres') We don't have to modify gram.y if we take the latter syntax, but need to modify VALIDATORs to distinguish 'connector' and other options. The 'connector' option should not be passed as connection string. New interface in C -------------------- pg_catalog.dblink_postgres is a function that havs folloing prototype: Connection *connector(List *defElems); The argument 'defElems' is a concatenated connection options in FDW, server, and user-mapping. Also new two interfaces will be introduced: interface Connection /* represents PGconn */ { void disconnect(self); Cursor *open(self, query, fetchsize);/* for SELECT */ int64 exec(self, query); /* for UPDATE, INSERT, DELETE */ bool transaction_command(self,type); } interface Cursor /* represents PGresult and server-side cursor */ { bool fetch(self, OUT values); voidclose(self); } They have some methods implemented with function pointers. The benefit of using function pointers is that we only have to export one connector function to pg_proc. The Cursor interface represents both result-set and server-side cursor. PostgreSQL has SQL-level cursor, but there are some database that have protocol-level cursor. This abstraction layer is needed for dblink to handle connectors to other databases. Other features ---------------- Present dblink is a thin wrapper of libpq, but some of my customers want automatic transaction managements. Remote transactions are committed with 2PC when the local transaction is committed. To achieve it, I think we need on-commit trigger is needed, but it is hard to implement with current infrastructure. (That is one of the reason I proposed to merge dblink into core.) Other considerations ---------------------- The proposed method is a SQL-based connector. There might be another approach -- ScanKey-based connector. It is similar to the index access method interface (pg_am). It takes relation id and scankeys instead of SQL text. The scanKey-based approach will work better if we try to pass WHERE-clause to an external database. However, I think we need SQL-based interface in any case. ScanKey will be converted to SQL and passed to an external database. I have a prototype of the feature. I'd like to submit it for 8.5. Comments welcome. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
pgsql-hackers by date: