Thread: Re: [GENERAL] dblink - custom datatypes don't work
[ moving to HACKERS ] Mark Gibson wrote: > Joe Conway wrote: >> Mark Gibson wrote: >>> [custom datatype oid mismatch between local and remote side of >>> dblink ] >> >> Without actually having tried it, I think you could hack >> pgresultGetTupleDesc() in dblink.c. Replace the line: atttypid = >> PQftype(res, i); with an SPI based local lookup using attname. >> >> But I'm not sure if it is really a good idea in general to assume >> that the same name for a datatype in two databases means that they >> are actually the same datatype. It would be nice if there was a way >> to register a "signature" of some kind for custom datatypes that >> would be unique enough to be sure same named types were actually >> the same. >> > But matching datatypes by name would surely be safer than matching by > oid - it's not impossible that different datatypes on each side of > the dblink share the same oid. Infact, would it not be more likely > that two datatypes with the same name are actually the same datatype, > than two types with the same oid, not counting internal types. But > then again, are internal types guaranteed to share the same oid > between PostgreSQL versions? Sounds reasonable -- any other thoughts out there? The extra SPI lookup does add cost to every use of the function though. We'd want to figure out how to cache the results of the lookup. > Could a config option be added somewhere to switch between oid and > name matching. With name matching it could be a policy of dblink > usage that datatype names must remain consistent between db's - but I > think this would generally be a good thing away :) I'd be inclined to say that if matching on type name is thought to be better, we ought to just go that way wholesale. Joe
Joe Conway wrote: >>> Without actually having tried it, I think you could hack >>> pgresultGetTupleDesc() in dblink.c. Replace the line: atttypid = >>> PQftype(res, i); with an SPI based local lookup using attname. >> Hi, Well I've been adventuring into the realms of SPI and dblink over the last couple of days. I've taken the approach of having a table that maps remote oids to local oids: CREATE TABLE public.dblink_oid_map ( db_id name, -- unique identifier for a remote database (see notes below) remote_oid oid, local_oid oid, typname name, -- name of the datatype on remote database (aids error reporting) PRIMARY KEY (db_id,remote_oid) ) WITHOUT OIDS; I've modified dblink.c to look up the local oid from this table (see attached patch, against 7.4.1). By default all dblink functions will use this table. The table can be built using dblink itself, but to avoid a chicken-egg situation I've created a new variation on the dblink function: CREATE OR REPLACE FUNCTION public.dblink (text,text,bool) RETURNS setof record AS '$libdir/dblink','dblink_record' LANGUAGE 'C' WITH (isstrict); The last arg can be set to FALSE to disable the oid map. This allows the following function to build the oid map for a remote db: (The arg is a connection string passed to dblink) CREATE OR REPLACE FUNCTION public.dblink_update_oid_map(text) RETURNS void AS ' DECLARE v_connstr ALIAS FOR $1; v_db_id name := ''test''; BEGIN DELETE FROM dblink_oid_map WHERE db_id = v_db_id; INSERT INTO dblink_oid_map (db_id, remote_oid, local_oid, typname) SELECT v_db_id, r.oid, l.oid, r.typname FROM dblink(v_connstr, ''SELECT oid, typname FROM pg_type WHERE typtype IN (''''b'''',''''d'''')'', false ) AS r (oid oid, typname name) LEFT JOIN pg_type l ON (l.typname = r.typname); RETURN; END; ' LANGUAGE 'plpgsql'; I wasn't sure what to do about datatypes in different schemas, this approach allows an admin to customise the map. I usually have all custom types in 'public' so it isn't a problem for me. All datatypes from the remote database are included in the table, so in some cases 'local_oid' may be NULL - the patch handles this by reporting an error informing the user that the datatype doesn't exist locally, hence the 'typname' column. We need to find a way to uniquely identify a remote database, and create a consistent id algorythm for the 'db_id' column. A hash of the connection string attributes: 'host'/'hostaddr' + 'dbname' + 'port' ??? For testing i've used an db_id of type 'name' with the value: 'test'. Unfortunately, it doesn't work using the oid the map, whether custom types are involved or not. All I get is the following message: ERROR: unsupported byval length: nnnn SPI is very new to me (like 2 days old ;). Any suggestions where I've gone wrong? Cheers -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England. --- dblink.c-741 2003-11-28 05:03:48.000000000 +0000 +++ dblink.c 2004-02-10 16:31:11.305534520 +0000 @@ -82,7 +82,7 @@ static int16 get_attnum_pk_pos(int16 *pkattnums, int16 pknumatts, int16 key); static HeapTuple get_tuple_of_interest(Oid relid, int16 *pkattnums, int16 pknumatts, char **src_pkattvals); static Oid get_relid_from_relname(text *relname_text); -static TupleDesc pgresultGetTupleDesc(PGresult *res); +static TupleDesc pgresultGetTupleDesc(PGresult *res, const bool use_oid_map); static char *generate_relation_name(Oid relid); /* Global */ @@ -457,7 +457,7 @@ if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) - tupdesc = pgresultGetTupleDesc(res); + tupdesc = pgresultGetTupleDesc(res, true); else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); @@ -550,6 +550,7 @@ char *sql = NULL; char *conname = NULL; remoteConn *rcon = NULL; + bool use_oid_map = true; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); @@ -559,20 +560,25 @@ * calls */ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); - - if (PG_NARGS() == 2) - { + + switch (PG_NARGS()) { + case 3: + if (!PG_ARGISNULL(2)) + use_oid_map = PG_GETARG_BOOL(2); + case 2: DBLINK_GET_CONN; sql = GET_STR(PG_GETARG_TEXT_P(1)); - } - else if (PG_NARGS() == 1) - { + break; + + case 1: conn = persistent_conn; sql = GET_STR(PG_GETARG_TEXT_P(0)); - } - else + break; + + default: /* shouldn't happen */ elog(ERROR, "wrong number of arguments"); + } if (!conn) DBLINK_CONN_NOT_AVAIL; @@ -620,7 +626,7 @@ if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) - tupdesc = pgresultGetTupleDesc(res); + tupdesc = pgresultGetTupleDesc(res, use_oid_map); else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); @@ -1802,7 +1808,7 @@ } static TupleDesc -pgresultGetTupleDesc(PGresult *res) +pgresultGetTupleDesc(PGresult *res, const bool use_oid_map) { int natts; AttrNumber attnum; @@ -1813,6 +1819,14 @@ bool attisset; Oid atttypid; int i; + + void *plan = NULL; + Oid arg[2] = {NAMEOID,OIDOID}; + int ret = 0; + Datum values[2]; + Datum local_oid; + char *typname; + bool isnull; /* * allocate a new tuple descriptor @@ -1824,6 +1838,28 @@ desc = CreateTemplateTupleDesc(natts, false); + if (use_oid_map) { + /* + * Connect to SPI manager + */ + if ((ret = SPI_connect()) < 0) + /* internal error */ + elog(ERROR, "SPI_connect failure - returned %d", ret); + + /* + * Create query plan for mapping remote oid to local oid + */ + plan = SPI_prepare("SELECT local_oid, typname FROM dblink_oid_map WHERE db_id = $1 AND remote_oid = $2", 2, arg); + if (plan == NULL) + /* internal error */ + elog(ERROR, "SPI_prepare returned %d", SPI_result); + + /* + * TODO: Determine unique id of remote database (instead of "test") + */ + values[0] = PointerGetDatum("test"); + } + attnum = 0; for (i = 0; i < natts; i++) @@ -1839,6 +1875,33 @@ atttypid = PQftype(res, i); atttypmod = PQfmod(res, i); + if (use_oid_map) { + values[1] = ObjectIdGetDatum(atttypid); + + if ((ret = SPI_execp(plan, values, NULL, 1)) < 0) + elog(ERROR, "SPI_execp returned %d", ret); + + if (SPI_processed < 1) + ereport(ERROR, + (errcode(ERRCODE_MOST_SPECIFIC_TYPE_MISMATCH), + errmsg("unknown remote datatype"), + errdetail("Remote datatype oid \"%d\" is not in oid map", + atttypid))); + + local_oid = SPI_getbinval(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1, &isnull); + + if (isnull) { + typname = SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 2); + ereport(ERROR, + (errcode(ERRCODE_MOST_SPECIFIC_TYPE_MISMATCH), + errmsg("unknown local datatype"), + errdetail("Remote datatype \"%s\" is not mapped " \ + "to a local datatype", (char *) DatumGetPointer(typname)))); + } + + atttypid = DatumGetObjectId(local_oid); + } + if (PQfsize(res, i) != get_typlen(atttypid)) ereport(ERROR, (errcode(ERRCODE_MOST_SPECIFIC_TYPE_MISMATCH), @@ -1855,6 +1918,10 @@ atttypmod, attdim, attisset); } + if (use_oid_map) { + SPI_finish(); + } + return desc; }
Mark Gibson <gibsonm@cromwell.co.uk> writes: > Unfortunately, it doesn't work using the oid the map, whether custom types > are involved or not. All I get is the following message: > ERROR: unsupported byval length: nnnn > SPI is very new to me (like 2 days old ;). > Any suggestions where I've gone wrong? Not offhand. Try setting a breakpoint at errfinish() so you can get a stack trace back from the point of the error; that should help. regards, tom lane
Tom Lane wrote: >Mark Gibson <gibsonm@cromwell.co.uk> writes: > > >>Unfortunately, it doesn't work using the oid the map, whether custom types >>are involved or not. All I get is the following message: >>ERROR: unsupported byval length: nnnn >>SPI is very new to me (like 2 days old ;). >>Any suggestions where I've gone wrong? >> >> >Not offhand. Try setting a breakpoint at errfinish() so you can get a >stack trace back from the point of the error; that should help. > > Well, I've re-compiled PostgreSQL 7.4.1 with --enable-debug (was previously working with a Gentoo ebuild of it). And strangely I get a different error now: ERROR: query-specified return row and actual function return row do not match After placing a few elog statements around pgresultGetTupleDesc, i've found the error happens after this function. I've been trying to use gdb to trace the problem, but I can't work out how to attach gdb to a backend process. (I've never tried to debug a multi process before) Are there any docs on debugging the backend processes? I've searched www.postgresql.org without luck :( -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.
Tom Lane wrote: > It's not hard. Start psql, then in another window use ps to determine > the PID of the connected backend. (The pg_stat_activity view might help > too.) Then > > gdb /path/to/postgres-executable PID > gdb> b whereever > gdb> cont > > and away you go. I'd add that when working with shared libraries, you need to make sure they are loaded before you attach to the process (I think there is a way to deal with it after the fact, but I'm not sure of the details). To do that, either execute a function in the library (e.g. select dblink_connect...) or use the LOAD command like this: regression=# load '$libdir/dblink'; LOAD HTH, Joe
Mark Gibson <gibsonm@cromwell.co.uk> writes: > I've been trying to use gdb to trace the problem, but I can't work out > how to attach gdb to a backend process. It's not hard. Start psql, then in another window use ps to determine the PID of the connected backend. (The pg_stat_activity view might help too.) Then gdb /path/to/postgres-executable PIDgdb> b whereevergdb> cont and away you go. regards, tom lane
Joe Conway <mail@joeconway.com> writes: > I'd add that when working with shared libraries, you need to make sure > they are loaded before you attach to the process (I think there is a way > to deal with it after the fact, but I'm not sure of the details). On HPUX, I find that "sharedlibrary /path/to/library" works to teach gdb about symbols in a freshly loaded library. (Occasionally it seems to be needed for libraries that were already present at attach time, too.) Not sure if this applies to other OSes, but give it a try. regards, tom lane
Mark Gibson wrote: > Well, I've re-compiled PostgreSQL 7.4.1 with --enable-debug > (was previously working with a Gentoo ebuild of it). > And strangely I get a different error now: > > ERROR: query-specified return row and actual function return row do > not match > > After placing a few elog statements around pgresultGetTupleDesc, i've > found the > error happens after this function. > Thanks for the tips of using gdb. I've found that dblink works without the oid map, ie: dblink(... , ... , false) but returns the error when enabled, ie: dblink(... , ... , true) The only part of dblink affected by this flag (use_oid_map) is within 'pgresultGetTupleDesc' and the only variable affected is 'atttypid' - which is left alone if use_oid_map is false. When use_oid_map is true, SPI is used to prepare a query, and execute it for each attribute - atttypid is then assigned the value retreived from the query. So when no custom types are used, the same values are passed to 'TupleDescInitEntry' whether use_oid_map is true or false. I just can't work it out! -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.
> Mark Gibson wrote: > > I've found that dblink works without the oid map, ie: dblink(... , ... > , false) > but returns the error when enabled, ie: dblink(... , ... , true) > Hello, I've found the problem, although I'm still a bit confused by it. When the call to SPI_finish() at the end of pgresultGetTupleDesc in my patch is removed, it works fine, custom datatypes and all :) But I still don't understand why this is. Is this safe? Also, I was thinking about how we could uniquely identify a database. Are there any variables we could retreive from the remote db after connecting to it that could form a consistent unique id for that db? Is there a way we can determine whether 'pg_type' in the remote database has been modified? We could automatically update the oid map in that case. -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.
Mark Gibson wrote: > I've found the problem, although I'm still a bit confused by it. I hate to do this to you now, but after some thought I think I have a better approach -- I'd be interested in opinions on that assessment. The attached eliminates pgresultGetTupleDesc() entirely and instead depends on the TupleDesc passed as rsinfo->expectedDesc from the executor. What this means is that the string representation of the remote value (from the "out" function on the remote side, as provided by libpq) will get fed into the "in" function corresponding to the local type you assign in your SQL statement. Assuming the types on the two sides are the same (or at least compatible), it should work well. Please give this a try and let me know what you think. Joe Index: contrib/dblink/dblink.c =================================================================== RCS file: /opt/src/cvs/pgsql-server/contrib/dblink/dblink.c,v retrieving revision 1.29 diff -c -r1.29 dblink.c *** contrib/dblink/dblink.c 28 Nov 2003 05:03:01 -0000 1.29 --- contrib/dblink/dblink.c 13 Feb 2004 18:23:49 -0000 *************** *** 82,88 **** static int16 get_attnum_pk_pos(int16 *pkattnums, int16 pknumatts, int16 key); static HeapTuple get_tuple_of_interest(Oid relid, int16 *pkattnums, int16 pknumatts, char **src_pkattvals); static Oid get_relid_from_relname(text *relname_text); - static TupleDesc pgresultGetTupleDesc(PGresult *res); static char *generate_relation_name(Oid relid); /* Global */ --- 82,87 ---- *************** *** 395,400 **** --- 394,400 ---- StringInfo str = makeStringInfo(); char *curname = NULL; int howmany = 0; + ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; if (PG_NARGS() == 3) { *************** *** 457,463 **** if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) ! tupdesc = pgresultGetTupleDesc(res); else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); --- 457,472 ---- if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) ! { ! if (!rsinfo) ! ereport(ERROR, ! (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("returning setof record is not " \ ! "allowed in this context"))); ! ! /* get the requested return tuple description */ ! tupdesc = CreateTupleDescCopy(rsinfo->expectedDesc); ! } else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); *************** *** 550,555 **** --- 559,565 ---- char *sql = NULL; char *conname = NULL; remoteConn *rcon = NULL; + ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); *************** *** 620,626 **** if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) ! tupdesc = pgresultGetTupleDesc(res); else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); --- 630,645 ---- if (functyptype == 'c') tupdesc = TypeGetTupleDesc(functypeid, NIL); else if (functyptype == 'p' && functypeid == RECORDOID) ! { ! if (!rsinfo) ! ereport(ERROR, ! (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("returning setof record is not " \ ! "allowed in this context"))); ! ! /* get the requested return tuple description */ ! tupdesc = CreateTupleDescCopy(rsinfo->expectedDesc); ! } else /* shouldn't happen */ elog(ERROR, "return type must be a row type"); *************** *** 1801,1863 **** relation_close(rel, AccessShareLock); return relid; - } - - static TupleDesc - pgresultGetTupleDesc(PGresult *res) - { - int natts; - AttrNumber attnum; - TupleDesc desc; - char *attname; - int32 atttypmod; - int attdim; - bool attisset; - Oid atttypid; - int i; - - /* - * allocate a new tuple descriptor - */ - natts = PQnfields(res); - if (natts < 1) - /* shouldn't happen */ - elog(ERROR, "cannot create a description for empty results"); - - desc = CreateTemplateTupleDesc(natts, false); - - attnum = 0; - - for (i = 0; i < natts; i++) - { - /* - * for each field, get the name and type information from the - * query result and have TupleDescInitEntry fill in the attribute - * information we need. - */ - attnum++; - - attname = PQfname(res, i); - atttypid = PQftype(res, i); - atttypmod = PQfmod(res, i); - - if (PQfsize(res, i) != get_typlen(atttypid)) - ereport(ERROR, - (errcode(ERRCODE_MOST_SPECIFIC_TYPE_MISMATCH), - errmsg("field size mismatch"), - errdetail("Size of remote field \"%s\" does not match " \ - "size of local type \"%s\".", attname, - format_type_with_typemod(atttypid, - atttypmod)))); - - attdim = 0; - attisset = false; - - TupleDescInitEntry(desc, attnum, attname, atttypid, - atttypmod, attdim, attisset); - } - - return desc; } /* --- 1820,1825 ----
Joe Conway wrote: > Mark Gibson wrote: > >> I've found the problem, although I'm still a bit confused by it. > > > I hate to do this to you now, but after some thought I think I have a > better approach -- I'd be interested in opinions on that assessment. > > The attached eliminates pgresultGetTupleDesc() entirely and instead > depends on the TupleDesc passed as rsinfo->expectedDesc from the > executor. What this means is that the string representation of the > remote value (from the "out" function on the remote side, as provided > by libpq) will get fed into the "in" function corresponding to the > local type you assign in your SQL statement. Assuming the types on the > two sides are the same (or at least compatible), it should work well. > > Please give this a try and let me know what you think. Fantastic, works perfectly (well I've only actually tested it with the 'txtidx' datatype). That's so much better than my idea, i didn't like having the oid map much anyway. Oh well, I least I've learnt I little about PostgreSQL internals in the process. I can get back to what I was supposed to be doing now ;) I'm going to give this a much through testing now. Cheers -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.