Thread: Re: [HACKERS] getting composite types info from libpq
On Wed, Dec 15, 2010 at 6:56 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Wed, Dec 15, 2010 at 1:25 PM, Daniele Varrazzo > <daniele.varrazzo@gmail.com> wrote: >> Hello, >> >> when a query returns a composite type, the libpq PQftype() function >> reports the oid of the "record" type. In psycopg: >> >> >>> cur.execute("select (1,2)") >> >>> cur.description >> (('row', 2249, None, -1, None, None, None),) >> >> test=# select typname from pg_type where oid = 2249; >> typname >> --------- >> record >> >> Is there a way to recursively retrieve the types for the record components? > > not without talking to the server, unless you had previously pulled > pg_attribute data. > > select * from pg_attribute where attrelid = 2249; No, there is no such info in pg_attribute: 2249 is the oid for the type of a "generic record", not for a specific type. > This question is more appropriate for -general, but what are you trying to do? Added -general in copy: please remove -hackers in your reply if you think this thread is out of place. I'm hacking on psycopg. Currently it uses PQftype, PQfname and related functions to inspect the PQresult received after a query in order to build the python representation of the record. But the inspection is "flat": if the record contains a composite structure it is currently returned as an unparsed string: >>> cur.execute("select ('1'::int, current_date), current_date") # the date outside the record is easily parsed, for the one inside the record >>> cur.fetchone() ('(1,2010-12-16)', datetime.date(2010, 12, 16)) >>> cur.description # name and oid are the first two fields (('row', 2249, None, -1, None, None, None), ('date', 1082, None, 4, None, None, None)) As the record is created on the fly, I assume there is no structure left in the catalog for it. If I instead explicitly create the type I see how to inspect it: test=> create type intdate as (an_int integer, a_date date); CREATE TYPE >>> cur.execute("select (1, current_date)::intdate, current_date") >>> cur.fetchone() ('(1,2010-12-16)', datetime.date(2010, 12, 16)) >>> cur.description (('row', 650308, None, -1, None, None, None), ('date', 1082, None, 4, None, None, None)) test=> select attname, atttypid from pg_attribute where attrelid = 650306; attname | atttypid ---------+---------- an_int | 23 a_date | 1082 but even in this case it seems it would take a second query to inspect the type and even here It doesn't seem I could use PQgetvalue/PQgetlength to read the internal components of the composite values. The goal would be to have the query above translated into e.g. a nested tuple in python: ((1, datetime.date(2010, 12, 16), datetime.date(2010, 12, 16)) and I'd like to know: 1. do I get enough info in the PGresult to inspect anonymous composite types? 2. do I get such info for composite types for which I have schema info in the catalog, without issuing a second query? (which I don't feel it is a driver's job) 3. is there any libpq facility to split the string returned after a composite types into its single components, without having to write a parser to deal with commas and quotes? >>> cur.execute("select ('a'::text, 'b,c'::text, 'd''e'::text, 'f\"g'::text)") >>> print cur.fetchone()[0] (a,"b,c",d'e,"f""g") 4. are by any chance those info passed on the network, maybe available in an internal libpq structure, but then not accessible from the libpq interface? Thank you very much. -- Daniele
On Dec16, 2010, at 02:51 , Daniele Varrazzo wrote: > 1. do I get enough info in the PGresult to inspect anonymous composite types? You just get the composite value, as you discovered. In text mode, that means only the composite string value, which contains no information about the individual field's types. In binary mode, however, the structure of such a composite value appears to be <number of fields: 4 bytes> [for each field] <OID of field's type: sizeof(Oid) bytes> [if value is NULL] <-1: 4 bytes> [else] <length of value: 4 bytes> <value: <length> bytes> [end if] [end for] according to a quick glance over record_send() in src/backend/utils/rowtypes.c. You'll want to double-check this, it really was a *very* quick glance ;-) The field's values are, again, in binary format, not text! AFAIK you *can* decide whether to use text for binary mode on a per-field basis when you execute a query, but once you request a field of type "record" to be transferred as binary, you'll have to be able to deal with arbitrary types sent as binary since you won't know which types the record might contain. Which isn't easy, because the binary representation of some types (like float I think) is machine-dependent :-( > 2. do I get such info for composite types for which I have schema info > in the catalog, without issuing a second query? (which I don't feel it > is a driver's job) No. Your only option is probably to query this information once and cache it. Knowing when to invalidate that cache isn't easy, though - but since type's probably don't change too often, some compromise will hopefully do. > 3. is there any libpq facility to split the string returned after a > composite types into its single components, without having to write a > parser to deal with commas and quotes? Not that I'd know of. There is, however, a project called libpqtypes which I think deal with things like that. I've never used it, though, so I can't say whether it fits your needs or not. > 4. are by any chance those info passed on the network, maybe available > in an internal libpq structure, but then not accessible from the libpq > interface? Don't think so. FYI, There has been some discussion about providing SQL-accessible functions to inspect and modify field of arbitrary records. There are two implementations of such a thing that I know of One was written by me, and is available at https://github.com/fgp/pg_record_inspect The other was written by Pavel Stehule and is described in his blob here http://okbob.blogspot.com/2010/12/iteration-over-record-in-plpgsql.html Neither of these helps much with doing things on the driver level, though, unless you're willing to tell your users to select record_inspect.fieldinfo(some_record) as "some_record.fieldinfo" alongside "some_record" if they want the record to be represented as other than one large string. best regards, Florian Pflug
On Thu, Dec 16, 2010 at 5:03 AM, Florian Pflug <fgp@phlo.org> wrote: > On Dec16, 2010, at 02:51 , Daniele Varrazzo wrote: >> 1. do I get enough info in the PGresult to inspect anonymous composite types? > You just get the composite value, as you discovered. In text mode, that means > only the composite string value, which contains no information about the > individual field's types. In binary mode, however, the structure of such a > composite value appears to be > > <number of fields: 4 bytes> > [for each field] > <OID of field's type: sizeof(Oid) bytes> > [if value is NULL] > <-1: 4 bytes> > [else] > <length of value: 4 bytes> > <value: <length> bytes> > [end if] > [end for] > > according to a quick glance over record_send() in > src/backend/utils/rowtypes.c. You'll want to double-check this, it really > was a *very* quick glance ;-) > > The field's values are, again, in binary format, not text! AFAIK you *can* > decide whether to use text for binary mode on a per-field basis when you > execute a query, but once you request a field of type "record" to be > transferred as binary, you'll have to be able to deal with arbitrary types > sent as binary since you won't know which types the record might contain. > Which isn't easy, because the binary representation of some types > (like float I think) is machine-dependent :-( > >> 2. do I get such info for composite types for which I have schema info >> in the catalog, without issuing a second query? (which I don't feel it >> is a driver's job) > No. Your only option is probably to query this information once and cache > it. Knowing when to invalidate that cache isn't easy, though - but since > type's probably don't change too often, some compromise will hopefully do. > >> 3. is there any libpq facility to split the string returned after a >> composite types into its single components, without having to write a >> parser to deal with commas and quotes? > Not that I'd know of. There is, however, a project called libpqtypes > which I think deal with things like that. I've never used it, though, > so I can't say whether it fits your needs or not. yeah -- what libpqtypes does is expose composites and arrays (and composites of arrays) as a 'result within a result'. You register the composite type by name, then you can create a PGresult that exposes the composite as if itself were a returned set -- then you get to use the regular libpq access functions to get the oid is null, etc. This process can nest of course. You might want to check it out. libpqtypes also always requests data in binary. this would actually be counter productive if you were to immediately convert it to a string. However, if you are moving data to some other binary structure, it's a lot faster and less work for the server. merlin