Re: VARIANT / ANYTYPE datatype - Mailing list pgsql-hackers
From | Darren Duncan |
---|---|
Subject | Re: VARIANT / ANYTYPE datatype |
Date | |
Msg-id | 4DC46E1C.3070405@darrenduncan.net Whole thread Raw |
In response to | Re: VARIANT / ANYTYPE datatype (Merlin Moncure <mmoncure@gmail.com>) |
Responses |
Re: VARIANT / ANYTYPE datatype
|
List | pgsql-hackers |
Thanks for picking up on my mentioning union types; I wasn't sure if anyone did. Merlin Moncure wrote: > On Fri, May 6, 2011 at 3:18 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> >> On 05/06/2011 04:08 PM, Alvaro Herrera wrote: >>> Excerpts from Darren Duncan's message of mié may 04 15:33:33 -0300 2011: >>> >>>> I see VARIANT/ANYTYPE as the most general case of supporting union types, >>>> which, >>>> say, could have more specific examples of "allow any number or date here >>>> but >>>> nothing else". If VARIANT is supported, unions in general ought to be >>>> also. >>> Okay, so aside from the performance (storage reduction) gained, there's >>> this argument for having variant/union types. It seems to me that this >>> is indeed possible to build. Completely general VARIANT, though, is >>> rather complex. A declared union, where you specify exactly which types >>> can be part of the union, can be catalogued, so that the system knows >>> exactly where to look when a type needs to be modified. A general >>> VARIANT however looks complex to me to solve. >>> >>> The problem is this: if an user attempts to drop a type, and this type >>> is used in a variant somewhere, we would lose the stored data. So the >>> drop needs to be aborted. Similarly, if we alter a type (easy example: >>> a composite type) used in a variant, we need to cascade to modify all >>> rows using that composite. >>> >>> If the unions that use a certain type are catalogued, we at least know >>> what tables to scan to cascade. >>> >>> In a general variant, the system catalogs do not have the information of >>> what type each variant masquerades as. We would need to examine the >>> variant's masqueraded types on each insert; if the current type is not >>> found, add it. This seems a bit expensive. >> So how is a declared union going to look and operate? Something like this? >> >> CREATE TYPE foo AS UNION (ival int, tval text, tsval timestamptz): >> CREATE TABLE bar (myunion foo); >> INSERT INTO bar (myunion) VALUES (ival=>1), (tval=>'some text'); >> UPDATE bar SET myunion.tsval = CURRENT_TIMESTAMP; >> >> >> Something like that could actually be quite nice for a number of purposes. > > using your hypothetical example, could you cast types to the union? > > select 1::int::foo; > > record types would presumably work as well? you could do a lot of > *really* neat stuff that way... Like with other respondents to this topic, I consider it much more useful and important, as well as practical, to support explicitly defined type unions than a wide-open ANYTYPE. As for how it would operate: The example above implies a union type implementation that is like C's concept of such, where you have to explicitly state how you want the value bit pattern to be interpreted, by naming ival/etc, rather than the DBMS just knowing that a particular value is of a specific type, because per-value stored metadata says so (like with SQLite). While that might be best in practice for implementation, I had envisioned something more like set unions, so you could instead do it like this: CREATE TYPE foo AS UNION (int, text, timestamptz): CREATE TABLE bar (myunion foo); INSERT INTO bar (myunion) VALUES(1), ('some text'); UPDATE bar SET myunion = CURRENT_TIMESTAMP; Unlike a record type, where multiple attributes may have the same time, presumably with a union, they would all be distinct, and so you could use the type name itself to refer to each option; you don't have to make up "ival" or whatever ... unless there are situations where types don't have names. When doing operations that are type-generic, such as equality test or assignment, especially assignment, between 2 things that are both declared to be of type foo, you could just do it with no verbosity same as if you were doing 2 int or text or whatever. When doing operations specific to int or text or whatever, or if you are assigning a foo-declared thing to an int/text-declared thing, then you'd want an explicit cast or assertion, such as "select myunion::int + 3 as answer from bar where is_a(myunion,'int')". Whether you want an explicit cast to go the other way, I would assume you don't need to, like when you have DOMAINs; eg, I would expect the 4th line above to just work, because the system knows the type of CURRENT_TIMESTAMP and it knows that this is a member of the union type of myunion. I see a UNION type as being like a DOMAIN type in reverse. -- Darren Duncan
pgsql-hackers by date: