Hi, everyone,
First time posting in pgsql-hackers.
I crafted the following rough patch which passes the target column
typmod on input functions instead of -1 for OIDs >= 16384. The intention
is to affect UDTs (user-defined types) only, not core data types. It
needs adjustments but the core idea is this:
diff --git a/src/backend/parser/parse_coerce.c
b/src/backend/parser/parse_coerce.c
index 0b5b81c7f27..b884745f7f6 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -276,7 +276,7 @@ coerce_type(ParseState *pstate, Node *node,
* or it won't be able to obey the bizarre SQL-spec
input rules. (Ugly
* as sin, but so is this part of the spec...)
*/
- if (baseTypeId == INTERVALOID)
+ if (baseTypeId == INTERVALOID || baseTypeId >= 16384)
inputTypeMod = baseTypeMod;
else
inputTypeMod = -1;
Rationale:
While developing a base UDT with a type modifier, we noticed we always
got -1 for the target typmod (input function's third argument). We
expected to get the target column typmod instead, as per the CREATE TYPE
documentation [1].
We later learned that input values go through a two-step process [2]. If
the target column is newtype(10), PostgreSQL will first pass the value
through the newtype_in with typmod = -1, and once it's newtype(-1), it
gets casted from newtype(-1) to newtype(10) through a cast function.
This is called a "sizing cast" and ties the typmod to size / length
semantics.
For UDTs the typmod might not have size or length semantics. For
example, if the typmod specifies how to encrypt a value, the two-step
process is inconvenient. The input function should make it go directly
from cstring to newtype(final_typmod).
Basic testing on our side works as expected so far:
- The input function is called only once and the sizing cast is not
called anymore.
- If the UDT does not have a typmod, it still gets -1.
- Sized domains still seem to work for INSERTs, like in CREATE DOMAIN
v5char AS varchar(5); and then trying to use it.
- Inserting into a newtype(a) column from a 'someting'::newtype(b) value
works correctly: first, newtype_in to typmod = b, then, self-cast to
typmod = a.
- The PostgreSQL test suite still passes.
That said, we don't want to break other's use cases, so I ask for your
comments, especially if you know of problematic scenarios or develop
data types or extensions, so I can work on them.
Notes:
Given how COPY works, I'd say that most or all UDTs should support
pg_dump's default settings (COPY-based). If this is the case, most or
all UDTs should be already compatible with this patch.
Setting the target typmod for all OIDs breaks the test suite.
Why 16384? It's the minimum OID assigned after a cluster is initialized.
We relied on the documentation at [3]. I considered checking the
typcategory but some core types, like bytea, have typcategory = 'U'.
Comments will be appreciated. If you deem this is a good way to go, I
will submit it as a proper patch for review.
Thanks,
Octavio.
[1] https://www.postgresql.org/docs/17/sql-createtype.html#id-1.9.3.94.5.8
[2] https://www.postgresql.org/docs/17/typeconv-query.html#TYPECONV-QUERY
[3]
https://www.postgresql.org/docs/17/system-catalog-initial-data.html#SYSTEM-CATALOG-OID-ASSIGNMENT