Proposal to adjust typmod argument on base UDT input functions - Mailing list pgsql-hackers

From Octavio Alvarez
Subject Proposal to adjust typmod argument on base UDT input functions
Date
Msg-id 50b505e0-009e-4b0e-a4ac-b60fb3786519@alvarezp.org
Whole thread Raw
Responses Re: Proposal to adjust typmod argument on base UDT input functions
List pgsql-hackers
Hi, everyone,

First time posting in pgsql-hackers.

I crafted the following rough patch which passes the target column
typmod on input functions instead of -1 for OIDs >= 16384. The intention
is to affect UDTs (user-defined types) only, not core data types. It
needs adjustments but the core idea is this:

diff --git a/src/backend/parser/parse_coerce.c 
b/src/backend/parser/parse_coerce.c
index 0b5b81c7f27..b884745f7f6 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -276,7 +276,7 @@ coerce_type(ParseState *pstate, Node *node,
                  * or it won't be able to obey the bizarre SQL-spec 
input rules. (Ugly
                  * as sin, but so is this part of the spec...)
                  */
-               if (baseTypeId == INTERVALOID)
+               if (baseTypeId == INTERVALOID || baseTypeId >= 16384)
                         inputTypeMod = baseTypeMod;
                 else
                         inputTypeMod = -1;


Rationale:

While developing a base UDT with a type modifier, we noticed we always
got -1 for the target typmod (input function's third argument). We
expected to get the target column typmod instead, as per the CREATE TYPE
documentation [1].

We later learned that input values go through a two-step process [2]. If
the target column is newtype(10), PostgreSQL will first pass the value
through the newtype_in with typmod = -1, and once it's newtype(-1), it
gets casted from newtype(-1) to newtype(10) through a cast function.
This is called a "sizing cast" and ties the typmod to size / length
semantics.

For UDTs the typmod might not have size or length semantics. For
example, if the typmod specifies how to encrypt a value, the two-step
process is inconvenient. The input function should make it go directly
from cstring to newtype(final_typmod).

Basic testing on our side works as expected so far:

- The input function is called only once and the sizing cast is not
   called anymore.

- If the UDT does not have a typmod, it still gets -1.

- Sized domains still seem to work for INSERTs, like in CREATE DOMAIN
   v5char AS varchar(5); and then trying to use it.

- Inserting into a newtype(a) column from a 'someting'::newtype(b) value
   works correctly: first, newtype_in to typmod = b, then, self-cast to
   typmod = a.

- The PostgreSQL test suite still passes.

That said, we don't want to break other's use cases, so I ask for your
comments, especially if you know of problematic scenarios or develop
data types or extensions, so I can work on them.


Notes:

Given how COPY works, I'd say that most or all UDTs should support
pg_dump's default settings (COPY-based). If this is the case, most or
all UDTs should be already compatible with this patch.

Setting the target typmod for all OIDs breaks the test suite.

Why 16384? It's the minimum OID assigned after a cluster is initialized.
We relied on the documentation at [3]. I considered checking the
typcategory but some core types, like bytea, have typcategory = 'U'.

Comments will be appreciated. If you deem this is a good way to go, I
will submit it as a proper patch for review.

Thanks,
Octavio.


[1] https://www.postgresql.org/docs/17/sql-createtype.html#id-1.9.3.94.5.8

[2] https://www.postgresql.org/docs/17/typeconv-query.html#TYPECONV-QUERY

[3] 
https://www.postgresql.org/docs/17/system-catalog-initial-data.html#SYSTEM-CATALOG-OID-ASSIGNMENT
Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Proposal: Conflict log history table for Logical Replication
Next
From: Tom Lane
Date:
Subject: Re: Proposal to adjust typmod argument on base UDT input functions