Thread: BUG #18970: Atempt to alter type of table column used in row type with check leads to assertion failure

The following bug has been logged on the website:

Bug reference:      18970
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 18beta1
Operating system:   Ubuntu 24.04
Description:

The following script:
CREATE TABLE t1(a int);
CREATE TABLE t2(b t1 CHECK ((b).a IS NOT NULL));
ALTER TABLE t1 ALTER COLUMN a TYPE numeric;
triggers
2025-06-28 06:52:21.201 UTC [2233016] LOG:  statement: ALTER TABLE t1 ALTER
COLUMN a TYPE numeric;
TRAP: failed Assert("lockmode != NoLock || IsBootstrapProcessingMode() ||
CheckRelationLockedByMe(r, AccessShareLock, true)"), File: "relation.c",
Line: 67, PID: 2233016
ExceptionalCondition at assert.c:52:13
relation_open at relation.c:72:6
transformAlterTableStmt at parse_utilcmd.c:3543:8
ATPostAlterTypeParse at tablecmds.c:15600:20
ATPostAlterTypeCleanup at tablecmds.c:15478:3
ATRewriteCatalogs at tablecmds.c:5336:11
ATController at tablecmds.c:4882:2
AlterTable at tablecmds.c:4535:1
...
with an assert-enabled build, and fails with just
ERROR:  cannot alter table "t1" because column "t2.b" uses its row type
with no asserts.
Reproduced starting from commit b04aeb0a0, which added the Assert.


On Sun, Jun 29, 2025 at 1:35 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      18970
> Logged by:          Alexander Lakhin
> Email address:      exclusion@gmail.com
> PostgreSQL version: 18beta1
> Operating system:   Ubuntu 24.04
> Description:
>
> The following script:
> CREATE TABLE t1(a int);
> CREATE TABLE t2(b t1 CHECK ((b).a IS NOT NULL));
> ALTER TABLE t1 ALTER COLUMN a TYPE numeric;
> triggers
> 2025-06-28 06:52:21.201 UTC [2233016] LOG:  statement: ALTER TABLE t1 ALTER
> COLUMN a TYPE numeric;
> TRAP: failed Assert("lockmode != NoLock || IsBootstrapProcessingMode() ||
> CheckRelationLockedByMe(r, AccessShareLock, true)"), File: "relation.c",
> Line: 67, PID: 2233016
> ExceptionalCondition at assert.c:52:13
> relation_open at relation.c:72:6
> transformAlterTableStmt at parse_utilcmd.c:3543:8
> ATPostAlterTypeParse at tablecmds.c:15600:20
> ATPostAlterTypeCleanup at tablecmds.c:15478:3
> ATRewriteCatalogs at tablecmds.c:5336:11
> ATController at tablecmds.c:4882:2
> AlterTable at tablecmds.c:4535:1
> ...
> with an assert-enabled build, and fails with just
> ERROR:  cannot alter table "t1" because column "t2.b" uses its row type
> with no asserts.
> Reproduced starting from commit b04aeb0a0, which added the Assert.
>

hi.
this bug can be triggered by exclusion constraints too.

drop table if exists t1,t2;
CREATE TABLE t1(a int);
CREATE TABLE t2(b t1);
ALTER TABLE t2 ADD CONSTRAINT xxn EXCLUDE USING btree (((b).a) WITH =);
ALTER TABLE t1 ALTER COLUMN a TYPE numeric;

in ATPostAlterTypeCleanup
        /*
         * When rebuilding an FK constraint that references the table we're
         * modifying, we might not yet have any lock on the FK's table, so get
         * one now.  We'll need AccessExclusiveLock for the DROP CONSTRAINT
         * step, so there's no value in asking for anything weaker.
         */
        if (relid != tab->relid && contype == CONSTRAINT_FOREIGN)
            LockRelationOid(relid, AccessExclusiveLock);

we can change to
        if (relid != tab->relid)
            LockRelationOid(relid, AccessExclusiveLock);
obviously, the comments need to be updated.

When altering the data type of a column in one relation causes a constraint of
another table rebuild, the other table should be locked with
AccessExclusiveLock.  but in ATPostAlterTypeCleanup, we only know that
tab->relid is locked with the specified lock mode.
RememberConstraintForRebuilding only records the constraint information—it
doesn't acquire a lock on pg_constraint.conrelid.

so in ATPostAlterTypeCleanup, we should lock the tab->changedConstraintOids
associated pg_constraint.conrelid.



On Mon, Jun 30, 2025 at 1:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > jian he <jian.universality@gmail.com> writes:
> >> we can change to
> >>      if (relid != tab->relid)
> >>              LockRelationOid(relid, AccessExclusiveLock);
> >> obviously, the comments need to be updated.
>
> > Yeah, I came to the same conclusion after studying it for awhile.
>

This issue also applies to tab->changedStatisticsOids
and tab->changedIndexOids in ATPostAlterTypeCleanup.

drop table if exists t1,t2;
CREATE TABLE t1(a int);
CREATE TABLE t2(b t1);
CREATE STATISTICS XXX ON ((b).a is not null) FROM t2;
ALTER TABLE t1 ALTER COLUMN a TYPE numeric;


drop table if exists t1,t2;
CREATE TABLE t1(a int);
CREATE TABLE t2(b t1);
CREATE INDEX XXX ON t2(((b).a));
ALTER TABLE t1 ALTER COLUMN a TYPE numeric;

We likely need to do the same as well.



jian he <jian.universality@gmail.com> writes:
> SET EXPRESSION both can have this issue.
> so i also add more test cases for stored/virtual generated columns.

This patch is a lot larger than I was expecting, and I think it's
misguided.  You argue that

+     * Changing a virtual generated column's expression is akin to altering its
+     * type, requiring a call to find_composite_type_dependencies to check if
+     * the virtual generated column is used in any table.
+     * Therefore we need add this defval to tab->newvals for virtual generated
+     * column too, so Phase3 will call find_composite_type_dependencies.

but I think that is in fact wrong.  The implementation restriction
we have is that we lack code to run around and physically change
the stored values of columns that are not top-level table columns.
However, in the case of a virtual column we don't need to change
anything about the storage: once we've fixed the catalog metadata,
we're done.  So I'm not seeing the need to add all this stuff
other than the additional locking calls.

            regards, tom lane



On Mon, Jun 30, 2025 at 11:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> jian he <jian.universality@gmail.com> writes:
> > SET EXPRESSION both can have this issue.
> > so i also add more test cases for stored/virtual generated columns.
>
> This patch is a lot larger than I was expecting, and I think it's
> misguided.  You argue that
>
> +     * Changing a virtual generated column's expression is akin to altering its
> +     * type, requiring a call to find_composite_type_dependencies to check if
> +     * the virtual generated column is used in any table.
> +     * Therefore we need add this defval to tab->newvals for virtual generated
> +     * column too, so Phase3 will call find_composite_type_dependencies.
>
> but I think that is in fact wrong.  The implementation restriction
> we have is that we lack code to run around and physically change
> the stored values of columns that are not top-level table columns.
> However, in the case of a virtual column we don't need to change
> anything about the storage: once we've fixed the catalog metadata,
> we're done.  So I'm not seeing the need to add all this stuff
> other than the additional locking calls.
>

CREATE TABLE t1(a int generated always as ('1') stored);
CREATE TABLE t2(b t1 CHECK ((b).a IS NOT NULL));
INSERT INTO t2 default values returning *;
ERROR:  new row for relation "t2" violates check constraint "t2_b_check"
DETAIL:  Failing row contains (null).
INSERT INTO t2 values '(1)' returning *;

ALTER TABLE t1 ALTER COLUMN a SET EXPRESSION AS (NULL);
Currently we disallow this "SET EXPRESSION AS".
I am wondering why we disallow it?

at that time, I was mainly initrucugged by this comment about
composite type default
in ATRewriteTables:
         * If we change column data types, the operation has to be propagated
         * to tables that use this table's rowtype as a column type.
         * tab->newvals will also be non-NULL in the case where we're adding a
         * column with a default.  We choose to forbid that case as well,
         * since composite types might eventually support defaults.


I’ve added some tests to generated_stored.sql, but not to generated_virtual.sql.

Attachment
jian he <jian.universality@gmail.com> writes:
> On Mon, Jun 30, 2025 at 11:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... I think that is in fact wrong.  The implementation restriction
>> we have is that we lack code to run around and physically change
>> the stored values of columns that are not top-level table columns.
>> However, in the case of a virtual column we don't need to change
>> anything about the storage: once we've fixed the catalog metadata,
>> we're done.  So I'm not seeing the need to add all this stuff
>> other than the additional locking calls.

> CREATE TABLE t1(a int generated always as ('1') stored);
> CREATE TABLE t2(b t1 CHECK ((b).a IS NOT NULL));
> INSERT INTO t2 default values returning *;
> ERROR:  new row for relation "t2" violates check constraint "t2_b_check"
> DETAIL:  Failing row contains (null).
> INSERT INTO t2 values '(1)' returning *;

> ALTER TABLE t1 ALTER COLUMN a SET EXPRESSION AS (NULL);
> Currently we disallow this "SET EXPRESSION AS".
> I am wondering why we disallow it?

That is an interesting question.  AFAICT we treat the GENERATED
clause as being like a constraint.  We don't apply constraints
of a table to uses of its composite type, so in the above example
t2's b.a does not pay any attention to the GENERATED clause; it
just acts as an ordinary "int" column.  So it's not very clear
why we should forbid ALTER COLUMN a SET EXPRESSION because of t2.

If the idea was to leave the door open for a future change of
that behavior (that is, to start propagating the GENERATED clause to
uses of the composite type), I think that ship has already sailed.

So, rather than add a new restriction for VIRTUAL generated
columns, I'd be inclined to remove the one for STORED columns.
At least if the argument is to make them act the same.  If we're
okay with them working differently then we don't have to do
anything.

> at that time, I was mainly initrucugged by this comment about
> composite type default
> in ATRewriteTables:
>          * If we change column data types, the operation has to be propagated
>          * to tables that use this table's rowtype as a column type.
>          * tab->newvals will also be non-NULL in the case where we're adding a
>          * column with a default.  We choose to forbid that case as well,
>          * since composite types might eventually support defaults.

That (propagating defaults to composite types) again strikes me as a
ship that sailed long ago.  Quite aside from the implementation issues
that would arise, we'd be changing longstanding behavior.

Peter, what's your opinion?

            regards, tom lane