Thread: bug: virtual generated column can be partition key

bug: virtual generated column can be partition key

From
jian he
Date:
hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE (f3);

ERROR:  cannot use generated column in partition key
LINE 4:     PARTITION BY RANGE (f3);
                                ^
DETAIL:  Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE ((f3));

Attachment

Re: bug: virtual generated column can be partition key

From
Fujii Masao
Date:

On 2025/04/21 11:30, jian he wrote:
> hi.
> While trying to make the virtual generated column be part of the partition key,
> I found this bug.

I haven't looked at the patch in detail yet, but when I applied it
and ran the regression tests with RELCACHE_FORCE_RELEASE and
CATCACHE_FORCE_RELEASE enabled, the tests failed with the following diff:

----------------------------
========= Contents of ./src/test/regress/regression.diffs
diff -U3 /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out
/home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out
--- /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out    2025-04-21 07:32:03.731119788
+0000
+++ /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out    2025-04-21 07:38:31.358134750
+0000
@@ -810,8 +810,13 @@
  LINE 1: ...TITION OF parted FOR VALUES IN ('c') PARTITION BY RANGE (c);
                                                                      ^
  CREATE TABLE part_c PARTITION OF parted (b WITH OPTIONS NOT NULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE
((b));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
+                                                                  ^
+DETAIL:  Column "b" is a generated column.
  -- create a level-2 partition
  CREATE TABLE part_c_1_10 PARTITION OF part_c FOR VALUES FROM (1) TO (10);
+ERROR:  relation "part_c" does not exist
  -- check that NOT NULL and default value are inherited correctly
  create table parted_notnull_inh_test (a int default 1, b int not null default 0) partition by list (a);
  create table parted_notnull_inh_test1 partition of parted_notnull_inh_test (a not null, b default 1) for values in
(1);
@@ -871,30 +876,8 @@
  
  -- Both partition bound and partition key in describe output
  \d+ part_c
-                             Partitioned table "public.part_c"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: parted FOR VALUES IN ('c')
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text))
-Partition key: RANGE (b)
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (local, inherited)
-Partitions: part_c_1_10 FOR VALUES FROM (1) TO (10)
-
  -- a level-2 partition's constraint will include the parent's expressions
  \d+ part_c_1_10
-                                Table "public.part_c_1_10"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: part_c FOR VALUES FROM (1) TO (10)
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text) AND (b IS NOT NULL) AND (b >= 1) AND (b < 10))
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (inherited)
-
  -- Show partition count in the parent's describe output
  -- Tempted to include \d+ output listing partitions with bound info but
  -- output could vary depending on the order in which partition oids are
@@ -906,7 +889,7 @@
   a      | text    |           |          |
   b      | integer |           | not null | 0
  Partition key: LIST (a)
-Number of partitions: 3 (Use \d+ to list them.)
+Number of partitions: 2 (Use \d+ to list them.)
  
  \d hash_parted
        Partitioned table "public.hash_parted"
----------------------------

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Mon, Apr 21, 2025 at 4:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2025/04/21 11:30, jian he wrote:
> > hi.
> > While trying to make the virtual generated column be part of the partition key,
> > I found this bug.
>
> I haven't looked at the patch in detail yet, but when I applied it
> and ran the regression tests with RELCACHE_FORCE_RELEASE and
> CATCACHE_FORCE_RELEASE enabled, the tests failed with the following diff:
>
> ----------------------------
> ========= Contents of ./src/test/regress/regression.diffs
> diff -U3 /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out
/home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out
> --- /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out  2025-04-21 07:32:03.731119788
+0000
> +++ /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out   2025-04-21 07:38:31.358134750
+0000
> @@ -810,8 +810,13 @@
>   LINE 1: ...TITION OF parted FOR VALUES IN ('c') PARTITION BY RANGE (c);
>                                                                       ^
>   CREATE TABLE part_c PARTITION OF parted (b WITH OPTIONS NOT NULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE
((b));
> +ERROR:  cannot use generated column in partition key
> +LINE 1: ...ULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
> +                                                                  ^
> +DETAIL:  Column "b" is a generated column.
>   -- create a level-2 partition
>   CREATE TABLE part_c_1_10 PARTITION OF part_c FOR VALUES FROM (1) TO (10);
> +ERROR:  relation "part_c" does not exist
>   -- check that NOT NULL and default value are inherited correctly
>   create table parted_notnull_inh_test (a int default 1, b int not null default 0) partition by list (a);
>   create table parted_notnull_inh_test1 partition of parted_notnull_inh_test (a not null, b default 1) for values in
(1);
> @@ -871,30 +876,8 @@
>
>   -- Both partition bound and partition key in describe output
>   \d+ part_c
> -                             Partitioned table "public.part_c"
> - Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
> ---------+---------+-----------+----------+---------+----------+--------------+-------------
> - a      | text    |           |          |         | extended |              |
> - b      | integer |           | not null | 0       | plain    |              |
> -Partition of: parted FOR VALUES IN ('c')
> -Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text))
> -Partition key: RANGE (b)
> -Not-null constraints:
> -    "part_c_b_not_null" NOT NULL "b" (local, inherited)
> -Partitions: part_c_1_10 FOR VALUES FROM (1) TO (10)
> -
>   -- a level-2 partition's constraint will include the parent's expressions
>   \d+ part_c_1_10
> -                                Table "public.part_c_1_10"
> - Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
> ---------+---------+-----------+----------+---------+----------+--------------+-------------
> - a      | text    |           |          |         | extended |              |
> - b      | integer |           | not null | 0       | plain    |              |
> -Partition of: part_c FOR VALUES FROM (1) TO (10)
> -Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text) AND (b IS NOT NULL) AND (b >= 1) AND (b < 10))
> -Not-null constraints:
> -    "part_c_b_not_null" NOT NULL "b" (inherited)
> -
>   -- Show partition count in the parent's describe output
>   -- Tempted to include \d+ output listing partitions with bound info but
>   -- output could vary depending on the order in which partition oids are
> @@ -906,7 +889,7 @@
>    a      | text    |           |          |
>    b      | integer |           | not null | 0
>   Partition key: LIST (a)
> -Number of partitions: 3 (Use \d+ to list them.)
> +Number of partitions: 2 (Use \d+ to list them.)
>
>   \d hash_parted
>         Partitioned table "public.hash_parted"

Thanks for pointing it out.

i think it's related to my silly mistake:
                if (TupleDescAttr(RelationGetDescr(rel),
var->varattno)->attgenerated)
should be
                if (TupleDescAttr(RelationGetDescr(rel), var->varattno
- 1)->attgenerated)

Feel free to test it again.

Attachment

Re: bug: virtual generated column can be partition key

From
Ashutosh Bapat
Date:


On Mon, Apr 21, 2025 at 2:42 PM jian he <jian.universality@gmail.com> wrote:
On Mon, Apr 21, 2025 at 4:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2025/04/21 11:30, jian he wrote:
> > hi.
> > While trying to make the virtual generated column be part of the partition key,
> > I found this bug.


I noticed that the check for a generated column in the partition key expression already exists a few lines below. Had that check placed before the if .. else ... block, we wouldn't have had this problem.
if (IsA(expr, Var) && ((Var *) expr)->varattno > 0)
{
 ... 
}
else
{

I admit that pull_varattnos() + loop through bms_next_member() is a bit wasteful when the expression is just a Var. But probably it's worth taking that hit for the sake of avoiding code duplication. Further, I believe that the two loops: one for system attribute check and the other for generated attribute check can be combined, something like attached. Then it's not as wasteful as it looks and we probably save a loop.

While looking at this I realised that a generated column may end up being part of the partition key if the partition key expression contains a whole row reference. Attached patch also has a fix and a testcase for the same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the test is kinda silly, but it tests the whole-row reference as part of an expression. I haven't looked for more sensible expressions.

I have included your original tests, but ended up rewriting code. Please let me know what do you think.

--
Best Wishes,
Ashutosh Bapat
Attachment

Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
>
> While looking at this I realised that a generated column may end up being part of the partition key if the partition
keyexpression contains a whole row reference. Attached patch also has a fix and a testcase for the same. PARTITION BY
RANGE((gtest_part_key is not null)) expression in the test is kinda silly, but it tests the whole-row reference as part
ofan expression. I haven't looked for more sensible expressions. 
>

I begin to wonder if wholerow reference should be allowed.
then error occurred:

drop table if exists t4;
CREATE TABLE t4(f1 int, f2 bigint) PARTITION BY list ((t4));
create table t4_1 partition of t4 for values in ((1,2));
alter table t4 alter column f2 set data type text using f2;

insert into t4 select 1, '2';
ERROR:  invalid memory alloc request size 18446744073709551615



Re: bug: virtual generated column can be partition key

From
Ashutosh Bapat
Date:


On Tue, Apr 22, 2025 at 11:19 AM jian he <jian.universality@gmail.com> wrote:
On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
>
> While looking at this I realised that a generated column may end up being part of the partition key if the partition key expression contains a whole row reference. Attached patch also has a fix and a testcase for the same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the test is kinda silly, but it tests the whole-row reference as part of an expression. I haven't looked for more sensible expressions.
>

I begin to wonder if wholerow reference should be allowed.
then error occurred:

drop table if exists t4;
CREATE TABLE t4(f1 int, f2 bigint) PARTITION BY list ((t4));
create table t4_1 partition of t4 for values in ((1,2));
alter table t4 alter column f2 set data type text using f2;

insert into t4 select 1, '2';
ERROR:  invalid memory alloc request size 18446744073709551615

I can reproduce this error without my patc. It seems to be a different existing bug.

--
Best Wishes,
Ashutosh Bapat

Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> I have included your original tests, but ended up rewriting code. Please let me know what do you think.
>

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.


+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
?


Other than that, it looks good to me for fixing this bug.



Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com> wrote:
> Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR:  cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
                                                             ^
DETAIL:  Generated column "f3" is part of the partition key of
relation "gtest_part_key"


what do you think?

Attachment

Re: bug: virtual generated column can be partition key

From
Ashutosh Bapat
Date:
Thanks Jian for your review.

On Tue, Apr 22, 2025 at 12:32 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> I have included your original tests, but ended up rewriting code. Please let me know what do you think.
>

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.

You are right! Fixed.
 


+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION


That's an existing comment which appears to be moved in diff but it's actually untouched by the patch. Maybe you are right, IDK, but since it's not related to the fix, let's leave it untouched. I did wonder whether that comment has any relation to the pull_varattnos call which has been moved earlier since pull_varattnos doesn't expect any Query node in the tree. But pondering more, I think the comment is related to the number of rows participating in the partition key computation - there should be exactly one key for one tuple. Having SRFs, subqueries in part expression means a possibility of producing more than one set of partition keys, aggregates and window functions OTOH will produce one partition key for more than one row - all of them breaking 1:1 mapping between a tuple and a partition key. Hence I think the comment is at the right place.

PFA revised patch.

--
Best Wishes,
Ashutosh Bapat
Attachment

Re: bug: virtual generated column can be partition key

From
Ashutosh Bapat
Date:
Sorry I missed this email while sending the patches - our emails crossed in the air.

On Tue, Apr 22, 2025 at 2:15 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com> wrote:
> Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.

Yes. It's not apparent where does f3 appear in the partition key, to a lay users. Users who explicitly use whole row expression in a partition key, would know that a whole row expression contains all columns. And the error location points to the whole-row reference. So the current state isn't that bad.
 

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR:  cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
                                                             ^
DETAIL:  Generated column "f3" is part of the partition key of
relation "gtest_part_key"

 
The DETAIL here is just mentioning what's already known from the command, so the change in the DETAIL may not be useful.

If I understand your intention correctly, we have to mention something to the effect "partition key expression contains a whole-row reference which in turn contains a generated column." But that might be too verbose.

--
Best Wishes,
Ashutosh Bapat

Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Tue, Apr 22, 2025 at 4:55 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Sorry I missed this email while sending the patches - our emails crossed in the air.
>
> On Tue, Apr 22, 2025 at 2:15 PM jian he <jian.universality@gmail.com> wrote:
>>
>> On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com> wrote:
>> > Other than that, it looks good to me for fixing this bug.
>>
>> The error message seems not that intuitive.
>>
>> +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
>> GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
>> ((gtest_part_key));
>> +ERROR:  cannot use generated column in partition key
>> +LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
>> +                                                             ^
>> +DETAIL:  Column "f3" is a generated column.
>
>
> Yes. It's not apparent where does f3 appear in the partition key, to a lay users. Users who explicitly use whole row
expressionin a partition key, would know that a whole row expression contains all columns. And the error location
pointsto the whole-row reference. So the current state isn't that bad. 
>
>>
>>
>> with the attached patch. now,
>> +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
>> GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
>> ((gtest_part_key));
>> ERROR:  cannot use generated column in partition key
>> LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
>>                                                              ^
>> DETAIL:  Generated column "f3" is part of the partition key of
>> relation "gtest_part_key"
>>
>
> The DETAIL here is just mentioning what's already known from the command, so the change in the DETAIL may not be
useful.
>
> If I understand your intention correctly, we have to mention something to the effect "partition key expression
containsa whole-row reference which in turn contains a generated column." But that might be too verbose. 
>

you understand my intention correctly,
I aggrees current error message in that location is not that bad.

overall your latest patch looks good to me.



Re: bug: virtual generated column can be partition key

From
Yura Sokolov
Date:
21.04.2025 05:30, jian he пишет:
> hi.
> While trying to make the virtual generated column be part of the partition key,
> I found this bug.
> it also influences the stored generated column, i added a test
> on generated_stored.sql.
> 
> CREATE TABLE gtest_part_key (
>     f1 date NOT NULL, f2 bigint,
>     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
>     PARTITION BY RANGE (f3);
> 
> ERROR:  cannot use generated column in partition key
> LINE 4:     PARTITION BY RANGE (f3);
>                                 ^
> DETAIL:  Column "f3" is a generated column.
> 
> the following is essentially the same as above, it should also fail.
> 
> CREATE TABLE gtest_part_key (
>     f1 date NOT NULL, f2 bigint,
>     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
>     PARTITION BY RANGE ((f3));

I don't understand why latter should fail?

Documentation says [1]:

> PostgreSQL allows you to declare that a table is divided into partitions.
> The table that is divided is referred to as a partitioned table.
> The declaration includes the partitioning method as described above,
> plus a list of columns or expressions to be used as the partition key.

Note: "list of columns or EXPRESSIONS"!

In first case you pass list of columns (which contains single column f3). I
don't get which internal restriction forces it to fail, really, but ok:
there is restriction on COLUMNS LIST and it must be obeyed.

But in second case you pass EXPRESSION, and I don't think same restriction
should be applied.

More over, if you look into comments on restriction on GENERATED columns
[2] [3], you will find this restriction is because of nature of STORED
generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
suggestion for future to treat GENERATED VIRTUAL columns as expressions.

So, if you want to force some restriction, you should force it only against
STORED columns, but not VIRTUAL ones.
Of cause, if VIRTUAL column depends on STORED one, it is still should be
forbidden.

Certainly, it is my opinion and I could be mistaken somewhere.
For example, you may say "it is too hard to check dependency of VIRTUAL
column at the moment, so it is simpler to forbid them as well". But then it
should be clearly stated in commit messages and code comments.

[1]
https://www.postgresql.org/docs/17/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
[2]

https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19735-L19741
[3]

https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19821-L19828

-- 
regards
Yura Sokolov aka funny-falcon



Re: bug: virtual generated column can be partition key

From
jian he
Date:
On Tue, May 6, 2025 at 5:57 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>
> 21.04.2025 05:30, jian he пишет:
> > hi.
> > While trying to make the virtual generated column be part of the partition key,
> > I found this bug.
> > it also influences the stored generated column, i added a test
> > on generated_stored.sql.
> >
> > CREATE TABLE gtest_part_key (
> >     f1 date NOT NULL, f2 bigint,
> >     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
> >     PARTITION BY RANGE (f3);
> >
> > ERROR:  cannot use generated column in partition key
> > LINE 4:     PARTITION BY RANGE (f3);
> >                                 ^
> > DETAIL:  Column "f3" is a generated column.
> >
> > the following is essentially the same as above, it should also fail.
> >
> > CREATE TABLE gtest_part_key (
> >     f1 date NOT NULL, f2 bigint,
> >     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
> >     PARTITION BY RANGE ((f3));
>
> I don't understand why latter should fail?
>
> Documentation says [1]:
>
> > PostgreSQL allows you to declare that a table is divided into partitions.
> > The table that is divided is referred to as a partitioned table.
> > The declaration includes the partitioning method as described above,
> > plus a list of columns or expressions to be used as the partition key.
>
> Note: "list of columns or EXPRESSIONS"!
>
> In first case you pass list of columns (which contains single column f3). I
> don't get which internal restriction forces it to fail, really, but ok:
> there is restriction on COLUMNS LIST and it must be obeyed.
>
> But in second case you pass EXPRESSION, and I don't think same restriction
> should be applied.
>
> More over, if you look into comments on restriction on GENERATED columns
> [2] [3], you will find this restriction is because of nature of STORED
> generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
> suggestion for future to treat GENERATED VIRTUAL columns as expressions.
>

hi.
you already pointed out the problem.
As of now GENERATED VIRTUAL columns are not supported as partition keys,
therefore we need to disallow generated columns being referenced in
the partition key in any representation.


currently in a matser:
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE (f3);
will yield error, however
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE ((f3));
will not.
but these two form essentially the same thing.


"PARTITION BY RANGE ((f3));"  and "PARTITION BY RANGE ((whole_row));"
are the corner cases we are trying to catch.

you may also see ComputePartitionAttrs.



Re: bug: virtual generated column can be partition key

From
Yura Sokolov
Date:
06.05.2025 13:31, jian he пишет:
> On Tue, May 6, 2025 at 5:57 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>>
>> 21.04.2025 05:30, jian he пишет:
>>> hi.
>>> While trying to make the virtual generated column be part of the partition key,
>>> I found this bug.
>>> it also influences the stored generated column, i added a test
>>> on generated_stored.sql.
>>>
>>> CREATE TABLE gtest_part_key (
>>>     f1 date NOT NULL, f2 bigint,
>>>     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
>>>     PARTITION BY RANGE (f3);
>>>
>>> ERROR:  cannot use generated column in partition key
>>> LINE 4:     PARTITION BY RANGE (f3);
>>>                                 ^
>>> DETAIL:  Column "f3" is a generated column.
>>>
>>> the following is essentially the same as above, it should also fail.
>>>
>>> CREATE TABLE gtest_part_key (
>>>     f1 date NOT NULL, f2 bigint,
>>>     f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
>>>     PARTITION BY RANGE ((f3));
>>
>> I don't understand why latter should fail?
>>
>> Documentation says [1]:
>>
>>> PostgreSQL allows you to declare that a table is divided into partitions.
>>> The table that is divided is referred to as a partitioned table.
>>> The declaration includes the partitioning method as described above,
>>> plus a list of columns or expressions to be used as the partition key.
>>
>> Note: "list of columns or EXPRESSIONS"!
>>
>> In first case you pass list of columns (which contains single column f3). I
>> don't get which internal restriction forces it to fail, really, but ok:
>> there is restriction on COLUMNS LIST and it must be obeyed.
>>
>> But in second case you pass EXPRESSION, and I don't think same restriction
>> should be applied.
>>
>> More over, if you look into comments on restriction on GENERATED columns
>> [2] [3], you will find this restriction is because of nature of STORED
>> generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
>> suggestion for future to treat GENERATED VIRTUAL columns as expressions.
>>
> 
> hi.
> you already pointed out the problem.
> As of now GENERATED VIRTUAL columns are not supported as partition keys,
> therefore we need to disallow generated columns being referenced in
> the partition key in any representation.

I don't see why "we need to disallow". There are no fundamental reasons.

May be it is better to allow GENERATED VIRTUAL columns? But still forbid
GENERATED STORED columns because there are reasons for.

> currently in a matser:
> CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
> ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE (f3);
> will yield error, however
> CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
> ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE ((f3));
> will not.
> but these two form essentially the same thing.
> 
> 
> "PARTITION BY RANGE ((f3));"  and "PARTITION BY RANGE ((whole_row));"
> are the corner cases we are trying to catch.
> 
> you may also see ComputePartitionAttrs.

I saw it. And I gave links to comments in this function. This comments
clearly state, there is no real reason to forbid GENERATED VIRTUAL columns
(in opposite to STORED). They are forbidden just because author had no
time/wish to finish their support.

-- 
regards
Yura Sokolov aka funny-falcon