Thread: Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Masahiro Ikeda
Date:
On 2025-01-03 01:25, Robert Treat wrote:
> On Tue, Dec 3, 2024 at 3:13 AM Masahiro Ikeda 
> <ikedamsh@oss.nttdata.com> wrote:
>> 
>> Hi,
>> 
>> The documentation seems to overlook the rewrite condition
>> when executing ALTER TABLE ADD COLUMN.
>> 
>> The current document states that a volatile DEFAULT will
>> trigger a rewrite of the table and its indexes. However, the
>> table and its indexes will also be rewritten when an IDENTITY
>> column is added, or when a column with a domain data type that
>> has constraints is added.
>> 
>> What do you think?
>> 
> 
> We still see a number of people asking (or confused) about table
> rewrites when adding columns, so I think the initial tip should
> remain, though I think it can be cleaned up a little.
> 
> In the second section (alter_table.sgml) I liked the idea of adding
> these additional examples, though I tweaked the wording a bit to
> (hopefully) make it a little easier to read.
> 
> Modified patch attached.

Thanks! It looks good to me with one minor comment.

Is the following intended to remove "However"? It seems that we don't
need to modify the lines if the initial tip remains.

      <para>
-     However, if the default value is volatile (e.g.,
-     <function>clock_timestamp()</function>)
+     If the default value is volatile (e.g., 
<function>clock_timestamp()</function>)
       each row will need to be updated with the value calculated at the 
time

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION



On Mon, Jan 6, 2025 at 3:18 AM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote:
>
> On 2025-01-03 01:25, Robert Treat wrote:
> > On Tue, Dec 3, 2024 at 3:13 AM Masahiro Ikeda
> > <ikedamsh@oss.nttdata.com> wrote:
> >>
> >> Hi,
> >>
> >> The documentation seems to overlook the rewrite condition
> >> when executing ALTER TABLE ADD COLUMN.
> >>
> >> The current document states that a volatile DEFAULT will
> >> trigger a rewrite of the table and its indexes. However, the
> >> table and its indexes will also be rewritten when an IDENTITY
> >> column is added, or when a column with a domain data type that
> >> has constraints is added.
> >>
> >> What do you think?
> >>
> >
> > We still see a number of people asking (or confused) about table
> > rewrites when adding columns, so I think the initial tip should
> > remain, though I think it can be cleaned up a little.
> >
> > In the second section (alter_table.sgml) I liked the idea of adding
> > these additional examples, though I tweaked the wording a bit to
> > (hopefully) make it a little easier to read.
> >
> > Modified patch attached.
>
> Thanks! It looks good to me with one minor comment.
>
> Is the following intended to remove "However"? It seems that we don't
> need to modify the lines if the initial tip remains.
>
>       <para>
> -     However, if the default value is volatile (e.g.,
> -     <function>clock_timestamp()</function>)
> +     If the default value is volatile (e.g.,
> <function>clock_timestamp()</function>)
>        each row will need to be updated with the value calculated at the
> time
>

Technically speaking, because we split the tip into two distinct
paragraphs, use of the word however would be considered poor grammar,
though I'll admit I only removed it because it felt superfluous.


Robert Treat
https://xzilla.net



Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Masahiro Ikeda
Date:
On 2025-01-07 06:27, Robert Treat wrote:
> On Mon, Jan 6, 2025 at 3:18 AM Masahiro Ikeda 
> <ikedamsh@oss.nttdata.com> wrote:
>> 
>> On 2025-01-03 01:25, Robert Treat wrote:
>> > On Tue, Dec 3, 2024 at 3:13 AM Masahiro Ikeda
>> > <ikedamsh@oss.nttdata.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> The documentation seems to overlook the rewrite condition
>> >> when executing ALTER TABLE ADD COLUMN.
>> >>
>> >> The current document states that a volatile DEFAULT will
>> >> trigger a rewrite of the table and its indexes. However, the
>> >> table and its indexes will also be rewritten when an IDENTITY
>> >> column is added, or when a column with a domain data type that
>> >> has constraints is added.
>> >>
>> >> What do you think?
>> >>
>> >
>> > We still see a number of people asking (or confused) about table
>> > rewrites when adding columns, so I think the initial tip should
>> > remain, though I think it can be cleaned up a little.
>> >
>> > In the second section (alter_table.sgml) I liked the idea of adding
>> > these additional examples, though I tweaked the wording a bit to
>> > (hopefully) make it a little easier to read.
>> >
>> > Modified patch attached.
>> 
>> Thanks! It looks good to me with one minor comment.
>> 
>> Is the following intended to remove "However"? It seems that we don't
>> need to modify the lines if the initial tip remains.
>> 
>>       <para>
>> -     However, if the default value is volatile (e.g.,
>> -     <function>clock_timestamp()</function>)
>> +     If the default value is volatile (e.g.,
>> <function>clock_timestamp()</function>)
>>        each row will need to be updated with the value calculated at 
>> the
>> time
>> 
> 
> Technically speaking, because we split the tip into two distinct
> paragraphs, use of the word however would be considered poor grammar,
> though I'll admit I only removed it because it felt superfluous.

OK, thanks for your comments.

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION



Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Álvaro Herrera
Date:
Hello,

I have pushed this patch now, with some tiny changes.  (I am not a
believer of the semicolon added by commit d31e2a495b6f before the word
"and").

Also, I didn't think that changing a column type was sufficiently close
to the restrictions of column addition to belong in the same
enumeration, so the first phrase is now (note the "as will" bit at the
end):

    Adding a column with a volatile <literal>DEFAULT</literal>
    (e.g., <function>clock_timestamp()</function>), a generated column
    (e.g., <literal>GENERATED BY DEFAULT AS IDENTITY</literal>), a domain
    data type with constraints will require the entire table and its
    indexes to be rewritten, as will changing the type of an existing
    column.


Thank you!

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"It takes less than 2 seconds to get to 78% complete; that's a good sign.
A few seconds later it's at 90%, but it seems to have stuck there.  Did
somebody make percentages logarithmic while I wasn't looking?"
                http://smylers.hates-software.com/2005/09/08/1995c749.html



hi.

https://git.postgresql.org/cgit/postgresql.git/commit/?id=11bd8318602fc2282a6201f714c15461dc2009c6

+ Adding a column with a volatile <literal>DEFAULT</literal>
+ (e.g., <function>clock_timestamp()</function>), a generated column
+ (e.g., <literal>GENERATED BY DEFAULT AS IDENTITY</literal>), a domain
+ data type with constraints will require the entire table and its
+ indexes to be rewritten, as will changing the type of an existing
+ column. As an exception, when changing the type of an existing column,
+ if the <literal>USING</literal> clause does not change the column
+ contents and the old type is either binary coercible to the new type
+ or an unconstrained domain over the new type, a table rewrite is not
+ needed.

In the current development branch,
virtual generated columns still do not support the domain.
you can not change the generation expression if it contains a check
constraint on it.
so virtual generated columns don't need rewriting.

IMHO, the committed doc description didn't mention this exception.
we need some text to cover this exception?



Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Álvaro Herrera
Date:
On 2025-Mar-24, jian he wrote:

> In the current development branch,
> virtual generated columns still do not support the domain.
> you can not change the generation expression if it contains a check
> constraint on it.
> so virtual generated columns don't need rewriting.
> 
> IMHO, the committed doc description didn't mention this exception.
> we need some text to cover this exception?

I'd add a note about these two things to the open items page, and wait
to see if we get some of these limitations fixed, so that if we don't,
we remember to note this limitation in the documentation.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
                                 (George Orwell's The Lord of the Rings)



This one was briefly discussed in an RMT meeting.

On Mon, Mar 24, 2025 at 11:37:20AM +0100, Álvaro Herrera wrote:
> On 2025-Mar-24, jian he wrote:
> 
>> In the current development branch,
>> virtual generated columns still do not support the domain.
>> you can not change the generation expression if it contains a check
>> constraint on it.
>> so virtual generated columns don't need rewriting.
>> 
>> IMHO, the committed doc description didn't mention this exception.
>> we need some text to cover this exception?
> 
> I'd add a note about these two things to the open items page, and wait
> to see if we get some of these limitations fixed, so that if we don't,
> we remember to note this limitation in the documentation.

Are we still waiting on something for this, or should we proceed with the
documentation changes?  It doesn't seem tremendously urgent, but I noticed
it's been about a month since the last message on this thread.

-- 
nathan



Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Álvaro Herrera
Date:
On 2025-Apr-23, Nathan Bossart wrote:

> On Mon, Mar 24, 2025 at 11:37:20AM +0100, Álvaro Herrera wrote:

> > I'd add a note about these two things to the open items page, and wait
> > to see if we get some of these limitations fixed, so that if we don't,
> > we remember to note this limitation in the documentation.
> 
> Are we still waiting on something for this, or should we proceed with the
> documentation changes?  It doesn't seem tremendously urgent, but I noticed
> it's been about a month since the last message on this thread.

I've edited the Open Items page to disclaim my responsibility from this
item, since this comes from virtual generated columns which is not my
turf.  I think we should just document the current state of affairs; we
can come back with further code improvements during the next cycle.

Thanks,

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"This is a foot just waiting to be shot"                (Andrew Dunstan)



Re: Doc: fix the rewrite condition when executing ALTER TABLE ADD COLUMN

From
Peter Eisentraut
Date:
On 28.04.25 18:56, Álvaro Herrera wrote:
> On 2025-Apr-23, Nathan Bossart wrote:
> 
>> On Mon, Mar 24, 2025 at 11:37:20AM +0100, Álvaro Herrera wrote:
> 
>>> I'd add a note about these two things to the open items page, and wait
>>> to see if we get some of these limitations fixed, so that if we don't,
>>> we remember to note this limitation in the documentation.
>>
>> Are we still waiting on something for this, or should we proceed with the
>> documentation changes?  It doesn't seem tremendously urgent, but I noticed
>> it's been about a month since the last message on this thread.
> 
> I've edited the Open Items page to disclaim my responsibility from this
> item, since this comes from virtual generated columns which is not my
> turf.  I think we should just document the current state of affairs; we
> can come back with further code improvements during the next cycle.

Here is a proposed patch that includes some text about virtual generated 
columns and also fixes up a small mistake in the previous patch 
(confused identity and generated columns) and improves the wording and 
formatting a bit more.

Attachment
On Wed, Apr 30, 2025 at 5:15 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 28.04.25 18:56, Álvaro Herrera wrote:
> > On 2025-Apr-23, Nathan Bossart wrote:
> >
> >> On Mon, Mar 24, 2025 at 11:37:20AM +0100, Álvaro Herrera wrote:
> >
> >>> I'd add a note about these two things to the open items page, and wait
> >>> to see if we get some of these limitations fixed, so that if we don't,
> >>> we remember to note this limitation in the documentation.
> >>
> >> Are we still waiting on something for this, or should we proceed with the
> >> documentation changes?  It doesn't seem tremendously urgent, but I noticed
> >> it's been about a month since the last message on this thread.
> >
> > I've edited the Open Items page to disclaim my responsibility from this
> > item, since this comes from virtual generated columns which is not my
> > turf.  I think we should just document the current state of affairs; we
> > can come back with further code improvements during the next cycle.
>
> Here is a proposed patch that includes some text about virtual generated
> columns and also fixes up a small mistake in the previous patch
> (confused identity and generated columns) and improves the wording and
> formatting a bit more.

If I were going to quibble, I'd probably rewrite the second paragraph as

+    Changing the type of an existing column will normally cause the
entire table
+    and its indexes to be rewritten.
+    As an exception, when changing the type of an existing column,
     if the <literal>USING</literal> clause does not change the column
     contents and the old type is either binary coercible to the new type
     or an unconstrained domain over the new type, a table rewrite is not
-    needed.  However, indexes must always be rebuilt unless the system
+    needed.  However, indexes will still need to be rebuilt unless the system
     can verify that the new index would be logically equivalent to the
     existing one.  For example, if the collation for a column has been
     changed, an index rebuild is required because the new sort
     order might be different.  However, in the absence of a collation
     change, a column can be changed from <type>text</type> to
     <type>varchar</type> (or vice versa) without rebuilding the indexes
-    because these data types sort identically.  Table and/or index
+    because these data types sort identically.

But otherwise this LGTM.


Robert Treat
https://xzilla.net



On 30.04.25 17:17, Robert Treat wrote:
> On Wed, Apr 30, 2025 at 5:15 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> On 28.04.25 18:56, Álvaro Herrera wrote:
>>> On 2025-Apr-23, Nathan Bossart wrote:
>>>
>>>> On Mon, Mar 24, 2025 at 11:37:20AM +0100, Álvaro Herrera wrote:
>>>
>>>>> I'd add a note about these two things to the open items page, and wait
>>>>> to see if we get some of these limitations fixed, so that if we don't,
>>>>> we remember to note this limitation in the documentation.
>>>>
>>>> Are we still waiting on something for this, or should we proceed with the
>>>> documentation changes?  It doesn't seem tremendously urgent, but I noticed
>>>> it's been about a month since the last message on this thread.
>>>
>>> I've edited the Open Items page to disclaim my responsibility from this
>>> item, since this comes from virtual generated columns which is not my
>>> turf.  I think we should just document the current state of affairs; we
>>> can come back with further code improvements during the next cycle.
>>
>> Here is a proposed patch that includes some text about virtual generated
>> columns and also fixes up a small mistake in the previous patch
>> (confused identity and generated columns) and improves the wording and
>> formatting a bit more.
> 
> If I were going to quibble, I'd probably rewrite the second paragraph as
> 
> +    Changing the type of an existing column will normally cause the
> entire table
> +    and its indexes to be rewritten.
> +    As an exception, when changing the type of an existing column,
>       if the <literal>USING</literal> clause does not change the column
>       contents and the old type is either binary coercible to the new type
>       or an unconstrained domain over the new type, a table rewrite is not
> -    needed.  However, indexes must always be rebuilt unless the system
> +    needed.  However, indexes will still need to be rebuilt unless the system
>       can verify that the new index would be logically equivalent to the
>       existing one.  For example, if the collation for a column has been
>       changed, an index rebuild is required because the new sort
>       order might be different.  However, in the absence of a collation
>       change, a column can be changed from <type>text</type> to
>       <type>varchar</type> (or vice versa) without rebuilding the indexes
> -    because these data types sort identically.  Table and/or index
> +    because these data types sort identically.
> 
> But otherwise this LGTM.

Thanks, I have worked this into the commit.  I wanted to get away from 
phrases like "needs to be rebuilt" or "must be rebuilt" because that 
possibly sounds like you have to do it manually.  So I tweaked it a bit 
further, but I think I captured the essence.

I'm closing the open item.




On Thu, May 1, 2025 at 3:22 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 30.04.25 17:17, Robert Treat wrote:
> > But otherwise this LGTM.
>
> Thanks, I have worked this into the commit.  I wanted to get away from
> phrases like "needs to be rebuilt" or "must be rebuilt" because that
> possibly sounds like you have to do it manually.  So I tweaked it a bit
> further, but I think I captured the essence.
>

Yeah, that was my thinking as well; commit looks good.

> I'm closing the open item.
>

Thanks Peter.

Robert Treat
https://xzilla.net