Thread: Should we document how column DEFAULT expressions work?
Hello, It's possible I'm the only one who's been in this situation, but I've multiple times found myself explaining to a user how column DEFAULT expressions work: namely how the quoting on an expression following the keyword DEFAULT controls whether or not the expression is evaluated at the time of the DDL statement or at the time of an insertion. In my experience this is non-obvious to users, and the quoting makes a big difference. Is this something that we should document explicitly? I don't see it called out in the CREATE TABLE reference page, but it's possible I'm missing something. Thanks, James Coleman
James Coleman <jtc331@gmail.com> writes: > It's possible I'm the only one who's been in this situation, but I've > multiple times found myself explaining to a user how column DEFAULT > expressions work: namely how the quoting on an expression following > the keyword DEFAULT controls whether or not the expression is > evaluated at the time of the DDL statement or at the time of an > insertion. Uh ... what? I recall something about that with respect to certain features such as nextval(), but you're making it sound like there is something generic going on with DEFAULT. regards, tom lane
On Tue, Jun 25, 2024 at 4:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > James Coleman <jtc331@gmail.com> writes: > > It's possible I'm the only one who's been in this situation, but I've > > multiple times found myself explaining to a user how column DEFAULT > > expressions work: namely how the quoting on an expression following > > the keyword DEFAULT controls whether or not the expression is > > evaluated at the time of the DDL statement or at the time of an > > insertion. > > Uh ... what? I recall something about that with respect to certain > features such as nextval(), but you're making it sound like there > is something generic going on with DEFAULT. Hmm, I guess I'd never considered anything besides cases like nextval() and now(), but I see now that now() must also be special cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't work but 'now()'::timestamp does. So I guess what I'm asking about would be limited to those cases (I assume there are a few others...but I haven't gone digging through the source yet). Regards, James Coleman
James Coleman <jtc331@gmail.com> writes: > On Tue, Jun 25, 2024 at 4:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Uh ... what? I recall something about that with respect to certain >> features such as nextval(), but you're making it sound like there >> is something generic going on with DEFAULT. > Hmm, I guess I'd never considered anything besides cases like > nextval() and now(), but I see now that now() must also be special > cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't > work but 'now()'::timestamp does. Hmm, both of those behaviors are documented, but not in the same place and possibly not anywhere near where you looked for info about DEFAULT. For instance, the Tip at the bottom of section 9.9.5 https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT explains about how 'now'::timestamp isn't what to use in DEFAULT. regards, tom lane
On Tue, Jun 25, 2024 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
James Coleman <jtc331@gmail.com> writes:
> On Tue, Jun 25, 2024 at 4:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Uh ... what? I recall something about that with respect to certain
>> features such as nextval(), but you're making it sound like there
>> is something generic going on with DEFAULT.
> Hmm, I guess I'd never considered anything besides cases like
> nextval() and now(), but I see now that now() must also be special
> cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't
> work but 'now()'::timestamp does.
Hmm, both of those behaviors are documented, but not in the same place
and possibly not anywhere near where you looked for info about
DEFAULT. For instance, the Tip at the bottom of section 9.9.5
https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT
explains about how 'now'::timestamp isn't what to use in DEFAULT.
I'd suggest adding to:
DEFAULT default_expr
The DEFAULT clause assigns a default data value for the column whose column definition it appears within. The value is any variable-free expression (in particular, cross-references to other columns in the current table are not allowed). Subqueries are not allowed either. The data type of the default expression must match the data type of the column.
The default expression will be used in any insert operation that does not specify a value for the column. If there is no default for a column, then the default is null.
The DEFAULT clause assigns a default data value for the column whose column definition it appears within. The value is any variable-free expression (in particular, cross-references to other columns in the current table are not allowed). Subqueries are not allowed either. The data type of the default expression must match the data type of the column.
The default expression will be used in any insert operation that does not specify a value for the column. If there is no default for a column, then the default is null.
+ Be aware that the [special timestamp values 1] are resolved immediately, not upon insert. Use the [date/time constructor functions 2] to produce a time relative to the future insertion.
David J.
On Wed, 26 Jun 2024 at 13:31, David G. Johnston <david.g.johnston@gmail.com> wrote: > I'd suggest adding to: > > DEFAULT default_expr > The DEFAULT clause assigns a default data value for the column whose column definition it appears within. The value isany variable-free expression (in particular, cross-references to other columns in the current table are not allowed). Subqueriesare not allowed either. The data type of the default expression must match the data type of the column. > > The default expression will be used in any insert operation that does not specify a value for the column. If there is nodefault for a column, then the default is null. > > + Be aware that the [special timestamp values 1] are resolved immediately, not upon insert. Use the [date/time constructorfunctions 2] to produce a time relative to the future insertion. FWIW, I disagree that we need to write anything about that in this part of the documentation. I think any argument for doing this could equally be applied to something like re-iterating what the operator precedence rules for arithmetic are, and I don't think that should be mentioned. Also, what about all the other places where someone could use one of the special timestamp input values? Should CREATE VIEW get a memo too? How about PREPARE? If people don't properly understand these special timestamp input values, then maybe the documentation in [1] needs to be improved. At the moment the details are within parentheses. Namely "(In particular, now and related strings are converted to a specific time value as soon as they are read.)". Maybe it would be better to be more explicit there and mention that these are special values that the input function understands which are translated to actual timestamp values when the type's input function is called. That could maybe be tied into the DEFAULT clause documentation to mention that the input function for constant values is called at DML time rather than DDL time. That way, we're not adding these (unsustainable) special cases to the documentation. David [1] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-SPECIAL-VALUES
David Rowley <dgrowleyml@gmail.com> writes: > If people don't properly understand these special timestamp input > values, then maybe the documentation in [1] needs to be improved. At > the moment the details are within parentheses. Namely "(In particular, > now and related strings are converted to a specific time value as soon > as they are read.)". Maybe it would be better to be more explicit > there and mention that these are special values that the input > function understands which are translated to actual timestamp values > when the type's input function is called. That could maybe be tied > into the DEFAULT clause documentation to mention that the input > function for constant values is called at DML time rather than DDL > time. That way, we're not adding these (unsustainable) special cases > to the documentation. This sounds like a reasonable approach to me for the magic-input-values issue. Do we want to do anything about nextval()? I guess if you hold your head at the correct angle, that's also a magic-input-value issue, in the sense that the question is when does regclass input get resolved. regards, tom lane
On Tue, Jun 25, 2024 at 9:50 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 26 Jun 2024 at 13:31, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I'd suggest adding to:
>
> DEFAULT default_expr
> The DEFAULT clause assigns a default data value for the column whose column definition it appears within. The value is any variable-free expression (in particular, cross-references to other columns in the current table are not allowed). Subqueries are not allowed either. The data type of the default expression must match the data type of the column.
>
> The default expression will be used in any insert operation that does not specify a value for the column. If there is no default for a column, then the default is null.
>
> + Be aware that the [special timestamp values 1] are resolved immediately, not upon insert. Use the [date/time constructor functions 2] to produce a time relative to the future insertion.
Annoyingly even this advice isn't correct:
postgres=# create table tdts2 (ts timestamptz default 'now()');
CREATE TABLE
postgres=# \d tdts2
Table "public.tdts2"
Column | Type | Collation | Nullable | Default
--------+--------------------------+-----------+----------+-------------------------------------------
----------------
ts | timestamp with time zone | | | '2024-06-25 18:05:33.055377-07'::timestamp
with time zone
CREATE TABLE
postgres=# \d tdts2
Table "public.tdts2"
Column | Type | Collation | Nullable | Default
--------+--------------------------+-----------+----------+-------------------------------------------
----------------
ts | timestamp with time zone | | | '2024-06-25 18:05:33.055377-07'::timestamp
with time zone
I expected writing what looked like the function now() to be delayed evaluated but since I put it into quotes, the OPs complaint, it got read as the literal with ignored extra bits.
FWIW, I disagree that we need to write anything about that in this
part of the documentation. I think any argument for doing this could
equally be applied to something like re-iterating what the operator
precedence rules for arithmetic are, and I don't think that should be
mentioned.
I disagree on this equivalence. The time literals are clear deviations from expected behavior. Knowing operator precedence rules, they apply everywhere equally. And we should document the deviations directly where they happen. Even if it's just a short link back to the source that describes the deviation. I'm fine with something less verbose pointing only to the data types page, but not with nothing.
Also, what about all the other places where someone could
use one of the special timestamp input values? Should CREATE VIEW get
a memo too? How about PREPARE?
Yes.
If people don't properly understand these special timestamp input
values, then maybe the documentation in [1] needs to be improved.
Recall, and awareness, is the greater issue, not comprehension. This intends to increase the former. I don't believe the latter is an issue, though I haven't deep dived into it.
And the whole type casting happening right away just seems misleading.
postgres=# create table testboold2 (expr boolean default boolean 'false');
CREATE TABLE
postgres=# \d testboold2
Table "public.testboold2"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
expr | boolean | | | false
CREATE TABLE
postgres=# \d testboold2
Table "public.testboold2"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
expr | boolean | | | false
I would expect 'f' in the default column if the boolean casting of the literal happened sooner. Or I'd expect to see "boolean 'false'" as the default expression if it is captured as-is.
So yes, saving an expression into the default column has nuances that should be documented where default is defined.
Maybe the wording needs to be:
"If the default expression contains any constants [1] they are converted into their typed value during create table execution. Thus time constants [1] save into the default expression the time the command was executed."
I'd be happy to be pointed to other constants that resolve to an execution-time specific environment in a similar manner. If there is another one I'll rethink the wisdom of trying to document all of them in each place. But reminding people that time is special and we have these special values seems to provide meaningful reader benefit for the cost of a couple of sentences repeated in a few places. That were valid a decade ago no more or less than they are valid now.
David J.
On Tue, Jun 25, 2024 at 10:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> If people don't properly understand these special timestamp input
> values, then maybe the documentation in [1] needs to be improved. At
> the moment the details are within parentheses. Namely "(In particular,
> now and related strings are converted to a specific time value as soon
> as they are read.)". Maybe it would be better to be more explicit
> there and mention that these are special values that the input
> function understands which are translated to actual timestamp values
> when the type's input function is called. That could maybe be tied
> into the DEFAULT clause documentation to mention that the input
> function for constant values is called at DML time rather than DDL
> time. That way, we're not adding these (unsustainable) special cases
> to the documentation.
This sounds like a reasonable approach to me for the
magic-input-values issue. Do we want to do anything about
nextval()? I guess if you hold your head at the correct
angle, that's also a magic-input-value issue, in the sense
that the question is when does regclass input get resolved.
From observations we transform constants into the: " 'value'::type " syntax which then makes it an operator resolved at execution time. For every type except time types the transformation leaves the constant as-is. The special time values are the exception whereby they get evaluated to a specific time during the transformation.
postgres=# create table tser3 (id integer not null default nextval(regclass 'tser2_id_seq'));
CREATE TABLEpostgres=# \d tser3
Table "public.tser3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+-----------------------------------
id | integer | | not null | nextval('tser2_id_seq'::regclass)
I cannot figure out how to get "early binding" into the default. I.e., nextval(9000)
Since early binding is similar to the special timestamp behavior I'd say nextval is behaving just as expected - literal transform, no evaluation. We need only document the transforms that also evaluate.
David J.
On 2024-Jun-25, David G. Johnston wrote: > On Tue, Jun 25, 2024 at 9:50 PM David Rowley <dgrowleyml@gmail.com> wrote: > > FWIW, I disagree that we need to write anything about that in this > > part of the documentation. I think any argument for doing this could > > equally be applied to something like re-iterating what the operator > > precedence rules for arithmetic are, and I don't think that should be > > mentioned. > > I disagree on this equivalence. The time literals are clear deviations > from expected behavior. Knowing operator precedence rules, they apply > everywhere equally. And we should document the deviations directly where > they happen. Even if it's just a short link back to the source that > describes the deviation. I'm fine with something less verbose pointing > only to the data types page, but not with nothing. I agree that it'd be good to have _something_ -- the other stance seems super unhelpful. "We're not going to spend two lines to explain some funny rules that determine surprising behavior here, because we assume you have read all of our other 3000 pages of almost impenetrably dense documentation" is not great from a user's point of view. The behavior of 'now' in DEFAULT clauses is something that has been asked about for decades. Arithmetic precedence is a terrible straw man argument. Let's put that aside. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "If it is not right, do not do it. If it is not true, do not say it." (Marcus Aurelius, Meditations)
On Wed, 26 Jun 2024 at 11:05, James Coleman <jtc331@gmail.com> wrote: > Hmm, I guess I'd never considered anything besides cases like > nextval() and now(), but I see now that now() must also be special > cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't > work but 'now()'::timestamp does. 'now()'::timestamp only works because we ignore trailing punctuation in ParseDateTime() during timestamp_in(). 'now!!'::timestamp works equally as well. David
On Tuesday, June 25, 2024, James Coleman <jtc331@gmail.com> wrote:
Hello,
It's possible I'm the only one who's been in this situation, but I've
multiple times found myself explaining to a user how column DEFAULT
expressions work: namely how the quoting on an expression following
the keyword DEFAULT controls whether or not the expression is
evaluated at the time of the DDL statement or at the time of an
insertion.
I don’t know if it’s worth documenting but the following sentence is implied by the syntax:
“Do not single quote the expression as a whole. Write the expression as you would in a select query.”
David J.
On Wed, 26 Jun 2024 at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Do we want to do anything about > nextval()? I guess if you hold your head at the correct > angle, that's also a magic-input-value issue, in the sense > that the question is when does regclass input get resolved. I think I'm not understanding what's special about that. Aren't 'now'::timestamp and 'seq_name'::regclass are just casts that are evaluated during parse time in transformExpr()? David
David Rowley <dgrowleyml@gmail.com> writes: > On Wed, 26 Jun 2024 at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Do we want to do anything about >> nextval()? I guess if you hold your head at the correct >> angle, that's also a magic-input-value issue, in the sense >> that the question is when does regclass input get resolved. > I think I'm not understanding what's special about that. Aren't > 'now'::timestamp and 'seq_name'::regclass are just casts that are > evaluated during parse time in transformExpr()? Right. But there is an example in the manual explaining how these two things act differently: 'seq_name'::regclass 'seq_name'::text::regclass The latter produces a constant of type text with a run-time cast to regclass (and hence a run-time pg_class lookup). IIRC, we document that mainly because the latter provides a way to duplicate nextval()'s old behavior of run-time lookup. Now that I think about it, there's a very parallel difference in the behavior of 'now'::timestamp 'now'::text::timestamp but I doubt that that example is shown anywhere. regards, tom lane
On Wed, 26 Jun 2024 at 17:36, David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Tue, Jun 25, 2024 at 9:50 PM David Rowley <dgrowleyml@gmail.com> wrote: >> FWIW, I disagree that we need to write anything about that in this >> part of the documentation. I think any argument for doing this could >> equally be applied to something like re-iterating what the operator >> precedence rules for arithmetic are, and I don't think that should be >> mentioned. > > > I disagree on this equivalence. The time literals are clear deviations from expected behavior. Knowing operator precedencerules, they apply everywhere equally. And we should document the deviations directly where they happen. Evenif it's just a short link back to the source that describes the deviation. I'm fine with something less verbose pointingonly to the data types page, but not with nothing. Are you able to share what the special behaviour is with DEFAULT constraints and time literals that does not apply everywhere equally? Maybe I'm slow on the uptake, but I've yet to see anything here where time literals act in a special way DEFAULT constraints. This is why I couldn't understand why we should be adding documentation about this under CREATE TABLE. I'd be happy to reconsider or retract my argument if you can show me what I'm missing. David
David Rowley <dgrowleyml@gmail.com> writes: > Maybe I'm slow on the uptake, but I've yet to see anything here where > time literals act in a special way DEFAULT constraints. This is why I > couldn't understand why we should be adding documentation about this > under CREATE TABLE. It's not that the parsing rules are any different: it's that in ordinary DML queries, it seldom matters very much whether a subexpression is evaluated at parse time versus run time. In CREATE TABLE that difference is very in-your-face, so people who haven't understood the rules clearly can get burnt. However, there are certainly other places where it matters, such as queries in plpgsql functions. So I understand your reluctance to go on about it in CREATE TABLE. At the same time, I see where David J. is coming from. Maybe we could have a discussion of this in some single spot, and link to it from CREATE TABLE and other relevant places? ISTR there is something about it in the plpgsql doco already. regards, tom lane
On Thu, 27 Jun 2024 at 12:11, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > Maybe I'm slow on the uptake, but I've yet to see anything here where > > time literals act in a special way DEFAULT constraints. This is why I > > couldn't understand why we should be adding documentation about this > > under CREATE TABLE. > > It's not that the parsing rules are any different: it's that in > ordinary DML queries, it seldom matters very much whether a > subexpression is evaluated at parse time versus run time. > In CREATE TABLE that difference is very in-your-face, so people > who haven't understood the rules clearly can get burnt. Aha, now I understand. Thanks. So, seems like CREATE TABLE is being targeted or maybe victimised here as it's probably the most common place people learn about their misuse of the timestamp special input values. > However, there are certainly other places where it matters, > such as queries in plpgsql functions. So I understand your > reluctance to go on about it in CREATE TABLE. At the same > time, I see where David J. is coming from. > > Maybe we could have a discussion of this in some single spot, > and link to it from CREATE TABLE and other relevant places? > ISTR there is something about it in the plpgsql doco already. For the special timestamp stuff, that place is probably the special timestamp table in [1]. It looks like the large caution you added in 540849814 might not be enough or perhaps wasn't done soon enough to catch the people who read that part of the manual before the caution was added. Hard to fix if it's the latter without a time machine. :-( I'm open to having some section that fleshes this stuff out a bit more with a few examples with CREATE TABLE and maybe CREATE VIEW that we can link to. Linking seems like a much more sustainable practice than adding special case documentation for non-special case behaviour. David [1] https://www.postgresql.org/docs/devel/datatype-datetime.html
On 27.06.24 02:34, David Rowley wrote: > For the special timestamp stuff, that place is probably the special > timestamp table in [1]. It looks like the large caution you added in > 540849814 might not be enough or perhaps wasn't done soon enough to > catch the people who read that part of the manual before the caution > was added. Hard to fix if it's the latter without a time machine. :-( Maybe we should really be thinking about deprecating these special values and steering users more urgently toward more robust alternatives. Imagine if 'random' were a valid input value for numeric types.
On Thu, 27 Jun 2024 at 23:57, Peter Eisentraut <peter@eisentraut.org> wrote: > Maybe we should really be thinking about deprecating these special > values and steering users more urgently toward more robust alternatives. > > Imagine if 'random' were a valid input value for numeric types. I think there are valid reasons to use the special timestamp input values. One that I can think of is for use with partition pruning. If you have a time-range partitioned table and want the planner to prune the partitions rather than the executor, you could use 'now'::timestamp in your queries to allow the planner to prune. That works providing that you never use that in combination with PREPARE and never put the query with the WHERE clause inside a VIEW. I don't have any other good examples, but I suppose that if someone needed to capture the time some statement was executed and record that somewhere, sort of like the __DATE__ and __TIME__ macros in C. Perhaps that's useful to record the last time some DDL script was executed. I'd like to know what led someone down the path of doing something like DEFAULT 'now()'::timestamp in a CREATE TABLE. Could it be a faulty migration tool that created these and people copy them thinking it's a legitimate syntax? David
On Sun, Jun 30, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote:
I'd like to know what led someone down the path of doing something
like DEFAULT 'now()'::timestamp in a CREATE TABLE. Could it be a
faulty migration tool that created these and people copy them thinking
it's a legitimate syntax?
My thought process on this used to be: Provide a text string of the expression that is then stored within the catalog and eval'd during runtime. If the only thing you are providing is a single literal and not some compound expression it isn't that obvious that you are supposed to provide an unquoted expression - which feels like it should be immediately evaluated - versus something that is a constant. Kinda like dynamic SQL.
David J.
On Mon, 1 Jul 2024 at 12:16, David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Sun, Jun 30, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote: >> >> >> I'd like to know what led someone down the path of doing something >> like DEFAULT 'now()'::timestamp in a CREATE TABLE. Could it be a >> faulty migration tool that created these and people copy them thinking >> it's a legitimate syntax? >> > > My thought process on this used to be: Provide a text string of the expression that is then stored within the catalogand eval'd during runtime. If the only thing you are providing is a single literal and not some compound expressionit isn't that obvious that you are supposed to provide an unquoted expression - which feels like it should be immediatelyevaluated - versus something that is a constant. Kinda like dynamic SQL. Thanks for sharing that. Any idea where that thinking came from? Maybe it was born from the fact that nothing complains when you do: 'now()'::timestamp? A quick test evaluation of that with a SELECT statement might trick someone into thinking it'll work. I wonder if there's anything else like this that might help fool people into thinking this is some valid way of getting delayed evaluation. David
On Sun, Jun 30, 2024 at 5:47 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 1 Jul 2024 at 12:16, David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Sun, Jun 30, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>>
>> I'd like to know what led someone down the path of doing something
>> like DEFAULT 'now()'::timestamp in a CREATE TABLE. Could it be a
>> faulty migration tool that created these and people copy them thinking
>> it's a legitimate syntax?
>>
>
> My thought process on this used to be: Provide a text string of the expression that is then stored within the catalog and eval'd during runtime. If the only thing you are providing is a single literal and not some compound expression it isn't that obvious that you are supposed to provide an unquoted expression - which feels like it should be immediately evaluated - versus something that is a constant. Kinda like dynamic SQL.
Thanks for sharing that. Any idea where that thinking came from?
Maybe it was born from the fact that nothing complains when you do:
'now()'::timestamp? A quick test evaluation of that with a SELECT
statement might trick someone into thinking it'll work.
I wonder if there's anything else like this that might help fool
people into thinking this is some valid way of getting delayed
evaluation.
I presume the relatively new atomic SQL functions pose a similar hazard.
It probably boils down, for me, that I learned about, though never used, eval functions from javascript, and figured this is probably implemented something like that and I should thus supply a string. Internalizing that DDL can treat the unquoted content of expression in "DEFAULT expression" as basically text hadn't happened; nor that the actual difference between just treating it as text and the parsing to a standard form that really happens, is quite important. Namely that, in reverse of expectations, quoted things, which are literals, are transformed to their typed values during parse while functions, which are not quoted, don't have a meaningfully different parsed form and are indeed executed at runtime.
The fact that 'now()'::timestamp fails to fail doesn't help...
Consider this phrasing for default:
The DEFAULT clause assigns a default data value for the column whose column definition it appears within. The expression is parsed according to Section X.X.X, with the limitation that it may neither include references to other columns nor subqueries, and then stored for later evaluation of any functions it contains. The data type of the default expression must match the data type of the column.
Then in Section X.X.X we note, in part:
During parsing, all constants are immediately converted to their internal representation. In particular, the time-related literals noted in Section 8.5.1.4 get set to their date/time values.
Then, in 8.5.1.4 we should call out:
Caution:
'now' is a special time value, evaluated during parsing.
now() is a function, evaluated during execution.
'now()' is a special time value due to the quoting, PostgreSQL ignored the parentheses.
The above doesn't make the special constants particularly special in how they behave within parse-bind-execute while still noting that what they do during parsing is a bit unique since a timestamp has not representation of 'tomorrow' that is can hold but instead is a short-hand for writing the constant representing "whatever tomorrow is" at that moment.
I hope the reason for the additional caution in this framing is intuitive for everyone.
There is probably a good paragraph or two that could be added under the new Section X.X.X to centralize this for views, atomic sql, defaults, etc... to refer to and give the reader the needed framing.
David J.
On Mon, 1 Jul 2024 at 13:41, David G. Johnston <david.g.johnston@gmail.com> wrote: > I presume the relatively new atomic SQL functions pose a similar hazard. Do you have an example of this? > The fact that 'now()'::timestamp fails to fail doesn't help... If that's the case, maybe a tiny step towards what Peter proposed is just to make trailing punctuation fail for timestamp special values in v18. David
On Sun, Jun 30, 2024 at 7:52 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 1 Jul 2024 at 13:41, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I presume the relatively new atomic SQL functions pose a similar hazard.
Do you have an example of this?
create function testnow() returns timestamptz language sql
select testnow();
select pg_sleep(5);select testnow(); -- same time as the first call
Which conforms with the documentation and expression parsing rules for literals:
"This form is parsed at function definition time, the string constant form is parsed at execution time;..."
David J.
On Sun, Jun 30, 2024 at 7:52 PM David Rowley <dgrowleyml@gmail.com> wrote:
If that's the case, maybe a tiny step towards what Peter proposed is
just to make trailing punctuation fail for timestamp special values in
v18.
I'm game. If anyone is using the ambiguous spelling it is probably to their benefit to have it break and realize they wanted a function expression, not a constant expression.
David J.
On Sun, Jun 30, 2024 at 8:16 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Sun, Jun 30, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote: >> >> >> I'd like to know what led someone down the path of doing something >> like DEFAULT 'now()'::timestamp in a CREATE TABLE. Could it be a >> faulty migration tool that created these and people copy them thinking >> it's a legitimate syntax? >> > > My thought process on this used to be: Provide a text string of the expression that is then stored within the catalogand eval'd during runtime. If the only thing you are providing is a single literal and not some compound expressionit isn't that obvious that you are supposed to provide an unquoted expression - which feels like it should be immediatelyevaluated - versus something that is a constant. Kinda like dynamic SQL. I have a similar story to tell: I've honestly never thought about it deeply until I started this thread, but just through experimentation a few things were obvious: - now() as a function call gives you the current timestamp in a query - now() as a function call in a DDL DEFAULT clause sets that as a default function call - Quoting that function call (using the function call syntax is the natural thing to try, I think, if you've already done the first two) -- because some examples online show quoting it -- gives you DDL time evaluation. So I suspect -- though I've been doing this for so long I couldn't tell you for certain -- that I largely intuitive the behavior by observation. And similarly to David J. I'd then assumed -- but never had a need to test it -- that this was generalized. I think DDL is also different conceptually from SQL/DML here in a kind of insidious way: the "bare" function call in DEFAULT is *not* executed as part of the query for DDL like it is with other queries. Hope this helps explain things. James Coleman
On 01.07.24 01:54, David Rowley wrote: > On Thu, 27 Jun 2024 at 23:57, Peter Eisentraut <peter@eisentraut.org> wrote: >> Maybe we should really be thinking about deprecating these special >> values and steering users more urgently toward more robust alternatives. >> >> Imagine if 'random' were a valid input value for numeric types. > > I think there are valid reasons to use the special timestamp input > values. One that I can think of is for use with partition pruning. If > you have a time-range partitioned table and want the planner to prune > the partitions rather than the executor, you could use > 'now'::timestamp in your queries to allow the planner to prune. Yeah, but is that a good user interface? Or is that just something that happens to work now with the pieces that happened to be there, rather than a really designed interface? Hypothetically, what would need to be done to make this work with now() or current_timestamp or something similar? Do we need a new stability level that somehow encompasses this behavior, so that the function call can be evaluated at planning time? > That > works providing that you never use that in combination with PREPARE > and never put the query with the WHERE clause inside a VIEW. And this kind of thing obviously makes this interface even worse.
Peter Eisentraut <peter@eisentraut.org> writes: > On 01.07.24 01:54, David Rowley wrote: >> I think there are valid reasons to use the special timestamp input >> values. One that I can think of is for use with partition pruning. If >> you have a time-range partitioned table and want the planner to prune >> the partitions rather than the executor, you could use >> 'now'::timestamp in your queries to allow the planner to prune. > Yeah, but is that a good user interface? Or is that just something that > happens to work now with the pieces that happened to be there, rather > than a really designed interface? That's not a very useful argument to make. What percentage of the SQL language as a whole is legacy cruft that we'd do differently if we could? I think the answer is depressingly high. Adding more special-purpose features to the ones already there doesn't move that needle in a desirable direction. I'd be more excited about this discussion if I didn't think that the chances of removing 'now'::timestamp are exactly zero. You can't just delete useful decades-old features, whether there's a better way or not. regards, tom lane
On Tue, 2 Jul 2024 at 02:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'd be more excited about this discussion if I didn't think that > the chances of removing 'now'::timestamp are exactly zero. You > can't just delete useful decades-old features, whether there's > a better way or not. Do you have any thoughts on rejecting trailing punctuation with the timestamp special values? For me, I've mixed feelings about it. I think it would be good to break things for people who are doing this and getting the wrong behaviour who haven't noticed yet, however, there could be a series of people doing this and have these embedded in statements that are parsed directly before execution, and they just happen to get the right behaviour. It might be better not to upset the latter set of people needlessly. Perhaps the former set of people don't exist since the behaviour is quite different and it seems quite obviously wrong. David
On Mon, Jul 1, 2024 at 02:52:42PM +1200, David Rowley wrote: > On Mon, 1 Jul 2024 at 13:41, David G. Johnston > <david.g.johnston@gmail.com> wrote: > > I presume the relatively new atomic SQL functions pose a similar hazard. > > Do you have an example of this? > > > The fact that 'now()'::timestamp fails to fail doesn't help... > > If that's the case, maybe a tiny step towards what Peter proposed is > just to make trailing punctuation fail for timestamp special values in > v18. I dug into this and I have a suggestion at the end. First, the special values like 'now' are the only values that can be optionally quoted: SELECT current_timestamp::timestamptz; current_timestamp ------------------------------- 2024-07-05 15:15:22.692072-04 SELECT 'current_timestamp'::timestamptz; ERROR: invalid input syntax for type timestamp with time zone: "current_timestamp" Also interestingly, "now" without quotes requires parentheses to make it a function call: SELECT 'now'::timestamptz; timestamptz ------------------------------- 2024-07-05 15:17:11.394182-04 SELECT 'now()'::timestamptz; timestamptz ------------------------------- 2024-07-05 15:17:15.201621-04 SELECT now()::timestamptz; now ------------------------------- 2024-07-05 15:17:21.925611-04 SELECT now::timestamptz; ERROR: column "now" does not exist LINE 1: SELECT now::timestamptz; ^ And the quoting shows "now" evaluation at function creation time: CREATE OR REPLACE FUNCTION testnow() RETURNS timestamptz LANGUAGE SQL RETURN 'now'::timestamptz; SELECT testnow(); SELECT pg_sleep(5); SELECT testnow(); testnow ------------------------------- 2024-07-05 15:19:38.915255-04 testnow ------------------------------- 2024-07-05 15:19:38.915255-04 -- same --------------------------------------------------------------------------- CREATE OR REPLACE FUNCTION testnow() RETURNS timestamptz LANGUAGE SQL RETURN 'now()'::timestamptz; SELECT testnow(); SELECT pg_sleep(5); SELECT testnow(); testnow ------------------------------- 2024-07-05 15:20:41.475997-04 testnow ------------------------------- 2024-07-05 15:20:41.475997-04 -- same --------------------------------------------------------------------------- CREATE OR REPLACE FUNCTION testnow() RETURNS timestamptz LANGUAGE SQL RETURN now()::timestamptz; SELECT testnow(); SELECT pg_sleep(5); SELECT testnow(); testnow ------------------------------- 2024-07-05 15:21:18.204574-04 testnow ------------------------------- 2024-07-05 15:21:23.210442-04 -- different I don't think we can bounce people around to different sections to explain this --- I think we need text in the CREATE TABLE ... DEFAULT section. I think the now() case is unusual since there are few cases where function calls can be put inside of single quotes. I have written the attached patch to clarify the behavior. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Attachment
Bruce Momjian <bruce@momjian.us> writes: > Also interestingly, "now" without quotes requires parentheses to make it > a function call: I'm not sure why you find that surprising, or why you think that 'now()'::timestamptz is a function call. (Well, it is a call of timestamptz_in, but not of the SQL function now().) Documentation that is equally confused won't help any. regards, tom lane
On Fri, Jul 5, 2024 at 04:50:32PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Also interestingly, "now" without quotes requires parentheses to make it > > a function call: > > I'm not sure why you find that surprising, or why you think that > 'now()'::timestamptz is a function call. (Well, it is a call of > timestamptz_in, but not of the SQL function now().) Documentation > that is equally confused won't help any. Well, 'now()' certainly _looks_ like a function call, though it isn't. The fact that 'now()'::timestamptz and 'now'::timestamptz generate volatile results via a function call was my point. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Bruce Momjian <bruce@momjian.us> writes: > Well, 'now()' certainly _looks_ like a function call, though it isn't. > The fact that 'now()'::timestamptz and 'now'::timestamptz generate > volatile results via a function call was my point. The only reason 'now()'::timestamptz works is that timestamptz_in ignores irrelevant punctuation (or what it thinks is irrelevant, anyway). I do not think we should include examples that look like that, because it will further confuse readers who don't already have a solid grasp of how this works. regards, tom lane
On Fri, Jul 5, 2024 at 1:55 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Jul 5, 2024 at 04:50:32PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Also interestingly, "now" without quotes requires parentheses to make it
> > a function call:
>
> I'm not sure why you find that surprising, or why you think that
> 'now()'::timestamptz is a function call.
I suspect mostly because SQL has a habit of adding functions that don't require parentheses and it isn't obvious that "now" is not one of them.
select current_timestamp;
current_timestamp
-------------------------------
2024-07-05 13:55:12.521334-07
(1 row)
current_timestamp
-------------------------------
2024-07-05 13:55:12.521334-07
(1 row)
(Well, it is a call of
> timestamptz_in, but not of the SQL function now().) Documentation
> that is equally confused won't help any.
Well, 'now()' certainly _looks_ like a function call, though it isn't.
The fact that 'now()'::timestamptz and 'now'::timestamptz generate
volatile results via a function call was my point.
They generate volatile results during typed value construction. That such things are implemented via functions are best left unreferenced here, reserving mention of function calls to those things users explicitly add to their query that are, and only are, function calls.
Whether we change going forward or not I'd be content to simply add a warning that writing 'now()' in a default expression is invalid syntax that fails-to-fails on backward compatibility grounds. If you want the function don't quote it, if you want the literal, remove the parentheses.
David J.
On Fri, Jul 5, 2024 at 05:03:35PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Well, 'now()' certainly _looks_ like a function call, though it isn't. > > The fact that 'now()'::timestamptz and 'now'::timestamptz generate > > volatile results via a function call was my point. > > The only reason 'now()'::timestamptz works is that timestamptz_in > ignores irrelevant punctuation (or what it thinks is irrelevant, > anyway). I do not think we should include examples that look like > that, because it will further confuse readers who don't already > have a solid grasp of how this works. Wow, I see that now: test=> SELECT 'now('::timestamptz; timestamptz ------------------------------- 2024-07-05 17:04:33.457915-04 If I remove the 'now()' mention in the docs, patch attached, I am concerned people will be confused whether it is the removal of the single quotes or the use of "()" which causes insert-time evaluation, and they might try 'now()'. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Attachment
On Fri, Jul 5, 2024 at 2:11 PM Bruce Momjian <bruce@momjian.us> wrote:
If I remove the 'now()' mention in the docs, patch attached, I am
concerned people will be confused whether it is the removal of the
single quotes or the use of "()" which causes insert-time evaluation,
and they might try 'now()'.
Literals are DDL-time because of parsing, functions are insert-time because of execution. IMO this is presently confusing because we are focused on characters, not concepts.
index c55fa607e8..ac661958fd 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2391,6 +2391,17 @@ TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
</para>
</caution>
+ <caution>
+ <para>
+ The input parser for timestamp values is forgiving: it ignores
+ trailing invalid characters. This poses a hazard in
+ the case of the <literal>'now'</literal> special date/time input.
+ The constant <literal>'now()'</literal> is the same special date/time input;
+ not the <function>now()</function> function, which like all function
+ call expressions, is not single-quoted. Writing <literal>'now()'</literal>
+ is considered deprecated and may become an error in future versions.
+ </para>
+ </caution>
+
</sect3>
</sect2>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index f19306e776..4cecab011a 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -889,9 +889,10 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</para>
<para>
- The default expression will be used in any insert operation that
- does not specify a value for the column. If there is no default
- for a column, then the default is null.
+ The default expression is immediately parsed, which causes evaluation of any literals, notably
+ <link linkend="datatype-datetime-special-table">special date/time inputs</link>.
+ Execution happens during insert for any row that does not specify a value for the column.
+ If there is no explicit default constraint for a column, the default is a null value.
</para>
</listitem>
+ </caution>
+
</sect3>
</sect2>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index f19306e776..4cecab011a 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -889,9 +889,10 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</para>
<para>
- The default expression will be used in any insert operation that
- does not specify a value for the column. If there is no default
- for a column, then the default is null.
+ The default expression is immediately parsed, which causes evaluation of any literals, notably
+ <link linkend="datatype-datetime-special-table">special date/time inputs</link>.
+ Execution happens during insert for any row that does not specify a value for the column.
+ If there is no explicit default constraint for a column, the default is a null value.
</para>
</listitem>
</varlistentry>
David J.
On Tue, 2 Jul 2024 at 13:48, David Rowley <dgrowleyml@gmail.com> wrote: > > On Tue, 2 Jul 2024 at 02:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'd be more excited about this discussion if I didn't think that > > the chances of removing 'now'::timestamp are exactly zero. You > > can't just delete useful decades-old features, whether there's > > a better way or not. > > Do you have any thoughts on rejecting trailing punctuation with the > timestamp special values? Cancel that idea. I'd thought that these special values must be standalone, but I didn't realise until a few minutes ago that it's perfectly valid to mix them: select 'yesterday 13:00:00'::timestamp, 'yesterday allballs'::timestamp; David
On Thu, Jun 27, 2024 at 1:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > Maybe I'm slow on the uptake, but I've yet to see anything here where > > time literals act in a special way DEFAULT constraints. This is why I > > couldn't understand why we should be adding documentation about this > > under CREATE TABLE. > > It's not that the parsing rules are any different: it's that in > ordinary DML queries, it seldom matters very much whether a > subexpression is evaluated at parse time versus run time. > In CREATE TABLE that difference is very in-your-face, so people > who haven't understood the rules clearly can get burnt. > > However, there are certainly other places where it matters, > such as queries in plpgsql functions. So I understand your > reluctance to go on about it in CREATE TABLE. At the same > time, I see where David J. is coming from. > > Maybe we could have a discussion of this in some single spot, > and link to it from CREATE TABLE and other relevant places? > ISTR there is something about it in the plpgsql doco already. > +1 to this idea.
On Wednesday, October 16, 2024, Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Jul 5, 2024 at 05:11:22PM -0400, Bruce Momjian wrote:
> On Fri, Jul 5, 2024 at 05:03:35PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > Well, 'now()' certainly _looks_ like a function call, though it isn't.
> > > The fact that 'now()'::timestamptz and 'now'::timestamptz generate
> > > volatile results via a function call was my point.
> >
> > The only reason 'now()'::timestamptz works is that timestamptz_in
> > ignores irrelevant punctuation (or what it thinks is irrelevant,
> > anyway). I do not think we should include examples that look like
> > that, because it will further confuse readers who don't already
> > have a solid grasp of how this works.
>
> Wow, I see that now:
>
> test=> SELECT 'now('::timestamptz;
> timestamptz
> -------------------------------
> 2024-07-05 17:04:33.457915-04
>
> If I remove the 'now()' mention in the docs, patch attached, I am
> concerned people will be confused whether it is the removal of the
> single quotes or the use of "()" which causes insert-time evaluation,
> and they might try 'now()'.
Does anyone like this patch? I changed now()::timestamptz to
now::timestamptz.
I do not, but maybe I’m being overly pedantic. All string literals are parsed during the create table command. It’s only the situations where that parsing is non-deterministic that cause an issue.
Is there anything wrong with the patch I proposed?
David J.
Bruce Momjian <bruce@momjian.us> writes: > Does anyone like this patch? I changed now()::timestamptz to > now::timestamptz. No, because you clearly didn't bother to test it: regression=# select now::timestamptz; ERROR: column "now" does not exist LINE 1: select now::timestamptz; ^ Also "a string that returns a volatile result once cast to a data type" is pretty meaningless to my eyes, not least because the real problem is that the construct is *not* volatile, but gets folded to a constant at CREATE TABLE time. The distinction we want to draw is that 'now'::timestamptz is a parse-time constant and so is not equivalent to the function call now() (which already produces timestamptz, so no need for a cast). This is already covered at the end of section 9.9.5 Current Date/Time, although I have no objection to repeating it in some form in the CREATE TABLE docs. regards, tom lane
On Wed, Oct 16, 2024 at 04:45:39PM -0700, David G. Johnston wrote: > On Wednesday, October 16, 2024, Bruce Momjian <bruce@momjian.us> wrote: > I do not, but maybe I’m being overly pedantic. All string literals are parsed > during the create table command. It’s only the situations where that parsing > is non-deterministic that cause an issue. > > Is there anything wrong with the patch I proposed? I thought it was too verbose so the point was not clear. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?"
On 10/17/24 06:19, Bruce Momjian wrote: > On Fri, Jul 5, 2024 at 05:11:22PM -0400, Bruce Momjian wrote: >> On Fri, Jul 5, 2024 at 05:03:35PM -0400, Tom Lane wrote: >>> Bruce Momjian <bruce@momjian.us> writes: >>>> Well, 'now()' certainly _looks_ like a function call, though it isn't. >>>> The fact that 'now()'::timestamptz and 'now'::timestamptz generate >>>> volatile results via a function call was my point. >>> >>> The only reason 'now()'::timestamptz works is that timestamptz_in >>> ignores irrelevant punctuation (or what it thinks is irrelevant, >>> anyway). I do not think we should include examples that look like >>> that, because it will further confuse readers who don't already >>> have a solid grasp of how this works. >> >> Wow, I see that now: >> >> test=> SELECT 'now('::timestamptz; >> timestamptz >> ------------------------------- >> 2024-07-05 17:04:33.457915-04 >> >> If I remove the 'now()' mention in the docs, patch attached, I am >> concerned people will be confused whether it is the removal of the >> single quotes or the use of "()" which causes insert-time evaluation, >> and they might try 'now()'. > > Does anyone like this patch? I changed now()::timestamptz to > now::timestamptz. Pardon the noise, but can you consider the idea of replacing the phrase 'data insertion time' with something like 'immediately before the insertion operation starts'? Sometimes people (especially younglings) ask which time it is precisely: will it differ for each tuple? -- regards, Andrei Lepikhov
Andrei Lepikhov <lepihov@gmail.com> writes: > Pardon the noise, but can you consider the idea of replacing the phrase > 'data insertion time' with something like 'immediately before the > insertion operation starts'? Sometimes people (especially younglings) > ask which time it is precisely: will it differ for each tuple? If we're discussing the meaning of "now", it's the transaction start timestamp, cf https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT I do not think other wordings will improve on that. regards, tom lane
On Fri, Jul 5, 2024 at 05:11:22PM -0400, Bruce Momjian wrote: > Wow, I see that now: > > test=> SELECT 'now('::timestamptz; > timestamptz > ------------------------------- > 2024-07-05 17:04:33.457915-04 > > If I remove the 'now()' mention in the docs, patch attached, I am > concerned people will be confused whether it is the removal of the > single quotes or the use of "()" which causes insert-time evaluation, > and they might try 'now()'. > diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml > index f19306e7760..4d47248fccf 100644 > --- a/doc/src/sgml/ref/create_table.sgml > +++ b/doc/src/sgml/ref/create_table.sgml > @@ -888,6 +888,13 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM > match the data type of the column. > </para> > > + <para> > + Note, a string that returns a volatile result once cast to a data > + type, like <literal>'now'::timestamptz</literal>, is evaluated at > + table creation time, while <literal>now()::timestamptz</literal> > + (without quotes) is evaluated at data insertion time. > + </para> > + > <para> > The default expression will be used in any insert operation that > does not specify a value for the column. If there is no default It seems we never came to an agreed-upon documentation addition to warn users about this. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?"