Thread: Parallel safety of CURRENT_* family
Hello How should I mark a function which calls CURRENT_DATE? Parallel safe or parallel restricted? pg_proc shows that now() is marked as restricted, but transaction_timestamp() is marked as safe. The manual (https://www.postgresql.org/docs/9.6/static/functions-datetime.html) says that "now() is a traditional PostgreSQLequivalent to transaction_timestamp()" and "transaction_timestamp() is equivalent to CURRENT_TIMESTAMP". The code seems to confirm: GetSQLCurrentDate(), GetSQLCurrentTimestamp() and now() all invoke the same GetCurrentTransactionStartTimestamp(). It looks like that functions which uses the CURRENT_* family of functions should be marked as restricted (something to addto the docs) and that transaction_timestamp() is incorrectly marked as safe. Am I wrong? Mark Kolar ---- Sent using Guerrillamail.com Block or report abuse: https://www.guerrillamail.com/abuse/?a=UUZnFAFNS7UWgwum414PfQfTStiVwcZciatQew%3D%3D
<5bih4k+4jfl6m39j23k@guerrillamail.com> writes: > How should I mark a function which calls CURRENT_DATE? Parallel safe or parallel restricted? > pg_proc shows that now() is marked as restricted, but transaction_timestamp() is marked as safe. That's certainly silly, because they're equivalent. I should think they're both safe. Robert? regards, tom lane
I wrote: > <5bih4k+4jfl6m39j23k@guerrillamail.com> writes: >> pg_proc shows that now() is marked as restricted, but transaction_timestamp() is marked as safe. > That's certainly silly, because they're equivalent. I should think > they're both safe. Robert? ... well, they would be if we passed down xactStartTimestamp to parallel workers, but I can't find any code that does that. In view of the fact that transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. regards, tom lane
On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> <5bih4k+4jfl6m39j23k@guerrillamail.com> writes: >>> pg_proc shows that now() is marked as restricted, but transaction_timestamp() is marked as safe. > >> That's certainly silly, because they're equivalent. I should think >> they're both safe. Robert? > > ... well, they would be if we passed down xactStartTimestamp to parallel > workers, but I can't find any code that does that. In view of the fact that > transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. Yeah. Do you think we should arrange to pass that down, or change the marking? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> ... well, they would be if we passed down xactStartTimestamp to parallel >> workers, but I can't find any code that does that. In view of the fact that >> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > Yeah. Do you think we should arrange to pass that down, or change the marking? We can't fix the marking in existing 9.6 installations, so I think we have to pass it down. (Which would be a better response anyway.) Having said that, I find myself unable to reproduce a problem. This should fail: regression=# set parallel_setup_cost TO 0; SET regression=# set parallel_tuple_cost TO 0; SET regression=# set min_parallel_relation_size TO 0; SET regression=# set enable_indexscan TO 0; SET regression=# explain verbose select distinct transaction_timestamp() from tenk1; QUERYPLAN --------------------------------------------------------------------------------------Unique (cost=0.00..424.67 rows=1 width=8) Output: (transaction_timestamp()) -> Gather (cost=0.00..424.67 rows=10000 width=8) Output: (transaction_timestamp()) Workers Planned: 2 -> Parallel Seq Scan on public.tenk1 (cost=0.00..410.08 rows=4167width=8) Output: transaction_timestamp() (7 rows) but it doesn't: regression=# select distinct transaction_timestamp() from tenk1; transaction_timestamp -------------------------------2016-12-01 15:44:12.839417-05 (1 row) How is that happening? regards, tom lane
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> ... well, they would be if we passed down xactStartTimestamp to parallel >>> workers, but I can't find any code that does that. In view of the fact that >>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > >> Yeah. Do you think we should arrange to pass that down, or change the marking? > > We can't fix the marking in existing 9.6 installations, so I think we > have to pass it down. (Which would be a better response anyway.) > > Having said that, I find myself unable to reproduce a problem. > This should fail: > > regression=# set parallel_setup_cost TO 0; > SET > regression=# set parallel_tuple_cost TO 0; > SET > regression=# set min_parallel_relation_size TO 0; > SET > regression=# set enable_indexscan TO 0; > SET > regression=# explain verbose select distinct transaction_timestamp() from tenk1; > QUERY PLAN > -------------------------------------------------------------------------------------- > Unique (cost=0.00..424.67 rows=1 width=8) > Output: (transaction_timestamp()) > -> Gather (cost=0.00..424.67 rows=10000 width=8) > Output: (transaction_timestamp()) > Workers Planned: 2 > -> Parallel Seq Scan on public.tenk1 (cost=0.00..410.08 rows=4167 width=8) > Output: transaction_timestamp() > (7 rows) > > but it doesn't: > > regression=# select distinct transaction_timestamp() from tenk1; > transaction_timestamp > ------------------------------- > 2016-12-01 15:44:12.839417-05 > (1 row) > > How is that happening? Because the table is so small, the leader probably finishes running the whole plan before the workers finish starting up. You can see the problem like this, though: rhaas=# begin; BEGIN rhaas=# select transaction_timestamp(); transaction_timestamp -------------------------------2016-12-01 15:51:14.443116-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp -------------------------------2016-12-01 15:51:14.443116-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp -------------------------------2016-12-01 15:51:14.443116-05 (1 row) rhaas=# set force_parallel_mode = true; SET rhaas=# select transaction_timestamp(); transaction_timestamp -------------------------------2016-12-01 15:51:26.603302-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp -------------------------------2016-12-01 15:51:27.316032-05 (1 row) force_parallel_mode causes the whole plan to be run by the worker, without any participation by the leader. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> but it doesn't: >> >> regression=# select distinct transaction_timestamp() from tenk1; >> transaction_timestamp >> ------------------------------- >> 2016-12-01 15:44:12.839417-05 >> (1 row) >> >> How is that happening? > Because the table is so small, the leader probably finishes running > the whole plan before the workers finish starting up. Good try, but EXPLAIN ANALYZE says that the workers are processing some of the rows. Also, I see the same behavior with a much larger test table. > You can see the problem like this, though: Yeah, I didn't have any doubt that it was real. Still don't know why my test case isn't doing what I expected, though. regards, tom lane
I wrote: > Yeah, I didn't have any doubt that it was real. Still don't know > why my test case isn't doing what I expected, though. Doh: the planner knows that transaction_timestamp() is stable, so it concludes that the DISTINCT condition is vacuous. There is a "Unique" node in the plan, but it has zero columns to compare, so it thinks the tuple are all equivalent and emits only the first. I had noticed that there was no "Sort" node, but failed to realize that that implied the "Unique" node was degenerate. Maybe this is over-optimization, but I think we'd be very sad if the planner didn't do it; getting rid of useless sort columns is critical in a lot of situations. regards, tom lane
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> ... well, they would be if we passed down xactStartTimestamp to parallel >>> workers, but I can't find any code that does that. In view of the fact that >>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > >> Yeah. Do you think we should arrange to pass that down, or change the marking? > > We can't fix the marking in existing 9.6 installations, so I think we > have to pass it down. (Which would be a better response anyway.) I happened across this thread today and took a look at what it would take to fix this. I quickly ran up against the fact that SerializeTransactionState() and RestoreTransactionState() are not exactly brilliantly designed, relying on the notion that each individual value that we want to serialize will be no wider than a TransactionId, which won't be true for timestamps. Even apart from that, the whole design of those functions is pretty lame, and I'm pretty sure I wrote all of that code myself, so I have nobody to blame but me. Anyway, here's a proposed patch to refactor that code into something a little more reasonable. It doesn't fix the actual problem here, but I think it's a good first step. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers