Thread: Repeatable Read Isolation Level "transaction start time"
https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ The PostgreSQL documentation for the Repeatable Read Isolation Level states the following: “UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searchingfor target rows: they will only find target rows that were committed as of the transaction start time.” What is defined as the "transaction start time?" When I first read the statement, I interpreted it as the start of the transaction: BEGIN; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; But in my testing, I find that according to that statement, the transaction start time is actually "the start of the firstnon-transaction-control statement in the transaction" (as mentioned earlier in the section). Is my conclusion correct,or am I misunderstanding the documentation?
On Monday, September 23, 2024, Wizard Brony <wizardbrony@gmail.com> wrote:
https://www.postgresql.org/docs/16/transaction-iso.html# XACT-REPEATABLE-READ
The PostgreSQL documentation for the Repeatable Read Isolation Level states the following:
“UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the transaction start time.”
What is defined as the "transaction start time?" When I first read the statement, I interpreted it as the start of the transaction:
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
But in my testing, I find that according to that statement, the transaction start time is actually "the start of the first non-transaction-control statement in the transaction" (as mentioned earlier in the section). Is my conclusion correct, or am I misunderstanding the documentation?
Probably, since indeed the transaction cannot start at begin because once it does start it cannot be modified.
David J.
On Tue, Sep 24, 2024 at 8:29 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Monday, September 23, 2024, Wizard Brony <wizardbrony@gmail.com> wrote:https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ
The PostgreSQL documentation for the Repeatable Read Isolation Level states the following:
“UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the transaction start time.”
What is defined as the "transaction start time?" When I first read the statement, I interpreted it as the start of the transaction:
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
But in my testing, I find that according to that statement, the transaction start time is actually "the start of the first non-transaction-control statement in the transaction" (as mentioned earlier in the section). Is my conclusion correct, or am I misunderstanding the documentation?Probably, since indeed the transaction cannot start at begin because once it does start it cannot be modified.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
Wizard Brony <wizardbrony@gmail.com> writes: > But in my testing, I find that according to that statement, the transaction start time is actually "the start of the firstnon-transaction-control statement in the transaction" (as mentioned earlier in the section). Is my conclusion correct,or am I misunderstanding the documentation? It's even looser than that, really: it's the first statement that requires an MVCC snapshot. From memory, LOCK TABLE is an important exception --- you can acquire table locks before pinning down a snapshot, and this is important in some scenarios. regards, tom lane
On 9/24/24 05:59, Ron Johnson wrote: > On Tue, Sep 24, 2024 at 8:29 AM David G. Johnston > <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote: > > On Monday, September 23, 2024, Wizard Brony <wizardbrony@gmail.com > <mailto:wizardbrony@gmail.com>> wrote: > > https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ <https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ> > > The PostgreSQL documentation for the Repeatable Read Isolation > Level states the following: > > “UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE > commands behave the same as SELECT in terms of searching for > target rows: they will only find target rows that were committed > as of the transaction start time.” > > What is defined as the "transaction start time?" When I first > read the statement, I interpreted it as the start of the > transaction: > > BEGIN; > SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; > > But in my testing, I find that according to that statement, the > transaction start time is actually "the start of the first > non-transaction-control statement in the transaction" (as > mentioned earlier in the section). Is my conclusion correct, or > am I misunderstanding the documentation? > > > Probably, since indeed the transaction cannot start at begin because > once it does start it cannot be modified. > > Huh? BEGIN; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; I read it as the transaction does not start at BEGIN because if it did you could not SET TRANSACTION to change it's characteristics. The docs go into more detail: https://www.postgresql.org/docs/current/sql-set-transaction.html The transaction isolation level cannot be changed after the first query or data-modification statement (SELECT, INSERT, DELETE, UPDATE, MERGE, FETCH, or COPY) of a transaction has been executed. So: begin ; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; SET SET TRANSACTION ISOLATION LEVEL READ COMMITTED; SET select * from csv_test ; [...] SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any query > > -- > Death to <Redacted>, and butter sauce. > Don't boil me, I'm still alive. > <Redacted> crustacean! -- Adrian Klaver adrian.klaver@aklaver.com
On Tue, Sep 24, 2024 at 12:06 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 9/24/24 05:59, Ron Johnson wrote:
> On Tue, Sep 24, 2024 at 8:29 AM David G. Johnston
> <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote:
>
> On Monday, September 23, 2024, Wizard Brony <wizardbrony@gmail.com
> <mailto:wizardbrony@gmail.com>> wrote:
>
> https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ <https://www.postgresql.org/docs/16/transaction-iso.html#XACT-REPEATABLE-READ>
>
> The PostgreSQL documentation for the Repeatable Read Isolation
> Level states the following:
>
> “UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE
> commands behave the same as SELECT in terms of searching for
> target rows: they will only find target rows that were committed
> as of the transaction start time.”
>
> What is defined as the "transaction start time?" When I first
> read the statement, I interpreted it as the start of the
> transaction:
>
> BEGIN;
> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
>
> But in my testing, I find that according to that statement, the
> transaction start time is actually "the start of the first
> non-transaction-control statement in the transaction" (as
> mentioned earlier in the section). Is my conclusion correct, or
> am I misunderstanding the documentation?
>
>
> Probably, since indeed the transaction cannot start at begin because
> once it does start it cannot be modified.
>
> Huh?
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
I read it as the transaction does not start at BEGIN because if it did
you could not SET TRANSACTION to change it's characteristics.
The docs go into more detail:
https://www.postgresql.org/docs/current/sql-set-transaction.html
The transaction isolation level cannot be changed after the first query
or data-modification statement (SELECT, INSERT, DELETE, UPDATE, MERGE,
FETCH, or COPY) of a transaction has been executed.
So:
begin ;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET
select * from csv_test ;
[...]
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any query
Makes sense. Never would have occurred to me to try and change the isolation level using a second SET TRANSACTION statement, though.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
On 9/24/24 09:12, Ron Johnson wrote: > Makes sense. Never would have occurred to me to try and change the > isolation level using a second SET TRANSACTION statement, though. From the docs: https://www.postgresql.org/docs/current/sql-set-transaction.html The SET TRANSACTION command sets the characteristics of the current transaction. It has no effect on any subsequent transactions. SET SESSION CHARACTERISTICS sets the default transaction characteristics for subsequent transactions of a session. These defaults can be overridden by SET TRANSACTION for an individual transaction. > > -- > Death to <Redacted>, and butter sauce. > Don't boil me, I'm still alive. > <Redacted> crustacean! -- Adrian Klaver adrian.klaver@aklaver.com
On Tue, Sep 24, 2024 at 10:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
It's even looser than that, really: it's the first statement that
requires an MVCC snapshot.
Hm....so why does "SELECT 1;" work as a transaction start marker then, as opposed to "SHOW work_mem;", which does not? Do we simply consider anything with a SELECT as needing a snapshot?
Cheers,
Greg
On 9/25/24 10:22 AM, Greg Sabino Mullane wrote: > On Tue, Sep 24, 2024 at 10:28 AM Tom Lane <tgl@sss.pgh.pa.us > <mailto:tgl@sss.pgh.pa.us>> wrote: > > It's even looser than that, really: it's the first statement that > requires an MVCC snapshot. > > > Hm....so why does "SELECT 1;" work as a transaction start marker then, > as opposed to "SHOW work_mem;", which does not? Do we simply consider > anything with a SELECT as needing a snapshot? SELECT some_func(); Where some_func() does something that requires a snapshot. > > Cheers, > Greg > -- Adrian Klaver adrian.klaver@aklaver.com
On Wed, Sep 25, 2024 at 1:45 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 9/25/24 10:22 AM, Greg Sabino Mullane wrote:
> On Tue, Sep 24, 2024 at 10:28 AM Tom Lane <tgl@sss.pgh.pa.us
> <mailto:tgl@sss.pgh.pa.us>> wrote:
>
> It's even looser than that, really: it's the first statement that
> requires an MVCC snapshot.
>
>
> Hm....so why does "SELECT 1;" work as a transaction start marker then,
> as opposed to "SHOW work_mem;", which does not? Do we simply consider
> anything with a SELECT as needing a snapshot?
SELECT some_func();
Where some_func() does something that requires a snapshot.
But why does "SELECT 1;" need a snapshot? Heck, why does "SELECT <immutable>;" need a snapshot?
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
Ron Johnson <ronljohnsonjr@gmail.com> writes: > But why does "SELECT 1;" need a snapshot? Heck, why does "SELECT > <immutable>;" need a snapshot? Because we're not going to analyze the statement in the amount of depth needed to make that distinction before we crank up the transactional machinery. If it says SELECT, it gets a snapshot. regards, tom lane
On Wed, Sep 25, 2024 at 1:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ron Johnson <ronljohnsonjr@gmail.com> writes:
> But why does "SELECT 1;" need a snapshot? Heck, why does "SELECT
> <immutable>;" need a snapshot?
Because we're not going to analyze the statement in the amount of
depth needed to make that distinction before we crank up the
transactional machinery. If it says SELECT, it gets a snapshot.
Perfectly reasonable.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
On Wed, Sep 25, 2024 at 1:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Because we're not going to analyze the statement in the amount of depth needed to make that distinction before we crank up the
transactional machinery. If it says SELECT, it gets a snapshot.
Ok, thanks. So to the original poster's point, perhaps the path with the least side effects / best Principle of Least Surprise (POLS) support is to start the transaction, and immediately call a "SELECT 1;" or perhaps better still, a 'SELECT timeofday();'
Cheers,
Greg
On Wed, Sep 25, 2024 at 4:23 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
On Wed, Sep 25, 2024 at 1:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:Because we're not going to analyze the statement in the amount of depth needed to make that distinction before we crank up the
transactional machinery. If it says SELECT, it gets a snapshot.Ok, thanks. So to the original poster's point, perhaps the path with the least side effects / best Principle of Least Surprise (POLS) support is to start the transaction, and immediately call a "SELECT 1;" or perhaps better still, a 'SELECT timeofday();'
Since transactions should be "as short as possible, without being too short", how much time is there between when you run "BEGIN;" and the first "work statement"?
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
Since transactions should be "as short as possible, without being too short", how much time is there between when you run "BEGIN;" and the first "work statement"?
I don't know that it really matters. For something automated, it would be a few milliseconds. Either way, I'm sure most people/apps already think of the initial 'BEGIN ...' as the start of the transaction, and act accordingly.
Maybe long-term something like
BEGIN ISOLATION MODE REPEATABLE READ SNAPSHOT NOW;
could be useful.
Cheers,
Greg
> On Sep 25, 2024, at 13:49, Greg Sabino Mullane <htamfids@gmail.com> wrote: > BEGIN ISOLATION MODE REPEATABLE READ SNAPSHOT NOW; This might well be a failure of imagination on my part, but when would it pragmatically matter that the snapshot is takenat the first statement as opposed to at BEGIN?
On Wed, Sep 25, 2024 at 4:50 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
Since transactions should be "as short as possible, without being too short", how much time is there between when you run "BEGIN;" and the first "work statement"?I don't know that it really matters. For something automated, it would be a few milliseconds.
That's what I'm thinking, too. It might cause a problem if you're typing transaction commands in between drinking coffee and poking around other PgAdmin tabs, but that's your fault, not PG's fault.
Either way, I'm sure most people/apps already think of the initial 'BEGIN ...' as the start of the transaction, and act accordingly.Maybe long-term something likeBEGIN ISOLATION MODE REPEATABLE READ SNAPSHOT NOW;
Without the "NOW", that's essentially the command used by the legacy rdbms which I used to work on.
I'm trying to remember, though, if "SET TRANSACTION READ WRITE RESERVING foo FOR <isolation level>;" (it's syntax for beginning a transaction) started the transaction, or waited until an "action" statement. Been too long.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
On Wed, Sep 25, 2024 at 4:54 PM Christophe Pettus <xof@thebuild.com> wrote:
> On Sep 25, 2024, at 13:49, Greg Sabino Mullane <htamfids@gmail.com> wrote:
> BEGIN ISOLATION MODE REPEATABLE READ SNAPSHOT NOW;
This might well be a failure of imagination on my part, but when would it pragmatically matter that the snapshot is taken at the first statement as opposed to at BEGIN?
I could imagine lots of cases where you know something is about to happen (say, a major delete), and you want to get a snapshot of the database as it existed just before that point. Many people will (quite understandably) assume that a BEGIN ISOLATION MODE <non read committed>; command would do just that, and be quite surprised to find that when they actually query the table in that first process, the rows are not there.
It's certainly a non-intuitive behavior. I understand why we do it this way, but perhaps this warrants a stronger warning in the docs at least? It's too late in the day for me to tackle that now, but I'll throw it out there.
Cheers,
Greg
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > On 2024-09-25 13:53:30 -0700, Christophe Pettus wrote: >> This might well be a failure of imagination on my part, but when would >> it pragmatically matter that the snapshot is taken at the first >> statement as opposed to at BEGIN? > It may make a difference if you're comparing timestamps. > For example, if you're using isolation level REPEATABLE READ and > (mistakenly) assume that the snapshot is taken at BEGIN, you would > expect any transaction_timestamp() written by a different transaction > and readable by this transaction to be earlier than the > transaction_timestamp() of this transaction. > But that's wrong because the other transaction could have happened > entirely in the time between your BEGIN and the statement which actually > triggers the snapshot. I don't find that hugely compelling, because there's always going to be some skew between the time we read the clock for the timestamp and the time we obtain the snapshot. Admittedly, that would normally not be a very long interval if BEGIN did both things ... but on a busy system you could lose the CPU for awhile in between. regards, tom lane
On 10/5/24 02:14, Peter J. Holzer wrote: > On 2024-09-25 18:09:44 -0400, Tom Lane wrote: >> "Peter J. Holzer" <hjp-pgsql@hjp.at> writes: >> Admittedly, that would normally not be a very long interval if BEGIN >> did both things ... but on a busy system you could lose the CPU for >> awhile in between. > > Assuming that the system does have a global clock of sufficiently > fine resolution which returns strictly monotonically increasing > timestamps[1], I think the following is true: > > Every snapshot divides the set of transactions into two non-overlapping > subsets: Those which have committed early enough that their effects are > visible in the snapshot and those which haven't. Let's call the first set > the "earlier" transactions and the second the "later" transactions. Let's > call the current transaction c and any transaction in the earlier set e > (we ignore the later transactions for now). > > Performing a commit and taking a snapshot take some time, but there > should be a time t_C(e) in each commit and t_S(c) in the snapshot, such > that t_C(e) < t_S(c) for each "earlier" transaction. Assuming t_C is time of commit and t_S is time of snapshot, is the above not the crux of the matter? Namely when in the current transaction the snapshot is actually taken. That would determine what constitutes an earlier visible transaction relative to the current transaction. In other words I am not seeing how this changes anything? > > Within each transaction each timestamp t which could be visible outside > of the transaction must have been obtained before the commit, > so t(e) < t_C(e) < t_S(c). > > If we choose the transaction_timestamp to be >= t_S, then > transaction_timestamp(e) < t_C(e) < t_S(c) <= transaction_timestamp(c) > and therefore > transaction_timestamp(e) < transaction_timestamp(c) > > Such a guarantee might be useful for some applications and it's not > (IMHO) an entirely unreasonable assumption, but it's not true for > PostgreSQL. So a programmer should be aware of that. > > hp -- Adrian Klaver adrian.klaver@aklaver.com
Greg Sabino Mullane <htamfids@gmail.com> writes: > All we can guarantee > via pg_stat_activity is that if xact_start and query_start *are* identical, > no snapshot has been granted yet, Surely that's not true either. xact_start = query_start implies that the current statement is the first in its transaction (assuming sufficiently fine-grained clock timestamps, something I'm not sure is an entirely safe assumption). But if that statement is not simply a BEGIN, it's likely obtained its own transaction snapshot after a few microseconds. As long as "read the system clock" is a distinct operation from "read a snapshot", there are going to be skew issues here. We could maybe eliminate that by reading the clock while holding the lock that prevents commits while reading a snapshot, but I doubt that anybody is going to accept that on performance grounds. Adding a not-guaranteed-cheap syscall inside that extremely hot code path seems unsatisfactory. Also, we currently do guarantee that xact_start matches query_start for the first statement of the transaction (the converse of what I said above). Removing that guarantee in order to add some other one wouldn't necessarily please everybody. regards, tom lane
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > Again, I'm not arguing for such a change, but I'm wondering if recording > transaction_timestamp just after the snapshot might be a safe change or > whether that might break some assumption that programmers can currently > make. As I mentioned upthread, we currently promise that xact_start matches the query_start of the transaction's first statement. (I'm not sure how well that's documented, but the code goes out of its way to make it so, so somebody thought it was important.) Another issue with redefining things like that is that presumably, for a session that has issued BEGIN but not yet taken a snapshot, xact_start would have to read as NULL, misleadingly implying that the session doesn't have an open transaction. Lastly, wouldn't postponing the collection of the timestamp like that break the very property you want to promise, by making other transactions' timestamps nontrivially later than they are now? I think if we wanted to do something here, it'd make more sense to keep xact_start as it stands and introduce a new variable snapshot_timestamp or something like that. Then maybe we could have some guarantees about what you get when comparing other sessions' xact_start to your own snapshot_timestamp. But I'm not convinced we can really guarantee anything without reading the snapshot_timestamp within the snapshot-collecting critical section, and I'm not for that. regards, tom lane
On 10/5/24 15:25, Peter J. Holzer wrote: > On 2024-10-05 17:03:08 -0400, Tom Lane wrote: >> "Peter J. Holzer" <hjp-pgsql@hjp.at> writes: >>> Again, I'm not arguing for such a change, but I'm wondering if recording >>> transaction_timestamp just after the snapshot might be a safe change or >>> whether that might break some assumption that programmers can currently >>> make. >> >> As I mentioned upthread, we currently promise that xact_start matches >> the query_start of the transaction's first statement. (I'm not sure >> how well that's documented, but the code goes out of its way to make >> it so, so somebody thought it was important.) > > It's mentioned in > https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT > > | statement_timestamp() and transaction_timestamp() return the same value during the first command of a transaction > > But that seems to be wrong in practice. The first statement of a > transaction is BEGIN. So whatever command calls statement_timestamp() is > already the second command, so statement_timestamp() is later than > transaction_timestamp(). This is even true if the BEGIN and SELECT are > on the same line: > > hjp=> begin; select transaction_timestamp(), statement_timestamp(), clock_timestamp(); rollback; > BEGIN > Time: 0.406 ms > ╔═[ RECORD 1 ]══════════╤═══════════════════════════════╗ > ║ transaction_timestamp │ 2024-10-05 23:55:47.356582+02 ║ > ║ statement_timestamp │ 2024-10-05 23:55:47.357106+02 ║ > ║ clock_timestamp │ 2024-10-05 23:55:47.357397+02 ║ > ╚═══════════════════════╧═══════════════════════════════╝ > > Time: 0.570 ms > ROLLBACK > Time: 0.285 ms > > The difference are only 0.5 ms, but it's not zero. > > I think the only case where transaction_timestamp() = statement_timestamp() > is outside of a transaction. > AFAIK that is still a transaction: https://www.postgresql.org/docs/current/sql-begin.html By default (without BEGIN), PostgreSQL executes transactions in “autocommit” mode, that is, each statement is executed in its own transaction and a commit is implicitly performed at the end of the statement (if execution was successful, otherwise a rollback is done). -- Adrian Klaver adrian.klaver@aklaver.com
On Sat, Oct 5, 2024 at 5:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
As I mentioned upthread, we currently promise that xact_start matches the query_start of the transaction's first statement. (I'm not sure
how well that's documented, but the code goes out of its way to make it so, so somebody thought it was important.)
I'm not convinced this is terribly useful in practice, but it is good to know.
I think if we wanted to do something here, it'd make more sense to keep xact_start as it stands and introduce a new variable
snapshot_timestamp or something like that.
I agree; I've been thinking about something like this, as it is too hard to try to shoehorn the information into the existing fields. Will throw this onto my "possible patch idea" pile.
Then maybe we could have some guarantees about what you get when comparing other sessions'
xact_start to your own snapshot_timestamp. But I'm not convinced we can really guarantee anything without reading the snapshot_timestamp within the snapshot-collecting critical section, and I'm not for that.
Fair enough, but even a not-guaranteed 100% accurate value might be better than the current situation, which is no (user) visibility at all about the (MVCC) visibility. Heck, even a boolean "snapshot acquired" would be an improvement (which becomes a subset of the info returned by a timestamp via null/notnull).
Cheers,
Greg