Thread: commit timestamps and replication

commit timestamps and replication

From
Andreas Brandl
Date:
Hi,

we're discussing a system design and it boils down to a question
regarding commit timestamps (in the sense of [1],
track_commit_timestamp='on'):

We have a insert-only (append-only) table. Do commit timestamps on
this table constitute the same order in which records become visible
on a secondary (streaming replication)? Is there any reason why this
might not be the case?

To put this differently: If a client reads from a secondary and reads
the "latest" record by commit timestamp (commit time T). Is it safe to
assume that there won't ever be another record with a lower commit
timestamp < T, that only shows up on the secondary after that read?

I'm aware of concerns regarding physical time, time adjustments and so
on, so the question here assumes those things never happen.

Thank you!

Best regards,
Andreas

[1] https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#Commit_timestamp_tracking


Re: commit timestamps and replication

From
Andres Freund
Date:
Hi,

On 2018-09-14 21:01:54 +0200, Andreas Brandl wrote:
> we're discussing a system design and it boils down to a question
> regarding commit timestamps (in the sense of [1],
> track_commit_timestamp='on'):
> 
> We have a insert-only (append-only) table. Do commit timestamps on
> this table constitute the same order in which records become visible
> on a secondary (streaming replication)? Is there any reason why this
> might not be the case?
> 
> To put this differently: If a client reads from a secondary and reads
> the "latest" record by commit timestamp (commit time T). Is it safe to
> assume that there won't ever be another record with a lower commit
> timestamp < T, that only shows up on the secondary after that read?

No, that's not safe to assume.  The order of visibility on the standby
is determined by the point the WAL record is inserted into the log.  The
commit timestamp has to *previously* be determined to be part of the WAL
log:

        SetCurrentTransactionStopTimestamp();

        XactLogCommitRecord(xactStopTimestamp,
                            nchildren, children, nrels, rels,
                            nmsgs, invalMessages,
                            RelcacheInitFileInval, forceSyncCommit,
                            MyXactFlags,
                            InvalidTransactionId, NULL /* plain commit */ );

Those two things happen sequentially, *WITHOUT* a lock preventing
concurrent transactions to do the same.


> I'm aware of concerns regarding physical time, time adjustments and so
> on, so the question here assumes those things never happen.

Note that in addition to that you cannot be certain that concurrent
transactions *in the same postgres cluster* have perfectly coherent
timestamp - clock drift between CPU sockets does still sometimes occur,
and was extremely common.

Greetings,

Andres Freund