Re: Request for further clarification on synchronous_commit - Mailing list pgsql-docs
From | Kasper Kondzielski |
---|---|
Subject | Re: Request for further clarification on synchronous_commit |
Date | |
Msg-id | CAFv2VPQRT=8d2Q3ipTXZTaOdEY+taR8gBE77kL6dkk8gE09Xnw@mail.gmail.com |
In response to | Re: Request for further clarification on synchronous_commit (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: Request for further clarification on synchronous_commit
|
List | pgsql-docs |
> On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> > Hi, thanks for the reply.
> >
> > To be honest I don't think it is better. Previously paragraph about
> > remote_apply was after paragraph about `on` and before remote_write which
> > followed natural order in terms of how strict these parameters are (i.e. how
> > strong are the guarantees they provide). Because of that I think that
> > remote_apply should return to its previous position.
> Uh, not really --- see below.
OK, I see, thanks. Shouldn't we then stick to this order wherever possible (it might sometimes be reversed)?
So, in the proposed patch I would suggest putting remote_apply first. (Of course, before that we can mention that the default option is `on`, but without going too much into the details.)
> Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync. If you want to go in order of severity, with the most severe
> first, it is:
>
> remote_apply
> on
> remote_write
> local
Wouldn't a table be helpful for highlighting these differences?
+-----------------------------+---------------------------------------------------------+
| | synchronous_commit |
+-----------------------------+--------------+-------------------+--------------+-------+
| operation on standby server | remote_apply | on (remote_flush) | remote_write | local |
+-----------------------------+--------------+-------------------+--------------+-------+
| write to WAL | Yes | Yes | Yes | No |
+-----------------------------+--------------+-------------------+--------------+-------+
| fsync | Yes | Yes | No | No |
+-----------------------------+--------------+-------------------+--------------+-------+
| apply WAL data | Yes | No | No | No |
+-----------------------------+--------------+-------------------+--------------+-------+
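The rows of the table can also be read as three threshold checks on the level ordering. Below is a minimal C sketch of that reading. The names `SketchLevel`, `waits_for_standby_*`, and `check_table_rows` are made up for illustration and are not PostgreSQL code; the only thing taken from the thread is the order of the levels.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative copy of the levels, weakest first (hypothetical names). */
typedef enum
{
    LEVEL_OFF,
    LEVEL_LOCAL,
    LEVEL_REMOTE_WRITE,
    LEVEL_ON,               /* behaves as remote flush */
    LEVEL_REMOTE_APPLY
} SketchLevel;

/* Each predicate answers: does commit wait for this step on the standby? */
bool waits_for_standby_write(SketchLevel l) { return l >= LEVEL_REMOTE_WRITE; }
bool waits_for_standby_fsync(SketchLevel l) { return l >= LEVEL_ON; }
bool waits_for_standby_apply(SketchLevel l) { return l >= LEVEL_REMOTE_APPLY; }

/* Spot-check a few cells of the table above. */
void check_table_rows(void)
{
    assert(waits_for_standby_fsync(LEVEL_ON) && !waits_for_standby_apply(LEVEL_ON));
    assert(waits_for_standby_write(LEVEL_REMOTE_WRITE) &&
           !waits_for_standby_fsync(LEVEL_REMOTE_WRITE));
    assert(!waits_for_standby_write(LEVEL_LOCAL));
}
```

Each stricter level waits for everything the weaker one waits for, which is why a single comparison per row is enough in this sketch.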
> and this defines the 'on' behavior:
>
> /* Define the default setting for synchronous_commit */
> #define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH
Is there any valid reason to hide this behavior under the `on` alias? In my opinion `remote_flush` does a much better job of describing what it does. Maybe we could rename `on` to `remote_flush` and keep `on` as an alias for `remote_flush` to preserve backward compatibility?
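To illustrate what I mean by an alias, here is a rough C sketch of a name lookup where two spellings map to the same level. Everything in it is hypothetical: `remote_flush` is not an accepted value today, and `SketchOption`, `sketch_options`, and `sketch_lookup` are names I invented for this example, not PostgreSQL's actual option-parsing code.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative levels, weakest first (hypothetical names). */
enum
{
    SK_OFF,
    SK_LOCAL,
    SK_REMOTE_WRITE,
    SK_REMOTE_FLUSH,
    SK_REMOTE_APPLY
};

typedef struct
{
    const char *name;   /* spelling accepted in the configuration */
    int         level;  /* level that spelling maps to */
} SketchOption;

static const SketchOption sketch_options[] = {
    {"off", SK_OFF},
    {"local", SK_LOCAL},
    {"remote_write", SK_REMOTE_WRITE},
    {"remote_flush", SK_REMOTE_FLUSH},  /* proposed descriptive spelling */
    {"on", SK_REMOTE_FLUSH},            /* existing spelling kept as an alias */
    {"remote_apply", SK_REMOTE_APPLY},
};

/* Returns the level for a spelling, or -1 if the spelling is unknown. */
int sketch_lookup(const char *name)
{
    size_t n = sizeof(sketch_options) / sizeof(sketch_options[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(sketch_options[i].name, name) == 0)
            return sketch_options[i].level;
    }
    return -1;
}
```

With such a table, existing configurations that say `on` would keep working unchanged, while the documentation could describe the level under the more descriptive name.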
+ Finally, when set to <literal>remote_apply</literal>, commits
+ will wait until replies from the current synchronous standby(s)
+ indicate they have received the commit record of the transaction
+ and applied it, so that it has become visible to queries on the
+ standby(s), and also written to durable storage on the standbys.
"and also written to durable storage on the standbys." -> Do you mean flushed? It would be better to stick to consistent terminology so as not to introduce any confusion.
> Well, there is a doc section that talks about WAL:
>
> https://www.postgresql.org/docs/12/wal.html
>
> and other parts of the config docs that talk about WAL.
Yes, I know what WAL is for. I just don't understand what kind of operation you mean by 'WAL replay'. The only thing I can think of is the process of restoring the database after a crash, when changes from the WAL are applied to data pages that haven't been flushed to disk, but I don't think that is what you mean here. Basically, what I wonder is: how can a WAL replay influence the transaction commit?
On Tue, 18 Aug 2020 at 19:17, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
> Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync. If you want to go in order of severity, with the most severe
> first, it is:
>
> remote_apply
> on
> remote_write
> local
>
> This is seen in the C enum ordering for synchronous_commit, but in
> reverse order:
>
> typedef enum
> {
> SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
> SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
> SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flush and remote
> * write */
> SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local and remote flush */
> SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote apply */
> } SyncCommitLevel;
Also, there is some logic to say that the postgresql.conf
synchronous_commit options list should be reordered from:
#synchronous_commit = on # synchronization level;
# off, local, remote_write, remote_apply, or on
to
#synchronous_commit = on # synchronization level;
# off, local, remote_write, on, or remote_apply
I think we should backpatch the doc changes, but maybe not the
postgresql.conf one --- I am not sure.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee