Re: Request for further clarification on synchronous_commit - Mailing list pgsql-docs

From	Bruce Momjian
Subject	Re: Request for further clarification on synchronous_commit
Date	August 21, 2020 20:15:39
Msg-id	20200821201539.GA13363@momjian.us
In response to	Re: Request for further clarification on synchronous_commit (Kasper Kondzielski <kghost0@gmail.com>)
Responses	Re: Request for further clarification on synchronous_commit
List	pgsql-docs

On Wed, Aug 19, 2020 at 11:39:53AM +0200, Kasper Kondzielski wrote:
> > On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> > > Hi, thanks for the reply.
> > >
> > > To be honest I don't think it is better. Previously paragraph about
> > > remote_apply was after paragraph about `on` and before remote_write which
> > > followed natural order in terms of how strict these parameters are (i.e.
> > > how strong are the guarantees they provide). Because of that I think that
> > > remote_apply should return to its previous position.
> 
> > Uh, not really --- see below.
> 
> Ok, I see, thanks. Shouldn't we then stick to this order whenever possible
> (might be sometimes reversed).
> So, in the proposed patch I would suggest putting remote_apply first. (Of
> course, before that we can mention that the default option is `on`, but without
> going too much into the details.)

Well, it is kind of confusing.  I wanted to put remote_apply in its own
paragraph because it only applies to standbys, and because it is much
heavier and in a different scope (replay) than the others.  Frankly,
remote_apply is related to synchronicity only to the extent it allows
consistent/synchronous results from standbys; it is not related to
syncing data to the kernel or durable storage.

> 
> > Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> > fsync.  If you want to go in order of severity, with the most severe
> > first, it is:
> >
> >        remote_apply
> >        on
> >        remote_write
> >        local
> 
> Wouldn't the table be beneficial when it comes to highlighting these
> differences?

Uh, I don't think we list a table like this anywhere else for config
options.  I would be interested if others think it would be helpful.

> +-----------------------------+---------------------------------------------------------+
> |                             | synchronous_commit                                      |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | operation on standby server | remote_apply | on (remote_flush) | remote_write | local |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | write to WAL                | Yes          | Yes               | Yes          | No    |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | fsync                       | Yes          | Yes               | No           | No    |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | apply WAL data              | Yes          | No                | No           | No    |
> +-----------------------------+--------------+-------------------+--------------+-------+
> 
> 
> > and this defines the 'on' behavior:
> >
> >        /* Define the default setting for synchronous_commit */
> >        #define SYNCHRONOUS_COMMIT_ON   SYNCHRONOUS_COMMIT_REMOTE_FLUSH
> 
> Is there any valid reason to hide this behavior under `on` alias? In my opinion
> `remote_flush` does much better job with describing what it does. Maybe we
> could rename `on` to `remote_flush` but also create an alias `on=remote_flush`
> to keep backward compatibility? 

Well, I think we originally only had 'on', and later added the others.
Also, 'on' includes local flush.  We don't support local _write_ where we
only write it to the kernel.  We support fsync off, which I think is
the local behavior of remote_write.  I think remote_write is saying we
want local fsync but no fsync for remote.  Is that even correct?

This is certainly confusing.  Maybe we do need a chart, but we need to
list local and standby behavior.

> +         Finally, when set to <literal>remote_apply</literal>, commits
> +         will wait until replies from the current synchronous standby(s)
> +         indicate they have received the commit record of the transaction
> +         and applied it, so that it has become visible to queries on the
> +         standby(s), and also written to durable storage on the standbys.
> 
> "and also written to durable storage on the standbys." -> You mean flushed?
> Maybe it should be better to stick to cohesive terminology to not introduce any
> confusion.

Yes, I mean written to durable storage.  I don't think you can use
"flushed" alone since you could be flushing the WAL buffers to the file
system.

> > Well, there is a doc section that talks about WAL:
> >
> >        https://www.postgresql.org/docs/12/wal.html
> >
> > and other parts of the config docs that talk about WAL.
> 
> Yes, I know what is WAL for. I only don't get what kind of operation do you
> mean by 'WAL replay'. The only one thing which I can think of is the process of
> restoring database after a crash, when we apply changes from WAL to the data
> pages which haven't been flushed to the disk, but I don't think that this is
> that. Basically what I wonder is how can a WAL replay influence the transaction
> commit?

Well, WAL replay is how replication works.  Pretty much the same thing
that happens during crash recovery, but it happens continually.

Someone just wrote this blog entry, which I think helps explain what we
are talking about:

    https://www.percona.com/blog/2020/08/21/postgresql-synchronous_commit-options-and-synchronous-standby-replication/

How is this for a table?

                  -- local --   ------------------- standbys ------------------
                   durable          query      durable commit  durable commit
                   commit        consistency   after OS crash  after PG crash
    remote_apply      X               X               X              X
    on                X                               X              X
    remote_write      X                                              X
    local             X
    off
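
To make the table above easier to check, here is a minimal sketch that
transcribes it into a Python mapping.  The guarantee labels
(local_durable, standby_query_consistency, and so on) and the helper
weakest_level() are hypothetical names chosen for illustration; they are
not PostgreSQL identifiers, and the content only reflects the proposed
table, not verified server behavior.

```python
# Guarantees per synchronous_commit value, transcribed from the proposed table.
# The label strings are illustrative assumptions, not PostgreSQL names.
GUARANTEES = {
    "remote_apply": {
        "local_durable",
        "standby_query_consistency",
        "standby_durable_os_crash",
        "standby_durable_pg_crash",
    },
    "on": {
        "local_durable",
        "standby_durable_os_crash",
        "standby_durable_pg_crash",
    },
    "remote_write": {"local_durable", "standby_durable_pg_crash"},
    "local": {"local_durable"},
    "off": set(),
}

# Settings ordered from weakest to strongest, following the table rows.
LEVELS_WEAKEST_FIRST = ["off", "local", "remote_write", "on", "remote_apply"]


def weakest_level(required):
    """Return the weakest setting whose guarantees cover `required`."""
    required = set(required)
    for level in LEVELS_WEAKEST_FIRST:
        if required <= GUARANTEES[level]:
            return level
    raise ValueError(f"no setting provides: {sorted(required)}")
```

For example, weakest_level({"standby_durable_os_crash"}) picks 'on',
since remote_write has no X in that column.  Each row is a superset of
the row below it, which matches the severity ordering discussed earlier.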

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee
