Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
Date | |
Msg-id | CAAKRu_bTjvHSd5tWWQkW8SWY-5frVp_vOJpFD+-V7jJG1Rf_Dw@mail.gmail.com Whole thread Raw |
In response to | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
|
List | pgsql-hackers |
Attached is v46. On Wed, Dec 28, 2022 at 6:56 PM Andres Freund <andres@anarazel.de> wrote: > On 2022-10-06 13:42:09 -0400, Melanie Plageman wrote: > > > Additionally, some minor notes: > > > > > > - Since the stats are counting blocks, it would make sense to prefix the view columns with "blks_", and word them inthe past tense (to match current style), i.e. "blks_written", "blks_read", "blks_extended", "blks_fsynced" (realisticallyone would combine this new view with other data e.g. from pg_stat_database or pg_stat_statements, which alluse the "blks_" prefix, and stop using pg_stat_bgwriter for this which does not use such a prefix) > > > > I have changed the column names to be in the past tense. > > For a while I was convinced by the consistency argument (after Melanie > pointing it out to me). But the more I look, the less convinced I am. The > existing IO related stats in pg_stat_database, pg_stat_bgwriter aren't past > tense, just the ones in pg_stat_statements. pg_stat_database uses past tense > for tup_*, but not xact_*, deadlocks, checksum_failures etc. > > And even pg_stat_statements isn't consistent about it - otherwise it'd be > 'planned' instead of 'plans', 'called' instead of 'calls' etc. > > I started to look at the naming "tense" issue again, after I got "confused" > about "extended", because that somehow makes me think about more detailed > stats or such, rather than files getting extended. > > ISTM that 'evictions', 'extends', 'fsyncs', 'reads', 'reuses', 'writes' are > clearer than the past tense versions, and about as consistent with existing > columns. I have updated the column names to the above recommendation. On Wed, Jan 11, 2023 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote: > > For some reason cfbot is not able to apply this patch as in [1], > please have a look and post an updated patch if required: > === Applying patches on top of PostgreSQL commit ID > 3c6fc58209f24b959ee18f5d19ef96403d08f15c === > === applying patch > ./v45-0001-pgindent-and-some-manual-cleanup-in-pgstat-relat.patch > patching file src/backend/storage/buffer/bufmgr.c > patching file src/backend/storage/buffer/localbuf.c > patching file src/backend/utils/activity/pgstat.c > patching file src/backend/utils/activity/pgstat_relation.c > patching file src/backend/utils/adt/pgstatfuncs.c > patching file src/include/pgstat.h > patching file src/include/utils/pgstat_internal.h > === applying patch ./v45-0002-pgstat-Infrastructure-to-track-IO-operations.patch > gpatch: **** Only garbage was found in the patch input. > > [1] - http://cfbot.cputube.org/patch_41_3272.log > This was an issue with cfbot that Thomas has now fixed as he describes in [1]. On Wed, Jan 11, 2023 at 4:58 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > Subject: [PATCH v45 4/5] Add system view tracking IO ops per backend type > > The patch can/will fail with: > > CREATE TABLESPACE test_io_shared_stats_tblspc LOCATION ''; > +WARNING: tablespaces created by regression test cases should have names starting with "regress_" > > CREATE TABLESPACE test_stats LOCATION ''; > +WARNING: tablespaces created by regression test cases should have names starting with "regress_" > > (I already sent patches to address the omission in cirrus.yml) Thanks. I've fixed this I make a tablespace in amcheck -- are there recommendations for naming tablespaces in contrib also? > > 1760 : errhint("Target must be \"archiver\", \"io\", \"bgwriter\", \"recovery_prefetch\",or \"wal\"."))); > => Do you want to put these in order? Thanks. Fixed. > pgstat_get_io_op_name() isn't currently being hit by tests; actually, > it's completely unused. Deleted it. > FlushRelationBuffers() isn't being hit for local buffers. I added a test. > > + <entry><structname>pg_stat_io</structname><indexterm><primary>pg_stat_io</primary></indexterm></entry> > > + <entry> > > + One row per backend type, context, target object combination showing > > + cluster-wide I/O statistics. > > I suggest: "One row for each combination of of .." I have made this change. > > + The <structname>pg_stat_io</structname> and > > + <structname>pg_statio_</structname> set of views are especially useful for > > + determining the effectiveness of the buffer cache. When the number of actual > > + disk reads is much smaller than the number of buffer hits, then the cache is > > + satisfying most read requests without invoking a kernel call. > > I would change this say "Postgres' own buffer cache is satisfying ..." So, this is existing copy to which I added the pg_stat_io view name and re-flowed the indentation. However, I think your suggestions are a good idea, so I've taken them and just rewritten this paragraph altogether. > > > However, these > > + statistics do not give the entire story: due to the way in which > > + <productname>PostgreSQL</productname> handles disk I/O, data that is not in > > + the <productname>PostgreSQL</productname> buffer cache might still reside in > > + the kernel's I/O cache, and might therefore still be fetched without > > I suggest to refer to "the kernel's page cache" same applies here. > > > + The <structname>pg_stat_io</structname> view will contain one row for each > > + backend type, I/O context, and target I/O object combination showing > > + cluster-wide I/O statistics. Combinations which do not make sense are > > + omitted. > > "..for each combination of .." I have changed this. > > > + <varname>io_context</varname> for a type of I/O operation. For > > "for I/O operations" So I actually mean for a type of I/O operation -- that is, relation data is normally written to a shared buffer but sometimes we bypass shared buffers and just call write and sometimes we use a buffer access strategy and write it to a special ring buffer (made up of buffers stolen from shared buffers, but still). So I don't want to say "for I/O operations" because I think that would imply that writes of relation data will always be in the same IO Context. > > > + <literal>vacuum</literal>: I/O operations done outside of shared > > + buffers incurred while vacuuming and analyzing permanent relations. > > s/incurred/performed/ I changed this. > > > + <literal>bulkread</literal>: Qualifying large read I/O operations > > + done outside of shared buffers, for example, a sequential scan of a > > + large table. > > I don't think it's correct to say that it's "outside of" shared-buffers. I suppose "outside of" gives the wrong idea. But I need to make clear that this I/O is to and from buffers which are not a part of shared buffers right now -- they may still be accessible from the same data structures which access shared buffers but they are currently being used in a different way. > s/Qualifying/Certain/ I feel like qualifying is more specific than certain, but I would be open to changing it if there was a specific reason you don't like it. > > > + <literal>bulkwrite</literal>: Qualifying large write I/O operations > > + done outside of shared buffers, such as <command>COPY</command>. > > Same > > > + Target object of an I/O operation. Possible values are: > > + <itemizedlist> > > + <listitem> > > + <para> > > + <literal>relation</literal>: This includes permanent relations. > > It says "includes permanent" but what seems to mean is that it > "exclusive of temporary relations". I've changed this. > > > + <row> > > + <entry role="catalog_table_entry"> > > + <para role="column_definition"> > > + <structfield>read</structfield> <type>bigint</type> > > + </para> > > + <para> > > + Number of read operations in units of <varname>op_bytes</varname>. > > This looks too much like it means "bytes". > Should say: "in number of blocks of size >op_bytes<" > > But wait - is it the number of read operations "in units of op_bytes" > (which would means this already multiplied by op_bytes, and is in units > of bytes). > > Or the "number of read operations" *of* op_bytes chunks ? Which would > mean this is a "pure" number, and could be multipled by op_bytes to > obtain a size in bytes. It is the number of read operations of op_bytes size -- thanks so much for pointing this out. The wording was really unclear. The idea is that you can do something like: SELECT pg_size_pretty(reads * op_bytes) FROM pg_stat_io; and get it in bytes. The view will contain other types of IO that are not in BLCKSZ chunks, which is where this column will be handy. > > > + Number of write operations in units of <varname>op_bytes</varname>. > > > + Number of relation extend operations in units of > > + <varname>op_bytes</varname>. > > same > > > + In <varname>io_context</varname> <literal>normal</literal>, this counts > > + the number of times a block was evicted from a buffer and replaced with > > + another block. In <varname>io_context</varname>s > > + <literal>bulkwrite</literal>, <literal>bulkread</literal>, and > > + <literal>vacuum</literal>, this counts the number of times a block was > > + evicted from shared buffers in order to add the shared buffer to a > > + separate size-limited ring buffer. > > This never defines what "evicted" means. Does it mea that a dirty > buffer was written out ? Thanks. I've updated this. > > > + The number of times an existing buffer in a size-limited ring buffer > > + outside of shared buffers was reused as part of an I/O operation in the > > + <literal>bulkread</literal>, <literal>bulkwrite</literal>, or > > + <literal>vacuum</literal> <varname>io_context</varname>s. > > Maybe say "as part of a bulk I/O operation (bulkread, bulkwrite, or > vacuum)." I've changed this. > > > + <para> > > + <structname>pg_stat_io</structname> can be used to inform database tuning. > > > + For example: > > + <itemizedlist> > > + <listitem> > > + <para> > > + A high <varname>evicted</varname> count can indicate that shared buffers > > + should be increased. > > + </para> > > + </listitem> > > + <listitem> > > + <para> > > + Client backends rely on the checkpointer to ensure data is persisted to > > + permanent storage. Large numbers of <varname>files_synced</varname> by > > + <literal>client backend</literal>s could indicate a misconfiguration of > > + shared buffers or of checkpointer. More information on checkpointer > > of *the* checkpointer > > > + Normally, client backends should be able to rely on auxiliary processes > > + like the checkpointer and background writer to write out dirty data as > > *the* bg writer > > > + much as possible. Large numbers of writes by client backends could > > + indicate a misconfiguration of shared buffers or of checkpointer. More > > *the* ckpointer I've made most of these changes. > Should this link to various docs for checkpointer/bgwriter ? I couldn't find docs related to tuning checkpointer outside of the WAL configuration docs. There is the docs page for the CHECKPOINT command -- but I don't think that is very relevant here. > Maybe the docs for ALTER/COPY/VACUUM/CREATE/etc should be updated to > refer to some central description of ring buffers. Maybe something > should be included to the appendix. I agree it would be nice to explain Buffer Access Strategies in the docs. - Melanie [1] https://www.postgresql.org/message-id/CA%2BhUKGLiY1e%2B1%3DpB7hXJOyGj1dJOfgde%2BHmiSnv3gDKayUFJMA%40mail.gmail.com
Attachment
pgsql-hackers by date: