Re: New statistics for tuning WAL buffer size - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: New statistics for tuning WAL buffer size |
Date | |
Msg-id | cc7beda5-7b1f-46b7-ddd5-2dd63f88f70e@oss.nttdata.com Whole thread Raw |
In response to | Re: New statistics for tuning WAL buffer size (Masahiro Ikeda <ikedamsh@oss.nttdata.com>) |
Responses |
Re: New statistics for tuning WAL buffer size
|
List | pgsql-hackers |
On 2020/09/09 13:57, Masahiro Ikeda wrote: > On 2020-09-07 16:19, Fujii Masao wrote: >> On 2020/09/07 9:58, Masahiro Ikeda wrote: >>> Thanks for the review and advice! >>> >>> On 2020-09-03 16:05, Fujii Masao wrote: >>>> On 2020/09/02 18:56, Masahiro Ikeda wrote: >>>>>> +/* ---------- >>>>>> + * Backend types >>>>>> + * ---------- >>>>>> >>>>>> You seem to forget to add "*/" into the above comment. >>>>>> This issue could cause the following compiler warning. >>>>>> ../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment] >>>>> >>>>> Thanks for the comment. I fixed. >>>> >>>> Thanks for the fix! But why are those comments necessary? >>> >>> Sorry about that. This comment is not necessary. >>> I removed it. >>> >>>>> The pg_stat_walwriter is not security restricted now, so ordinary users can access it. >>>>> It has the same security level as pg_stat_archiver. If you have any comments, please let me know. >>>> >>>> + <structfield>dirty_writes</structfield> <type>bigint</type> >>>> >>>> I guess that the column name "dirty_writes" derived from >>>> the DTrace probe name. Isn't this name confusing? We should >>>> rename it to "wal_buffers_full" or something? >>> >>> I agree and rename it to "wal_buffers_full". >>> >>>> +/* ---------- >>>> + * PgStat_MsgWalWriter Sent by the walwriter to update statistics. >>>> >>>> This comment seems not accurate because backends also send it. >>>> >>>> +/* >>>> + * WAL writes statistics counter is updated in XLogWrite function >>>> + */ >>>> +extern PgStat_MsgWalWriter WalWriterStats; >>>> >>>> This comment seems not right because the counter is not updated in XLogWrite(). >>> >>> Right. I fixed it to "Sent by each backend and background workers to update WAL statistics." >>> In the future, other statistics will be included so I remove the function's name. >>> >>> >>>> +-- There will surely and maximum one record >>>> +select count(*) = 1 as ok from pg_stat_walwriter; >>>> >>>> What about changing this comment to "There must be only one record"? >>> >>> Thanks, I fixed. >>> >>>> + WalWriterStats.m_xlog_dirty_writes++; >>>> LWLockRelease(WALWriteLock); >>>> >>>> Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected >>>> with WALWriteLock, isn't it better to increment that after releasing the lock? >>> >>> Thanks, I fixed. >>> >>>> +CREATE VIEW pg_stat_walwriter AS >>>> + SELECT >>>> + pg_stat_get_xlog_dirty_writes() AS dirty_writes, >>>> + pg_stat_get_walwriter_stat_reset_time() AS stats_reset; >>>> + >>>> CREATE VIEW pg_stat_progress_vacuum AS >>>> >>>> In system_views.sql, the definition of pg_stat_walwriter should be >>>> placed just after that of pg_stat_bgwriter not pg_stat_progress_analyze. >>> >>> OK, I fixed it. >>> >>>> } >>>> - >>>> /* >>>> * We found an existing collector stats file. Read it and put all the >>>> >>>> You seem to accidentally have removed the empty line here. >>> >>> Sorry about that. I fixed it. >>> >>>> - errhint("Target must be \"archiver\" or \"bgwriter\"."))); >>>> + errhint("Target must be \"archiver\" or \"bgwriter\" or >>>> \"walwriter\"."))); >>>> >>>> There are two "or" in the message, but the former should be replaced with ","? >>> >>> Thanks, I fixed. >>> >>> >>> On 2020-09-05 18:40, Magnus Hagander wrote: >>>> On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao >>>> <masao.fujii@oss.nttdata.com> wrote: >>>> >>>>> On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote: >>>>>> From: Fujii Masao <masao.fujii@oss.nttdata.com> >>>>>>>> I changed the view name from pg_stat_walwrites to >>>>> pg_stat_walwriter. >>>>>>>> I think it is better to match naming scheme with other views >>>>> like >>>>>>> pg_stat_bgwriter, >>>>>>>> which is for bgwriter statistics but it has the statistics >>>>> related to backend. >>>>>>> >>>>>>> I prefer the view name pg_stat_walwriter for the consistency with >>>>>>> other view names. But we also have pg_stat_wal_receiver. Which >>>>>>> makes me think that maybe pg_stat_wal_writer is better for >>>>>>> the consistency. Thought? IMO either of them works for me. >>>>>>> I'd like to hear more opinons about this. >>>>>> >>>>>> I think pg_stat_bgwriter is now a misnomer, because it contains >>>>> the backends' activity. Likewise, pg_stat_walwriter leads to >>>>> misunderstanding because its information is not limited to WAL >>>>> writer. >>>>>> >>>>>> How about simply pg_stat_wal? In the future, we may want to >>>>> include WAL reads in this view, e.g. reading undo logs in zheap. >>>>> >>>>> Sounds reasonable. >>>> >>>> +1. >>>> >>>> pg_stat_bgwriter has had the "wrong name" for quite some time now -- >>>> it became even more apparent when the checkpointer was split out to >>>> it's own process, and that's not exactly a recent change. And it had >>>> allocs in it from day one... >>>> >>>> I think naming it for what the data in it is ("wal") rather than which >>>> process deals with it ("walwriter") is correct, unless the statistics >>>> can be known to only *ever* affect one type of process. (And then >>>> different processes can affect different columns in the view). As a >>>> general rule -- and that's from what I can tell exactly what's being >>>> proposed. >>> >>> Thanks for your comments. I agree with your opinions. >>> I changed the view name to "pg_stat_wal". >>> >>> >>> I fixed the code to send the WAL statistics from not only backend and walwriter >>> but also checkpointer, walsender and autovacuum worker. >> >> Good point! Thanks for updating the patch! >> >> >> @@ -604,6 +604,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params, >> onerel->rd_rel->relisshared, >> Max(new_live_tuples, 0), >> vacrelstats->new_dead_tuples); >> + pgstat_send_wal(); >> >> I guess that you changed heap_vacuum_rel() as above so that autovacuum >> workers can send WAL stats. But heap_vacuum_rel() can be called by >> the processes (e.g., backends) other than autovacuum workers? Also >> what happens if autovacuum workers just do ANALYZE only? In that case, >> heap_vacuum_rel() may not be called. >> >> Currently autovacuum worker reports the stats at the exit via >> pgstat_beshutdown_hook(). Unlike other processes, autovacuum worker >> is not the process that basically keeps running during the service. It exits >> after it does vacuum or analyze. So ISTM that it's not bad to report the stats >> only at the exit, in autovacuum worker case. There is no need to add extra >> code for WAL stats report by autovacuum worker. Thought? > > Thanks, I understood. I removed this code. > >> >> @@ -1430,6 +1430,9 @@ WalSndWaitForWal(XLogRecPtr loc) >> else >> RecentFlushPtr = GetXLogReplayRecPtr(NULL); >> + /* Send wal statistics */ >> + pgstat_send_wal(); >> >> AFAIR logical walsender uses three loops in WalSndLoop(), WalSndWriteData() >> and WalSndWaitForWal(). But could you tell me why added pgstat_send_wal() >> into WalSndWaitForWal()? I'd like to know why WalSndWaitForWal() is the best >> for that purpose. > > I checked what function calls XLogBackgroundFlush() which calls > AdvanceXLInsertBuffer() to increment m_wal_buffers_full. > > I found that WalSndWaitForWal() calls it, so I added it. Ok. But XLogBackgroundFlush() calls AdvanceXLInsertBuffer() wit the second argument opportunistic=true, so in this case WALwrite by wal_buffers full seems to never happen. Right? If this understanding is right, WalSndWaitForWal() doesn't needto call pgstat_send_wal(). Probably also walwriter doesn't need to do that. The logical rep walsender can generate WAL and call AdvanceXLInsertBuffer() when it executes the replication commands likeCREATE_REPLICATION_SLOT. But this case is already covered by pgstat_report_activity()->pgstat_send_wal() called in PostgresMain(),with your patch. So no more calls to pgstat_send_wal() seems necessary for logical rep walsender. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: