Re: New statistics for tuning WAL buffer size - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: New statistics for tuning WAL buffer size |
Date | |
Msg-id | ad49ba36-f0d1-5897-46e2-90873f1614e6@oss.nttdata.com Whole thread Raw |
In response to | Re: New statistics for tuning WAL buffer size (Masahiro Ikeda <ikedamsh@oss.nttdata.com>) |
Responses |
Re: New statistics for tuning WAL buffer size
|
List | pgsql-hackers |
On 2020/10/13 11:57, Masahiro Ikeda wrote: > On 2020-10-06 15:57, Masahiro Ikeda wrote: >> Hi, >> >> I think it's better to add other WAL statistics to the pg_stat_wal view. >> I'm thinking to add the following statistics. Please let me know your thoughts. >> >> 1. Basic wal statistics >> >> * wal_records: Total number of WAL records generated >> * wal_fpi: Total number of WAL full page images generated >> * wal_bytes: Total amount of WAL bytes generated +1 >> >> To understand DB's performance, first, we will check the performance >> trends for the entire database instance. >> For example, if the number of wal_fpi becomes higher, users may tune >> "wal_compression", "checkpoint_timeout" and so on. >> >> Although users can check the above statistics via EXPLAIN, >> auto_explain, autovacuum >> and pg_stat_statements now, if users want to see the performance >> trends for the entire database, >> they must preprocess the statistics. >> >> Is it useful to add the sum of the above statistics to the pg_stat_wal view? >> >> >> 2. Number of when new WAL file is created and zero-filled. >> >> As Fujii-san already commented, I think it's good for tuning. >> >>> Just idea; it may be worth exposing the number of when new WAL file is created and zero-filled. This initialization mayhave impact on the performance of write-heavy workload generating lots of WAL. If this number is reported high, to reducethe number of this initialization, we can tune WAL-related parameters so that more "recycled" WAL files can be hold. +1 But it might be better to track the number of when new WAL file is created whether it's zero-filled or not, if file creation and sync itself takes time. >> >> >> 3. Number of when to switch the WAL logfile segment. >> >> This is similar to 2, but this counts the number of when WAL file is >> recylcled too. >> I think it's useful for tuning "wal_segment_size" >> if the number is high relative to the startup time, "wal_segment_size" >> must be bigger. You're thinking to count all the WAL file switch? That number is equal to the number of WAL files generated since the last reset of pg_stat_wal? >> >> >> 4. Number of when WAL is flushed >> >> I think it's useful for tuning "synchronous_commit" and "commit_delay" >> for query executions. >> If the number of WAL is flushed is high, users can know >> "synchronous_commit" is useful for the workload. >> >> Also, it's useful for tuning "wal_writer_delay" and >> "wal_writer_flush_after" for wal writer. >> If the number is high, users can change the parameter for performance. >> >> I think it's better to separate this for backends and wal writer. +1 >> >> >> 5. Wait time when WAL is flushed. >> >> This is the accumulated time when wal is flushed. >> If the time becomes much higher, users can detect the possibility of >> disk failure. This should be tracked, e.g., only when track_io_timing is enabled? Otherwise, tracking that may cause performance overhead. >> >> Since users can see how much flash time occupies of the query execution time, >> it may lead to query tuning and so on. >> >> Since there is the above reason, I think it's better to separate this >> for backends and wal writer. I'm afraid that this counter for a backend may be a bit confusing. Because when the counter indicates small time, we may think that walwriter almost write WAL data and a backend doesn't take time to write WAL. But a backend may be just waiting for walwriter to write WAL. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: