Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system |
Date | |
Msg-id | 1457917203.10232.2.camel@2ndquadrant.com Whole thread Raw |
In response to | Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system (Noah Misch <noah@leadboat.com>) |
Responses |
Re: Re: PATCH: Split stats file per database WAS:
autovacuum stress-testing our system
|
List | pgsql-hackers |
On Sun, 2016-03-13 at 18:46 -0400, Noah Misch wrote: > On Thu, Mar 03, 2016 at 06:08:07AM +0100, Tomas Vondra wrote: > > > > On 02/03/2016 06:46 AM, Noah Misch wrote: > > > > > > On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote: > > > > > > > > On 12/22/2015 03:49 PM, Noah Misch wrote: > > > > > > > > > > On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera > > > > > wrote: > > > > > > > > > > > > I have pushed it now. Further testing, of course, is > > > > > > always welcome. > > > > > While investigating stats.sql buildfarm failures, mostly on > > > > > animals > > > > > axolotl and shearwater, I found that this patch (commit > > > > > 187492b) > > > > > inadvertently removed the collector's ability to coalesce > > > > > inquiries. > > > > > Every PGSTAT_MTYPE_INQUIRY received now causes one stats file > > > > > write. > > > > > Before, pgstat_recv_inquiry() did: > > > > > > > > > > if (msg->inquiry_time > last_statrequest) > > > > > last_statrequest = msg->inquiry_time; > > > > > > > > > > and pgstat_write_statsfile() did: > > > > > > > > > > globalStats.stats_timestamp = GetCurrentTimestamp(); > > > > > ... (work of writing a stats file) ... > > > > > last_statwrite = globalStats.stats_timestamp; > > > > > last_statrequest = last_statwrite; > > > > > > > > > > If the collector entered pgstat_write_statsfile() with more > > > > > inquiries > > > > > waiting in its socket receive buffer, it would ignore them as > > > > > being too > > > > > old once it finished the write and resumed message > > > > > processing. Commit > > > > > 187492b converted last_statrequest to a "last_statrequests" > > > > > list that we > > > > > wipe after each write. > > So I've been looking at this today, and I think the attached patch > > should do > > the trick. I can't really verify it, because I've been unable to > > reproduce the > > non-coalescing - I presume it requires much slower system (axolotl > > is RPi, not > > sure about shearwater). > > > > The patch simply checks DBEntry,stats_timestamp in > > pgstat_recv_inquiry() and > > ignores requests that are already resolved by the last write (maybe > > this > > should accept a file written up to PGSTAT_STAT_INTERVAL in the > > past). > > > > The required field is already in DBEntry (otherwise it'd be > > impossible to > > determine if backends need to send inquiries or not), so this is > > pretty > > trivial change. I can't think of a simpler patch. > > > > Can you try applying the patch on a machine where the problem is > > reproducible? I might have some RPi machines laying around (for > > other > > purposes). > I've not attempted to study the behavior on slow hardware. Instead, > my report > used stat-coalesce-v1.patch[1] to simulate slow writes. (That > diagnostic > patch no longer applies cleanly, so I'm attaching a rebased > version. I've > changed the patch name from "stat-coalesce" to "slow-stat-simulate" > to > more-clearly distinguish it from the "pgstat-coalesce" > patch.) Problems > remain after applying your patch; consider "VACUUM pg_am" behavior: > > 9.2 w/ stat-coalesce-v1.patch: > VACUUM returns in 3s, stats collector writes each file 1x over 3s > HEAD w/ slow-stat-simulate-v2.patch: > VACUUM returns in 3s, stats collector writes each file 5x over 15s > HEAD w/ slow-stat-simulate-v2.patch and your patch: > VACUUM returns in 10s, stats collector writes no files over 10s Oh damn, the timestamp comparison in pgstat_recv_inquiry should be in the opposite direction. After fixing that "VACUUM pg_am" completes in 3 seconds and writes each file just once. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: