Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically? - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically? |
Date | |
Msg-id | CAH2-Wzm20CUamNR7gUnXsGQvLmPtLqHxEj7WWXYeM=UceMw1cw@mail.gmail.com Whole thread Raw |
In response to | Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically? (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?
|
List | pgsql-hackers |
On Mon, Dec 6, 2021 at 2:37 PM Peter Geoghegan <pg@bowt.ie> wrote: > On Mon, Dec 6, 2021 at 12:07 PM Robert Haas <robertmhaas@gmail.com> wrote: > > So does this. If some of the table is now all-visible when it wasn't > > before, it's certainly a good guess that the portions that still > > aren't have about the same distribution of dead tuples that they did > > before ... although the other direction is less clear: it seems > > possible that newly not-all-visible pages have fewer dead tuples than > > ones which have been not-all-visible for a while. But you have to make > > some guess. > > To me, it seems natural to accept and even embrace the inherent > uncertainty about the number of dead tuples. > The number of dead tuples in the table is an inherently dynamic thing, > which makes it totally dissimilar to the pg_statistics-based stats. > And so a single snapshot of a point in time is inherently much less > useful -- we ought to keep a few sets of old statistics within our new > pgstat_report_analyze() -- maybe 3 or 5. I just realized that I didn't really get around to explicitly connecting this to your point about newly not-all-visible pages being quite different to older ones that ANALYZE has seen -- which is definitely an important consideration. I'll do so now: Keeping some history makes the algorithm "less gullible" (a more useful goal than making it "smarter", at least IMV). Suppose that our starting point is 2 pieces of authoritative information, which are current as of the instant we want to estimate the number of dead tuples for VACUUM: 1. total relation size (relpages), and 2. current not-all-visible-pages count (interesting/countable pages, calculated by taking the "complement" of visibilitymap_count() value). Further suppose we store the same 2 pieces of information in our ANALYZE stats, reporting using pgstat_report_analyze() -- the same 2 pieces of information are stored alongside the actual count of dead tuples and live tuples found on not-all-visible pages. The algorithm avoids believing silly things about dead tuples by considering the delta between each piece of information, particularly the difference between "right now" and "the last time ANALYZE ran and called pgstat_report_analyze()". For example, if item 1/relpages increased by exactly the same number of blocks as item 2/not-all-visible pages (or close enough to it), that is recognized as a pretty strong signal. The algorithm should consider the newly not-all-visible pages as likely to have very few dead tuples. At the same time, the algorithm should not change its beliefs about the concentration of dead tuples in remaining, older not-all-visible pages. This kind of thing will still have problems, no doubt. But I'd much rather err in the direction of over-counting dead tuples like this. The impact of the problem on the workload/autovacuum is a big part of the picture here. Suppose we believe that not-all-visible pages have 20 LP_DEAD items on average, and they turn out to only have 3 or 5. Theoretically we've done the wrong thing by launching autovacuum workers sooner -- we introduce bias. But we also have lower variance over time, which might make it worth it. I also think that it might not really matter at all. It's no great tragedy if we clean up and set pages all-visible in the visibility map a little earlier on average. It might even be a positive thing. The fact that the user expresses the dead-tuple-wise threshold using autovacuum_vacuum_scale_factor is already somewhat arbitrary -- it is based on some pretty iffy assumptions. Even if we greatly overestimate dead tuples with the new algorithm, we're only doing so under circumstances that might have caused autovacuum_vacuum_insert_scale_factor to launch an autovacuum worker anyway. Just setting the visibility map bit has considerable value. -- Peter Geoghegan
pgsql-hackers by date: