Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Msg-id CAH2-WzkgDuY2Y_Wv=ikrnt-AWJjC6+bLG7ei4MDdkrfNXXAf3g@mail.gmail.com
In response to Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
List pgsql-hackers
On Wed, Jul 19, 2017 at 7:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
>> My argument for the importance of index bloat to the more general
>> bloat problem is simple: any bloat that accumulates, that cannot be
>> cleaned up, will probably accumulate until it impacts performance
>> quite noticeably.
>
> But that just begs the question: *does* it accumulate indefinitely, or
> does it eventually reach a more-or-less steady state?

Yes, I believe it does reach a more-or-less steady state. It saturates
when there is a lot of contention, because then you actually can reuse
the bloat. If it didn't saturate, and instead became arbitrarily bad,
then we'd surely have heard about that before now.

The bloat is not entirely wasted, because it actually prevents you
from getting even more bloat in that part of the keyspace.

> The traditional
> wisdom about btrees, for instance, is that no matter how full you pack
> them to start with, the steady state is going to involve something like
> 1/3rd free space.  You can call that bloat if you want, but it's not
> likely that you'll be able to reduce the number significantly without
> paying exorbitant costs.

For the purposes of this discussion, I'm mostly talking about
duplicates within a page on a unique index. If the keyspace owned by
an int4 unique index page only covers 20 distinct values, it will only
ever cover 20 distinct values, now and forever, despite the fact that
there is room for about 400 (a 90/10 split leaves you with 366 items +
1 high key).
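
The figures above can be roughly reproduced with some back-of-envelope
arithmetic. The sketch below is only an estimate, not how nbtree
actually picks a split point. Every constant is an assumption: a
default 8 kB block, a 64-bit build with 8-byte MAXALIGN, a 24-byte page
header, 16 bytes of btree special space, an 8-byte index tuple header,
a 4-byte line pointer per item, and the default leaf fillfactor of 90.
The function names are made up for illustration.

```python
# Rough capacity estimate for an nbtree leaf page holding int4 keys.
# All sizes below are assumptions (default build, 64-bit platform).

BLCKSZ = 8192             # assumed default block size, in bytes
PAGE_HEADER = 24          # assumed page header size
BTREE_SPECIAL = 16        # assumed btree special space at the end of the page
LINE_POINTER = 4          # assumed line pointer size, one per item
INDEX_TUPLE_HEADER = 8    # assumed index tuple header size
INT4_KEY = 4              # int4 key payload
MAXALIGN = 8              # assumed alignment on a 64-bit build
LEAF_FILLFACTOR = 90      # assumed default leaf fillfactor, in percent


def maxalign(n: int) -> int:
    """Round n up to the next multiple of MAXALIGN."""
    return (n + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def item_cost() -> int:
    """Bytes consumed per int4 index item: aligned tuple plus line pointer."""
    return maxalign(INDEX_TUPLE_HEADER + INT4_KEY) + LINE_POINTER


def usable_space() -> int:
    """Bytes left on the page for items after header and special space."""
    return BLCKSZ - PAGE_HEADER - maxalign(BTREE_SPECIAL)


def max_items() -> int:
    """Items that fit on a completely full page."""
    return usable_space() // item_cost()


def items_at_fillfactor() -> int:
    """Items that fit within the fillfactor share of the usable space."""
    return usable_space() * LEAF_FILLFACTOR // 100 // item_cost()


if __name__ == "__main__":
    print("bytes per item:      ", item_cost())            # 20
    print("items on a full page:", max_items())            # 407
    print("items at fillfactor: ", items_at_fillfactor())  # 366
    # A page whose keyspace covers only 20 distinct values needs 20 live
    # items at most, a small fraction of what the page could hold.
    print("share used by 20 keys: {:.1%}".format(20 / max_items()))
```

Under these assumptions a full page holds 407 items, which is the
"about 400", and the fillfactor-limited estimate lands on 366, which
lines up with the 90/10 split figure.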

I don't know if I should really even call this bloat, since the term
is so overloaded, although this is what other database systems call
index bloat. I like to think of it as "damage to the keyspace",
although that terminology seems unlikely to catch on.

-- 
Peter Geoghegan


