Re: doc: explain pgstatindex fragmentation - Mailing list pgsql-hackers

From Frédéric Yhuel
Subject Re: doc: explain pgstatindex fragmentation
Date
Msg-id d15a1c65-4fb0-4140-a48d-c1bd013a0255@dalibo.com
Whole thread Raw
In response to Re: doc: explain pgstatindex fragmentation  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: doc: explain pgstatindex fragmentation
List pgsql-hackers

On 1/22/25 12:34, Bertrand Drouvot wrote:
> Hi,
> 
> On Tue, Nov 05, 2024 at 06:36:47PM +0100, Frédéric Yhuel wrote:
>> Hi, I thought it would be nice to give the user a better idea of what
>> avg_leaf_density and leaf_fragmentation mean.
>>
>> Patch attached. What do you think?
> 
> Yeah, I think that can not hurt to give more details, thanks for the proposal!
> 

Hi Bertrand, thanks for your review!

> A few comments:
> 
> === 1
> 
> +     <literal>avg_leaf_density</literal> can be seen as the inverse of bloat,
> 
> I'm not sure it's good to describe something as the inverse of "something
> else". See my proposal below.
> 

Yeah... bloat is a more familiar concept, so I wanted to link these two 
metrics... but "inverse" is confusing... or maybe something like that:

A small <literal>avg_leaf_density</literal> means that the index is bloated.

> === 2
> 
> I’m not sure we need to add the extra details in a paragraph below the fields
> description. What about changing the fields description?
> 
> Something concise enough like?
> 
> avg_leaf_density: shows how full leaf pages currently are (100 if full)

That should do :-)

> leaf_fragmentation: shows how much physical and logical ordering of leaf pages
> differ (zero if they don't)
> 

It looks good to me.

I've noticed that maximum leaf_fragmentation can have a huge impact on a 
range index-only scan, when reading all blocs from disks, even on my 
laptop machine with SSD, but I don't know if this is the right place to 
document this?

I used the following psql scripts to test the effect of 
leaf_fragmentation (the first one calls the second one):

https://github.com/dalibo/misc/blob/main/fyhuel/leaf_fragmentation.sql
https://github.com/dalibo/misc/blob/main/fyhuel/evict_from_both_caches.sql

> Also the comments made in [1], [2] and [3] are not linked to this main thread,
> adding them for reference here (but better to keep the conversation going
> by replying to this email).
> 
> [1]: https://www.postgresql.org/message-id/4c5dee3a-8381-4e0f-b882-d1bd950e8972%40dalibo.com
> [2]: https://www.postgresql.org/message-id/c70fcc72-eed6-475b-81c8-508422299351%40dalibo.com
> [3]: https://www.postgresql.org/message-id/e8a6db36-073e-4ca3-b38c-b42d7094cba8%40dalibo.com
> 

Indeed, I think Benoît mistakenly thought that thread aggregation was 
based on thread titles alone. He appended the second conversation to the 
commitfest entry.



pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Re: [RFC] Lock-free XLog Reservation from WAL
Next
From: Sutou Kouhei
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations