Home > mailing lists

Re: Displaying accumulated autovacuum cost - Mailing list pgsql-hackers

From	Greg Smith
Subject	Re: Displaying accumulated autovacuum cost
Date	February 22, 2012 00:55:25
Msg-id	4F44752A.3090305@2ndQuadrant.com Whole thread Raw
In response to	Re: Displaying accumulated autovacuum cost (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Displaying accumulated autovacuum cost
List	pgsql-hackers

Tree view

I just took this for spin.  Everything I tried worked, docs built and 
read fine.  The description of how "dirty" differs from "written" is a 
bit cryptic, but I don't see an easy way to do better without a whole 
new section on that topic.  Once the extension upgrade questions are 
sorted out, I'd say this is ready to commit.  Example I have at the 
bottom here shows a case where this is a big improvement over the 
existing tracking.  I think this is a must-have improvement if we're 
going to advocate using pg_stat_statements for more things.

This works as expected in all of the EXPLAIN forms, I tried all of the 
supported formats.  Sample of the text one:

$ psql -d pgbench -c "EXPLAIN (ANALYZE,BUFFERS,FORMAT text) UPDATE 
pgbench_accounts SET aid=aid+0 WHERE aid<1000"
QUERY PLAN
---------- Update on pgbench_accounts  (cost=0.00..86.09 rows=860 width=103) 
(actual time=8.587..8.587 rows=0 loops=1)   Buffers: shared hit=8315 read=70 dirtied=16   ->  Index Scan using
pgbench_accounts_pkeyon pgbench_accounts 
 
(cost=0.00..86.09 rows=860 width=103) (actual time=0.017..2.086 rows=999 loops=1)         Index Cond: (aid < 1000)
  Buffers: shared hit=1828 read=28 Total runtime: 8.654 ms
 

Also ran just the UPDATE statement alone, then retrieved the counts from 
pg_stat_statements:

$ psql -x -c "select * from pg_stat_statements"
-[ RECORD 1 
]-------+-------------------------------------------------------------------------------------------
userid              | 10
dbid                | 16385
query               | UPDATE pgbench_accounts SET aid=aid+0 WHERE aid<1000
calls               | 1
total_time          | 0.007475
rows                | 999
shared_blks_hit     | 8370
shared_blks_read    | 15
shared_blks_dirtied | 15
shared_blks_written | 0
...

Note that there are no blocks shown as written there.  That is also 
demonstrated by the results after some pgbench "-M prepared" stress 
testing against a small database.  The pgbench tables are structured 
such that the number of branches < tellers << accounts.  On a small 
scale database (I used 10 here), there might only be a single page of 
branch data.  That shows up clearly in the different amount of dirtied 
blocks in each update:

$ psql -x -c "select 
query,shared_blks_hit,shared_blks_read,shared_blks_dirtied,shared_blks_written 
from pg_stat_statements order by calls desc limit 7"

...
query               | UPDATE pgbench_branches SET bbalance = bbalance + 
$1 WHERE bid = $2;
shared_blks_hit     | 32929
shared_blks_read    | 0
shared_blks_dirtied | 1
shared_blks_written | 0

query               | UPDATE pgbench_tellers SET tbalance = tbalance + 
$1 WHERE tid = $2;
shared_blks_hit     | 19074
shared_blks_read    | 0
shared_blks_dirtied | 7
shared_blks_written | 0

query               | UPDATE pgbench_accounts SET abalance = abalance + 
$1 WHERE aid = $2;
shared_blks_hit     | 35563
shared_blks_read    | 9982
shared_blks_dirtied | 4945
shared_blks_written | 2812

Note how in the branches and tellers case, the existing "written" 
counter shows 0.  Those hot pages stay in cache the whole time with a 
high usage count, backends never get to write them out; only the 
checkpointer does.  Only this new "dirtied" one reflects a useful write 
count for frequently used pages like that, and it does show that more 
pages are being touched by pgbench_tellers than pgbench_branches.

I'd never ran into this before because I normally test against larger 
databases.  But once I tried to find an example of this form, it was 
easy to do so.  Systems where much of the database fits into 
shared_buffers in particular are likely to see a deceptively small write 
count.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

pgsql-hackers by date:

From: Greg Smith
Date: 22 February 2012, 00:28:42
Subject: Re: Displaying accumulated autovacuum cost

From: Noah Misch
Date: 22 February 2012, 01:28:14
Subject: Re: Measuring relation free space

Re: Displaying accumulated autovacuum cost - Mailing list pgsql-hackers

Previous

Next