Home > mailing lists

Re: Need to run CLUSTER to keep performance - Mailing list pgsql-performance

From	Heikki Linnakangas
Subject	Re: Need to run CLUSTER to keep performance
Date	November 8, 2007 07:12:24
Msg-id	4732EF07.3060807@enterprisedb.com Whole thread Raw
In response to	Need to run CLUSTER to keep performance (Rafael Martinez <r.m.guerrero@usit.uio.no>)
Responses	Re: Need to run CLUSTER to keep performance
List	pgsql-performance

Tree view

Rafael Martinez wrote:
> This is a question about something we have seen sometimes in the last
> months. It happens with tables with a large amount of updates/selects
> compared with the amount of inserts/deletes. The sizes of these tables
> are small and the amount of rows too.
>
> The 'problem' is that performance decrease during the day and the only
> thing that helps is to run CLUSTER on the table with problems. VACUUM
> ANALYZE does not help.
>
> Some information that can help to find out why this happens:
>
> - PostgreSQL version: 8.1.9
>
> ------------------------------------------------------------------------------
> scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
>
>  pg_size_pretty
> ----------------
>  12 MB
> ------------------------------------------------------------------------------
> scanorama=# SELECT count(*) FROM hosts ;
>
>  count
> -------
>  16402
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2771.56 rows=66756 width=314) (actual
> time=0.008..2013.415 rows=16402 loops=1)
>  Total runtime: 2048.486 ms
> ------------------------------------------------------------------------------
> scanorama=# VACUUM ANALYZE ;
> VACUUM
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2718.57 rows=61357 width=314) (actual
> time=0.008..1676.283 rows=16402 loops=1)
>  Total runtime: 1700.826 ms
> ------------------------------------------------------------------------------
> scanorama=# CLUSTER hosts_pkey ON hosts ;
> CLUSTER
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..680.02 rows=16402 width=314) (actual
> time=0.008..31.205 rows=16402 loops=1)
>  Total runtime: 53.635 ms
> ------------------------------------------------------------------------------
> scanorama=# SELECT * from pg_stat_all_tables WHERE relname LIKE 'hosts';
>  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
> idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>
--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
>  105805 | public     | hosts   |  1996430 |  32360280252 |  2736391 |
>    3301856 |       948 |   1403325 |       737
>
> The information from pg_stat_all_tables is from the last 20 days.
> ------------------------------------------------------------------------------
> INFO:  analyzing "public.hosts"
> INFO:  "hosts": scanned 2536 of 2536 pages, containing 16410 live rows
> and 57042 dead rows; 16410 rows in sample, 16410 estimated total rows
> INFO:  free space map contains 191299 pages in 786 relations
> DETAIL:  A total of 174560 page slots are in use (including overhead).
> 174560 page slots are required to track all free space.
> Current limits are:  2000000 page slots, 4000 relations, using 12131 KB.
> ------------------------------------------------------------------------------
>
> The tables with this 'problem' are not big, so CLUSTER finnish very fast
> and it does not have an impact in the access because of locking. But we
> wonder why this happens.

2 seconds for seq scanning 12 MB worth of data sounds like a lot. Have
you increased shared_buffers from the default? Which operating system
are you using? Shared memory access is known to be slower on Windows.

On a small table like that you could run VACUUM every few minutes
without much impact on performance. That should keep the table size in
check.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-performance by date:

From: Rafael Martinez
Date: 08 November 2007, 06:36:49
Subject: Need to run CLUSTER to keep performance

From: Tomáš Vondra
Date: 08 November 2007, 07:20:33
Subject: Re: Need to run CLUSTER to keep performance

Re: Need to run CLUSTER to keep performance - Mailing list pgsql-performance

Previous

Next