Re: default_statistics_target - Mailing list pgsql-performance

From:           Carlo Stonebanks
Subject:        Re: default_statistics_target
Msg-id:         ho8qaa$2ars$1@news.hub.org
In response to: Re: default_statistics_target (Greg Smith <greg@2ndquadrant.com>)
List:           pgsql-performance
Hi Greg,

Thanks for the insight. How much more of a server's resources will be
consumed by an ANALYZE with default_statistics_target = 100?

We have two environments hosting the same data. One is our "live" server,
which serves the web site and hosts just our published data, no more than
200 - 300 tables.

PRODUCTION: The data warehouse, consisting of our published data as well as
the "input resources" which are transformed via ETL processes into that
published data. It is these "input resources" which currently consist of
about 8,000 tables, and growing. They don't really require analysis: they
are typically read linearly, and rarely more than once, when imported. They
are kept for auditing and rollbacks.

LIVE: Hosts just the published data, copied over from the production server.
Because the data does not get written to very often, older stats from
ANALYZE are likely to still be valid. Our concern is that with the old
setting of default_statistics_target = 10, ANALYZE has not gone deep enough
into these tables (numbering in the millions of rows) to really represent
the data distribution properly. (One option we are weighing is raising the
target only on the problem columns; see the sketch at the end of this
message.)

> Given that it looks like you're running 8.3 from past messages I've seen
> from you, I'd also be concerned that you've overrun your max_fsm_pages, so
> that VACUUM is growing increasingly ineffective for you, and that's
> contributing to your headache.
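Good point; I hadn't checked that. If I understand it correctly, on 8.3 a
database-wide VACUUM VERBOSE finishes by reporting how many free space map
page slots are actually needed versus the configured limits, so a run like
the following (off-hours, since it scans everything) should show whether we
have overrun max_fsm_pages:

-- A quick check, assuming PostgreSQL 8.3: the closing INFO lines of a
-- database-wide VACUUM VERBOSE summarize how many FSM page slots are
-- needed versus the configured limits.
VACUUM VERBOSE;

-- Confirm the limits currently in effect:
SHOW max_fsm_pages;
SHOW max_fsm_relations;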
Below are the config values of our production server (settings not listed
are still stubbed out). Sadly, in an attempt to improve the server's
performance, someone wiped out all of the changes I had made to date, along
with the comments indicating previous values, the reason for each change,
etc. What do they call that again? Oh, yeah. Documentation.

# CENTOS 5.4
# Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
# pgsql 8.3.10, 8 CPUs, 48GB RAM
# RAID 10, 4 Disks
autovacuum = on                          # Enable autovacuum subprocess? 'on'
autovacuum_analyze_scale_factor = 0.05   # fraction of table size before analyze
autovacuum_analyze_threshold = 1000
autovacuum_naptime = 1min                # time between autovacuum runs
autovacuum_vacuum_cost_delay = 50        # default vacuum cost delay for autovacuum
autovacuum_vacuum_scale_factor = 0.2     # fraction of table size before vacuum
autovacuum_vacuum_threshold = 1000
bgwriter_lru_maxpages = 100              # 0-1000 max buffers written/round
checkpoint_segments = 128                # in logfile segments, min 1, 16MB each
checkpoint_warning = 290s                # 0 is off
client_min_messages = debug1             # values in order of decreasing detail
datestyle = 'iso, mdy'
default_statistics_target = 250          # range 1-1000
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'                        # locale for system error messages
lc_monetary = 'C'                        # locale for monetary formatting
lc_numeric = 'C'                         # locale for number formatting
lc_time = 'C'                            # locale for time formatting
listen_addresses = '*'                   # what IP address(es) to listen on
log_destination = 'stderr'               # valid values are combinations of
log_error_verbosity = verbose            # terse, default, or verbose messages
log_line_prefix = '%t '                  # special values:
log_min_error_statement = debug1         # values in order of decreasing detail
log_min_messages = debug1                # values in order of decreasing detail
logging_collector = on                   # enable capturing of stderr and csvlog
maintenance_work_mem = 256MB
max_connections = 100                    # (change requires restart)
max_fsm_relations = 1000                 # min 100, ~70 bytes each
max_locks_per_transaction = 128          # min 10
port = 5432                              # (change requires restart)
shared_buffers = 4096MB
shared_preload_libraries = '$libdir/plugins/plugin_debugger.so'  # (change requires restart)
track_counts = on
vacuum_cost_delay = 5                    # 0-1000 milliseconds
wal_buffers = 4MB
wal_sync_method = open_sync
work_mem = 64MB

Carlo

"Greg Smith" <greg@2ndquadrant.com> wrote in message
news:4B9E33AF.2020608@2ndquadrant.com...
> Carlo Stonebanks wrote:
>> The whole topic of messing with stats makes my head spin but I am
>> concerned about some horridly performing queries that have had bad row
>> estimates and others which always choose seq scans when indexes are
>> available. Reading up on how to improve planner estimates, I have seen
>> references to default_statistics_target being changed from the default
>> of 10 to 100.
>>
>> Our DB is large, with thousands of tables
>
> Stop right there for a second. Are you sure autovacuum is working well
> here? With thousands of tables, it wouldn't surprise me to discover your
> planner estimates are wrong because there hasn't been a recent enough
> ANALYZE on the relevant tables. If you haven't already, take a look at
> pg_stat_user_tables and make sure that tables that have the bad estimates
> have actually been analyzed recently. A look at the live/dead row counts
> there should be helpful as well.
>
> If all that's recent, but you're still getting bad estimates, only then
> would I suggest trying an increase to default_statistics_target. In the
> situation where autovacuum isn't keeping up with some tables because you
> have thousands of them, increasing the stats target can actually make the
> problem worse, because the tables that are getting analyzed will take
> longer to process--more statistics work to be done per table.
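That's the first place I'll look. For reference, something along these lines
against pg_stat_user_tables (column names as in the 8.3 statistics views)
should list the tables that haven't been analyzed recently, along with their
live/dead row counts, worst-first:

-- A minimal sketch against the standard statistics collector view:
-- when was each user table last (auto)analyzed, and how stale is it?
SELECT relname,
       last_analyze,
       last_autoanalyze,
       n_live_tup,
       n_dead_tup
FROM   pg_stat_user_tables
ORDER  BY GREATEST(COALESCE(last_analyze, 'epoch'),
                   COALESCE(last_autoanalyze, 'epoch'));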
>
> --
> Greg Smith    2ndQuadrant US    Baltimore, MD
> PostgreSQL Training, Services and Support
> greg@2ndQuadrant.com   www.2ndQuadrant.us
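P.S. Regarding the per-column option I mentioned above: rather than raising
default_statistics_target across 8,000+ tables, we could raise it only where
the estimates are bad. A minimal sketch, assuming a hypothetical published
table "facility" with a badly estimated "state" column (both names made up
for illustration):

-- Raise the sample depth for just the problem column, then re-analyze
-- that one table; the rest of the database keeps the cheap default.
ALTER TABLE facility ALTER COLUMN state SET STATISTICS 100;
ANALYZE VERBOSE facility;

-- Inspect the resulting stats in pg_stats to confirm the distribution
-- is now better represented:
SELECT attname, n_distinct, most_common_vals
FROM   pg_stats
WHERE  tablename = 'facility';

This would keep the extra ANALYZE cost confined to the handful of LIVE
tables with millions of rows, instead of every table autovacuum touches.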