Re: poor VACUUM performance on large tables - Mailing list pgsql-performance
From: Jan Peterson
Subject: Re: poor VACUUM performance on large tables
Date:
Msg-id: 72e966b005090617543ad88811@mail.gmail.com
In response to: Re: poor VACUUM performance on large tables (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: poor VACUUM performance on large tables
List: pgsql-performance
Thomas F. O'Connell:
> Do you have your Free Space Map settings configured appropriately?

Our current FSM settings are:

    max_fsm_pages = 500000      # min max_fsm_relations*16, 6 bytes each
    max_fsm_relations = 1000    # min 100, ~50 bytes each

> You'll want to run a VACUUM VERBOSE and note the numbers at the end,
> which describe how many pages are used and how many are needed.
> max_fsm_pages should be set according to that, and you can set
> max_fsm_relations based on it, too, although typically one knows
> roughly how many relations are in a database.

Here are the last two lines from a VACUUM VERBOSE FULL we did when the database was totally full:

    INFO:  free space map: 82 relations, 532349 pages stored; 632352 total pages needed
    DETAIL:  Allocated FSM size: 1000 relations + 500000 pages = 2995 kB shared memory.
    VACUUM

Based on this, it looks like we could stand to bump up our FSM by another couple hundred thousand pages. Does it buy us anything to reduce the number of FSM relations from the default of 1000?

> have you experimented with pg_autovacuum

We're not using pg_autovacuum. We have our own mechanism that works basically the same as pg_autovacuum, but split into two separate threads, one for large tables and one for small tables. We consider tables to be "large" if their size exceeds 100MB. Tables are selected for vacuuming if they've changed "enough" (I can get you actual metrics for what is "enough", but I don't know off the top of my head). Our main reason for splitting out small vs. large tables was that the large tables take a long time to VACUUM, and we didn't want our small tables to go a long time between VACUUMs. Of course, due to the transactional nature of VACUUM, we aren't really gaining much here anyway (this was one of the things we were hoping to address with Hannu's patch, but there are issues with his patch on 8.0.2 that we haven't tracked down yet).

Tom Lane:
> Which PG version, exactly?

We're currently running 8.0.2.
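For anyone following along, the VACUUM VERBOSE output above maps directly onto the FSM settings. A minimal postgresql.conf sketch (the 800000 figure is just the reported 632352 "pages needed" rounded up with headroom for growth; it is an illustration, not a tested recommendation):

    # postgresql.conf -- hypothetical FSM sizing from the VACUUM VERBOSE numbers above.
    # 632352 total pages were needed but only 500000 were allocated, so free space
    # in the overflow was being forgotten and the tables bloated.
    max_fsm_pages = 800000      # > 632352 pages needed; ~6 bytes of shared memory each
    max_fsm_relations = 1000    # only 82 relations in use, so the default has ample room

Since max_fsm_relations costs only ~50 bytes per slot, lowering it from 1000 would save on the order of kilobytes; there is little to gain there.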
> Plain VACUUM (not FULL) certainly ought not take that long. (If you're
> using VACUUM FULL, the answer is going to be "don't do that".)

Heh, we're definitely not doing a VACUUM FULL. We're doing VACUUM ANALYZE {tablename} exclusively, except when we get close to the transaction ID wraparound threshold, when we do a VACUUM ANALYZE of the entire database.

> What maintenance_work_mem (or vacuum_mem in older releases) are
> you running it under?

It looks like we are using the defaults for work_mem (1024) and maintenance_work_mem (16384). We could certainly bump these up. Is there a good way to determine what settings would be reasonable? I'll note, however, that we had experimented with bumping these previously and not noticed any change in performance.

> Can you get VACUUM VERBOSE output from some of these cases
> so we can see which phase(s) are eating the time?

I'll get some, but it will take a few more days, as we have recently reset our test environment. I can get some sample runs of VACUUM VERBOSE on pg_largeobject in a few hours (it takes a few hours to run) and will post them when I have them.

> It'd also be interesting to watch the output of vmstat or local
> equivalent --- it might just be that your I/O capability is nearly
> saturated and VACUUM is pushing the system over the knee
> of the response curve. If so, the vacuum delay options of 8.0
> would be worth experimenting with.

We've been monitoring I/O rates with iostat, and we're generally running around 90% I/O usage after we kick into the rolling delete stage (before we reach that stage, we're running around 20%-50% I/O usage). We are definitely I/O bound, hence trying to find a way to make VACUUM process less data. Our system (the database is on an appliance) is a dual-CPU box, and we're burning about 25% of our CPU time in I/O waits (again, after our rolling delete kicks in). A higher-performance I/O subsystem is something we could try.
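For reference, the 8.0 vacuum delay options Tom mentions are ordinary postgresql.conf settings (they can also be set per-session with SET). A minimal sketch; the values here are assumptions for illustration, not tuned recommendations:

    # postgresql.conf -- cost-based vacuum delay (new in 8.0); values illustrative
    vacuum_cost_delay = 10        # ms to sleep each time the cost limit is reached
                                  # (0, the default, disables throttling)
    vacuum_cost_limit = 200       # accumulated page-cost budget before each sleep
    maintenance_work_mem = 131072 # KB; 128MB, up from the 16384 (16MB) default,
                                  # lets VACUUM remember more dead tuples per index pass

The trade-off is the one raised below: throttling reduces the I/O spike but stretches out the vacuum, and the transaction stays open for the whole run.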
Our biggest concern with increasing the vacuum delay options is the length of time it currently takes to VACUUM our large tables (and pg_largeobject). Holding a transaction open for these long periods degrades performance in other places.

> > Statistics are also skewed during this
> > process and we have observed the planner choosing sequential scans on
> > tables where it is obvious that an index scan would be more efficient.
>
> That's really pretty hard to believe; VACUUM doesn't affect the
> statistics until the very end. Can you give some specifics of how
> the "statistics are skewed"?

I don't have any hard evidence for this, but we have noticed that at certain times a particular query we run will take an extremely long time (several hours). Re-running the query with EXPLAIN always shows it using an index scan, and it runs very quickly. We haven't been able to catch it with an EXPLAIN in the state where it takes a long time (it's not deterministic). Our assumption is that the planner is taking the wrong path, because we can't figure out any other reason why the query would take so long. We'll run some more experiments and try to reproduce this behavior. Is there anything specific that would help track this down (other than getting EXPLAIN output showing the bogus execution plan)?

Thanks for your help.

-jan-
--
Jan L. Peterson
<jan.l.peterson@gmail.com>
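One low-effort way to at least pin down *when* the bad plan is being chosen, without catching it live: log slow statements. This is a sketch using a setting that exists in 8.0; the threshold is an assumption based on the "several hours" runtimes described above:

    # postgresql.conf -- log any statement running longer than 10 minutes
    log_min_duration_statement = 600000   # milliseconds; -1 (default) disables

Correlating the logged timestamps with the vacuum/rolling-delete schedule would show whether the slow runs line up with VACUUM activity, even before an EXPLAIN of the bad plan is captured.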