Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance
From: Karl Wright
Subject: Re: Performance query about large tables, lots of concurrent access
Msg-id: 467AA77D.3030204@metacarta.com
In response to: Re: Performance query about large tables, lots of concurrent access (Scott Marlowe <smarlowe@g2switchworks.com>)
List: pgsql-performance
Scott Marlowe wrote:
> Karl Wright wrote:
>> Scott Marlowe wrote:
>>> Karl Wright wrote:
>>>
>>>> Shaun Thomas wrote:
>>>>
>>>>> On Wednesday 20 June 2007 12:55:20 pm Karl Wright wrote:
>>>>>
>>>>>> I am afraid that I did answer this.  My largest tables
>>>>>> are the ones continually being updated.  The smaller
>>>>>> ones are updated only infrequently.
>>>>>
>>>>> You know, it actually sounds like you're getting whacked by the
>>>>> same problem that got us a while back.  It sounds like you weren't
>>>>> vacuuming frequently enough initially, and then tried vacuuming
>>>>> later, only after you noticed performance degrade.
>>>>>
>>>>> Unfortunately, what that means is that for several weeks or months,
>>>>> Postgres has not been reusing rows on your (admittedly) active and
>>>>> large tables; it just appends at the end, and lets old rows slowly
>>>>> bloat that table larger and larger.  Indexes, too, will suffer from
>>>>> dead pages.  As frightening/sickening as this sounds, you may need
>>>>> to dump/restore the really huge table, or vacuum-full to put it on
>>>>> a crash diet, and then maintain a strict daily or bi-daily vacuum
>>>>> schedule to keep it under control.
>>>>
>>>> A nice try, but I had just completed a VACUUM on this database three
>>>> hours prior to starting the VACUUM that I gave up on after 27
>>>> hours.  So I don't see how much more frequently I could do it.  (The
>>>> one I did earlier finished in six hours - but to accomplish that I
>>>> had to shut down EVERYTHING else that machine was doing.)
>>>
>>> So, have you ever run vacuum full or reindex on this database?
>>
>> No.  However, this database has only existed since last Thursday
>> afternoon.
>
> Well, a couple of dozen update statements with no where clause on large
> tables could bloat it right up.
>
> It's not about age so much as update / delete patterns.
>
>>> You are aware of the difference between how vacuum and vacuum full
>>> work, right?
>>>
>>> vacuum := mark deleted tuples as available, leave in table
>>> vacuum full := compact tables to remove deleted tuples
>>>
>>> While you should generally avoid vacuum full, if you've let your
>>> database get so bloated that the majority of space in your tables is
>>> now empty / deleted tuples, you likely need to vacuum full / reindex
>>> it.
>>
>> If the database is continually growing, should VACUUM FULL be necessary?
>
> If it's only growing, with no deletes or updates, then no.  Generally,
> on a properly vacuumed database, vacuum full should never be needed.
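A minimal sketch of the two commands being contrasted here ("mytable" is a placeholder, not a table from this thread):

    -- plain vacuum: marks dead row versions as free space for reuse;
    -- the table file itself does not shrink
    VACUUM mytable;

    -- vacuum full: compacts the table and returns space to the OS,
    -- but holds an exclusive lock for the duration
    VACUUM FULL mytable;

    -- badly bloated indexes are not shrunk by plain vacuum;
    -- rebuild them separately
    REINDEX TABLE mytable;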
>>> For instance, on my tiny little 31 Gigabyte reporting database, the
>>> main table takes up about 17 Gigs.  This query gives you some idea
>>> how many bytes each row is taking on average:
>>>
>>> select relname, relpages::float*8192 as size, reltuples,
>>> (relpages::double precision*8192)/reltuples::double precision as
>>> bytes_per_row from pg_class where relname = 'businessrequestsummary';
>>>
>>>          relname         |    size     |  reltuples  |  bytes_per_row
>>> ------------------------+-------------+-------------+-----------------
>>>  businessrequestsummary | 17560944640 | 5.49438e+07 | 319.61656229454
>>>
>>> Note that these numbers are updated by running analyze...
>>>
>>> What does it say about your DB?
>>
>> I wish I could tell you.  Like I said, I had to abandon this project
>> to test out an upgrade procedure involving pg_dump and pg_restore.
>> (The upgrade also seems to take a very long time - over 6 hours so
>> far.)  When it is back online I can provide further information.
>
> Well, let us know.  I would definitely recommend getting more / faster
> disks.  Right now I've got a simple 4 disk RAID10 on the way to replace
> the single SATA drive I'm running on right now.  I can't wait.

I checked the disk picture - this is a RAID disk array with 6 drives,
with a bit more than 1 TB of total storage on 15,000 RPM drives.  It
would be hard to get more/faster disk than that.

Karl
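Scott's bytes-per-row query can be generalized to spot which tables are carrying the most pages. A minimal sketch, using only the system catalogs (figures reflect the last analyze, as he notes, and assume the default 8 kB block size):

    select relname,
           relpages::bigint * 8192 as size_bytes,
           reltuples,
           (relpages::double precision * 8192)
             / nullif(reltuples::double precision, 0) as bytes_per_row
    from pg_class
    where relkind = 'r'      -- ordinary tables only
    order by relpages desc
    limit 10;

An unusually high bytes_per_row on a table whose rows should be small is one symptom of the dead-tuple bloat discussed in this thread.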