Re: strange table disk sizes - Mailing list pgsql-general
From: Rik Bellens
Subject: Re: strange table disk sizes
Msg-id: 4E5F7CDA.3050907@telin.ugent.be
In response to: strange table disk sizes (Rik Bellens <rik.bellens@telin.ugent.be>)
Responses: Re: strange table disk sizes; Re: strange table disk sizes
List: pgsql-general
On 01-09-11 14:22, Scott Marlowe wrote:
> On Thu, Sep 1, 2011 at 6:05 AM, Rik Bellens <rik.bellens@telin.ugent.be> wrote:
>> On 01-09-11 13:31, Scott Marlowe wrote:
>>> On Thu, Sep 1, 2011 at 4:32 AM, Rik Bellens <rik.bellens@telin.ugent.be> wrote:
>>>> Hello,
>>>>
>>>> I have two large tables in a database, one containing original data and
>>>> the other derived from the first. The first table has several columns
>>>> and indexes, while the second has fewer columns and only one index.
>>>> Both tables have the same number of rows. Nevertheless, the second
>>>> table is much larger on disk than the first. How can this be explained?
>>>
>>> This is most likely due to table bloat. In PostgreSQL, when you update
>>> or delete a row, a dead version gets left behind. Vacuum eventually
>>> comes along and reclaims the empty space to be reused. If you delete
>>> or update a LOT of rows at once, you'll have a lot of dead rows which
>>> can only be reused, after vacuuming, when you do more updates or
>>> deletes later on.
>>>
>>> A few salient questions: What version of PostgreSQL are you running?
>>> Is autovacuum running? Do you do a LOT of bulk deletes / updates? If
>>> you do a lot of bulk deletes on this table, and you delete everything,
>>> can you switch to using the TRUNCATE command instead?
>>
>> I use version 8.3. I see the 'autovacuum launcher process' and
>> 'autovacuum worker process' in the process list, so I suppose
>> autovacuum is running.
>>
>> Rows in the measurement table are added once and never deleted or
>> updated. Adding a row to this table triggers a function that adds a row
>> to the stats_count table. Normally rows are added chronologically, so
>> rows in the stats_count table are normally not updated either. If,
>> however, a measurement is added with an older timestamp for some
>> reason, all rows of that device that come after that time are updated,
>> but I don't think this will happen very often.
>>
>> The table 'stats_count' was created at a later stage, so the first 45M
>> rows were added at once and chronologically. However, because the
>> function that initializes this table took a long time and the client
>> application crashed a few times, I had to restart this function several
>> times. Can it be that there is some garbage left over from running this
>> function several times without finishing it? Would it be a solution to
>> run 'VACUUM FULL' to reclaim some disk space?
>
> Yeah, could be. Take a look at this page:
> http://wiki.postgresql.org/wiki/Show_database_bloat and see if the
> query there sheds some light on your situation.

Thanks for this answer. If I run the query, I get 12433752064 wasted bytes
on stats_count_pkey, so I suppose that is the reason.
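As a quick cross-check of the bloat-query result, the sizes of the heap and of everything attached to it can be compared directly with PostgreSQL's built-in size functions (available in 8.3). A minimal sketch; the table and index names are the ones mentioned in this thread, so adjust them to the actual schema:

```sql
-- Compare the heap size of each relation with its total size
-- (heap + indexes + TOAST). A large gap between the two for
-- stats_count would point at index bloat rather than heap bloat.
SELECT relname,
       pg_size_pretty(pg_relation_size(oid))       AS heap_size,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('measurement', 'stats_count', 'stats_count_pkey');
```

`pg_relation_size` counts only the main fork of the named relation, while `pg_total_relation_size` includes all indexes and TOAST data, which is usually what "disk size of the table" means in practice.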
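Since the wasted space reported here is in the index stats_count_pkey rather than in the heap itself, rebuilding the index is usually the more effective fix: on 8.3, VACUUM FULL compacts the heap but tends to leave indexes as bloated as before, or worse. A sketch, noting that REINDEX takes an exclusive lock on the table for the duration of the rebuild:

```sql
-- Rebuild the bloated primary-key index from scratch; this writes a
-- fresh, compact copy and frees the wasted pages.
REINDEX INDEX stats_count_pkey;

-- Afterwards, routine (non-FULL) vacuuming keeps future bloat in
-- check and refreshes planner statistics.
VACUUM ANALYZE stats_count;
```

If the exclusive lock is not acceptable, an alternative with the same effect is creating a new index concurrently and dropping the old one, though for a primary key that involves extra constraint juggling.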