Thread: vacuum - reclaiming disk space.
I have a large table with numerous indexes which has approximately doubled in size after adding a column - every row was rewritten and 50% of the tuples are dead. I'd like to reclaim this space, but VACUUM FULL cannot seem to finish within the scheduled downtime.
Any suggestions for reclaiming the space without excessive downtime?
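(For reference, the dead-tuple ratio and on-disk size can be checked with something like the following; the table name is a placeholder:

-- show live vs. dead tuples and total size (table + indexes + TOAST)
SELECT n_live_tup,
       n_dead_tup,
       pg_size_pretty(pg_total_relation_size('public.big_table')) AS total_size
FROM pg_stat_user_tables
WHERE schemaname = 'public' AND relname = 'big_table';
)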
On Thu, Mar 17, 2016 at 7:27 AM, Mike Blackwell <mike.blackwell@rrd.com> wrote:
I have a large table with numerous indexes which has approximately doubled in size after adding a column - every row was rewritten and 50% of the tuples are dead. I'd like to reclaim this space, but VACUUM FULL cannot seem to finish within the scheduled downtime. Any suggestions for reclaiming the space without excessive downtime?
pg_repack is a good tool for removing bloat. https://github.com/reorg/pg_repack
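A minimal invocation would look something like this (database and table names are placeholders; the pg_repack extension has to be installed in the database first, and the table needs a primary key or a not-null unique index):

# rebuild the table and its indexes online, holding only brief locks
pg_repack -d mydb --table public.big_table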
"I have a large table with numerous indexes :
My first thought is, "DEFINE NUMEROUS". How many indexes do you actually have? How many of those indexes are actually used? In addition to VACUUMing the table, it also needs to go through every index you have.
So find out if you have any unneeded indexes with:
-- List indexes that have never been scanned, excluding primary keys and unique indexes.
SELECT n.nspname AS schema,
       i.relname AS table,
       i.indexrelname AS index,
       i.idx_scan,
       i.idx_tup_read,
       i.idx_tup_fetch,
       pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname))) AS table_size,
       pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.indexrelname))) AS index_size,
       pg_get_indexdef(idx.indexrelid) AS idx_definition
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid)
WHERE i.idx_scan = 0
  AND NOT idx.indisprimary
  AND NOT idx.indisunique
ORDER BY 1, 2, 3;
Then drop any index that shows up!
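For example (the index name here is a placeholder; the CONCURRENTLY form avoids blocking reads and writes on the table, though it cannot run inside a transaction block):

-- drop an unused index without blocking normal traffic on its table
DROP INDEX CONCURRENTLY public.big_table_unused_idx;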
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Just to throw an extreme idea out there: you could stand up a second Postgres instance on another server, pg_dump your current database, restore the dump into that new instance, and switch over to it when your system goes live again after the downtime. Restoring from a dump means the database won't carry the bloat, since I assume your issue is that all that extra space was allocated to Postgres by the mass rewrite of the table.
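A rough sketch of that, with placeholder database and host names (the target database must be created on the new server first, and parallel restore with -j requires the custom or directory dump format):

# dump in custom format, then restore in parallel on the new server
pg_dump -Fc -f mydb.dump mydb
pg_restore -h newhost -d mydb -j 4 mydb.dump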
--
Robert McAlpine
Autovacuum will eventually free the extra pages tied up in index bloat, but it takes multiple runs.
You could also use REINDEX instead of VACUUM FULL if it's only the indexes you're interested in (sketch below).
For the table itself there may be other options, but they depend on your pattern of writes.
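On the REINDEX route: plain REINDEX locks out writes on the table while it runs, so a common workaround is to build a fresh index concurrently and swap it for the bloated one. A sketch with placeholder names:

-- rebuild without blocking writes: create a duplicate, drop the old, rename
CREATE INDEX CONCURRENTLY big_table_col_idx_new ON big_table (col);
DROP INDEX CONCURRENTLY big_table_col_idx;
ALTER INDEX big_table_col_idx_new RENAME TO big_table_col_idx;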
--
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.