Re: dead tuples and VACUUM - Mailing list pgsql-general

From Bruno Wolff III
Subject Re: dead tuples and VACUUM
Date
Msg-id 20030602154141.GB16405@wolff.to
Whole thread Raw
In response to Re: dead tuples and VACUUM  (Dmitry Tkach <dmitry@openratings.com>)
List pgsql-general
On Mon, Jun 02, 2003 at 10:57:18 -0400,
  Dmitry Tkach <dmitry@openratings.com> wrote:
> Thanks for the reply!
>
> I'd still like to clariofy some points...
>
> >If you replace every tuple in the table, you probably don't have a
> >large enough fress space map to track all of that.  So VACUUM FULL is
> >a good idea.
> >
> >
> Where do I set that free space map? Is it a configuration parameter?
> And what exactly does it mean? If it is not high enough and I never run
> VACUUM FULL, are my dead tuples lost forever?

It is a configuration paramter. I believe the name is FSM.

>
> >But another answer is to VACUUM every (say) couple thousand UPDATEs.
> >That'll keep the table size managable.
> >
> I am afraid, that is not feasible too - you see, that monthly update job
> I mentioned runs continuosly and takes about a couple of weeks to
> complete as it is.
> If I pause it every couple thousands of rows to do a vacuum, it will, I
> am afraid, take months (and I have to stay under at least 1 month,
> because the whole idea is to update every month).

If you are doing normal vacuum (not vacuum full), it doesn't lock tables
so you can run it along with other updates without taking a big performance
hit. All it does is mark deleted tuples that are not visible to any
current transactions as safe to reuse. By running vacuum frequently you
can maintain a steady state size for your database. If it grows unusually
large and you need to recover some space for other uses then you would
want to use vacuum full.

>
> >You could even run a parallel
> >VACUUM.  If the whole table is UPDATEd in one transaction, though,
> >that won't help.
> >
> It is not one transaction. I am running about 10 parallel jobs, and each
> of them commits every now and then (every 10K rows, I believe).
> If I run VACUUM in parallel, what exactly is going to happen?

It will work just fine.

>
> Will I still be loosing some of those dead tuples after each update?

Vacuum will mark deleted tuples as reusuable if there are no open
transactions that can still see them.

>
> Basically, my question is - if I let it run this way forever, will the
> database just keep growing on me until I am out of the disk space?
> Is there any way to prevent that, without having to take it offline for
> days?

It should reach a steady state size that depends on how often you vacuum
and how often you update. If something goes wrong (say a transaction
is started intereactively and then left open for a day or two) then
the database may grow unusually large and you may need to do a vacuum
full.

>
> Thanks!
>
> Dima
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

pgsql-general by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: check constraint
Next
From: "Ian Harding"
Date:
Subject: Re: installing contrib/tsearch with a FreeBSD Port