Thread: Eternal vacuuming....
I checked the changelog for 7.0 and it doesn't look like this is fixed yet.

In 6.4.x and 6.5.x, if you delete a large number of rows (say 100,000 - 1,000,000) and then hit vacuum, the vacuum will run literally forever. If you drop the indexes on the table, vacuuming takes only minutes, but that's a pain in the neck.

This problem kept my site down for some 12 HOURS last nite:

  24244 ?  S      0:00 psql db_gotocity
  24245 ?  R    951:34 /usr/local/pgsql/bin/postgres localhost tim db_gotocity

...before I finally killed the vacuum process, manually removed the pg_vlock, dropped the indexes, then vacuumed again, and re-indexed.

Will this be fixed?

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723
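For reference, the manual workaround described above amounts to something like the following (a rough sketch only; the table and index names here are made up, and the real schema will differ):

    -- Hypothetical table "city_hits" with one index "city_hits_idx".
    DROP INDEX city_hits_idx;      -- vacuum is fast once there are no indexes to clean
    VACUUM city_hits;              -- reclaim the deleted tuples; minutes instead of hours
    CREATE INDEX city_hits_idx
        ON city_hits (city_id);    -- rebuild the index from scratch afterwards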
> In 6.4.x and 6.5.x if you delete a large number of rows (say 100,000 -
> 1,000,000) then hit vacuum, the vacuum will run literally forever.
> ...before I finally killed the vacuum process, manually removed the
> pg_vlock, dropped the indexes, then vacuumed again, and re-indexed.
> Will this be fixed?

Patches? ;)

Just thinking here: could we add an option to vacuum so that it would drop and recreate indices "automatically"? We already have the ability to chain multiple internal commands together, so that would just require snarfing the names and properties of indices in the parser backend and then doing the drops and creates on the fly.

A real problem with this is that those commands are currently not rollback-able, so if something quits in the middle (or someone kills the vacuum process; I've heard of this happening ;) then you are left without indices in sort of a hidden way.

Not sure what the prospects are of making these DDL statements transactionally secure, though I know we've had some discussions of this on -hackers.

- Thomas

--
Thomas Lockhart   lockhart@alumni.caltech.edu
South Pasadena, California
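For what it's worth, the "snarfing" step can be approximated from SQL on a server new enough to expose index definitions (the pg_indexes view shown here does not exist in 6.5, where vacuum would have to read pg_index and pg_class directly; the table name is made up):

    -- Record each index definition before dropping it, so it can be
    -- replayed after the vacuum finishes (or by hand after a crash).
    SELECT indexname, indexdef
      FROM pg_indexes
     WHERE tablename = 'city_hits';
    -- indexdef comes back as a complete statement, e.g.
    --   CREATE INDEX city_hits_idx ON city_hits (city_id)
    -- which can be executed verbatim to rebuild the index.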
Thomas Lockhart wrote:
> > In 6.4.x and 6.5.x if you delete a large number of rows (say 100,000 -
> > 1,000,000) then hit vacuum, the vacuum will run literally forever.
> > ...before I finally killed the vacuum process, manually removed the
> > pg_vlock, dropped the indexes, then vacuumed again, and re-indexed.
> > Will this be fixed?
>
> Patches? ;)

Hehehe - I say the same thing when someone complains about SourceForge. Now you know I'm a huge postgres hugger - but PHP is my strength and you would not like any C patches I'd submit anyway.

> Just thinking here: could we add an option to vacuum so that it would
> drop and recreate indices "automatically"? We already have the ability
> to chain multiple internal commands together, so that would just
> require snarfing the names and properties of indices in the parser
> backend and then doing the drops and creates on the fly.

This seems like a hack to me personally. Can someone figure out why the vacuum runs forever and fix it? Probably a logic flaw somewhere?

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723
* Thomas Lockhart <lockhart@alumni.caltech.edu> [000511 09:55] wrote:
> > In 6.4.x and 6.5.x if you delete a large number of rows (say 100,000 -
> > 1,000,000) then hit vacuum, the vacuum will run literally forever.
> > ...before I finally killed the vacuum process, manually removed the
> > pg_vlock, dropped the indexes, then vacuumed again, and re-indexed.
> > Will this be fixed?
>
> Patches? ;)
>
> Just thinking here: could we add an option to vacuum so that it would
> drop and recreate indices "automatically"?

I'm hoping "automatically" means some algorithm: rebuild only when heap + N < index, i.e. when it's really needed.

> We already have the ability
> to chain multiple internal commands together, so that would just
> require snarfing the names and properties of indices in the parser
> backend and then doing the drops and creates on the fly.
>
> A real problem with this is that those commands are currently not
> rollback-able, so if something quits in the middle (or someone kills
> the vacuum process; I've heard of this happening ;) then you are left
> without indices in sort of a hidden way.
>
> Not sure what the prospects are of making these DDL statements
> transactionally secure though I know we've had some discussions of
> this on -hackers.

One could do it in the opposite direction: rename the old index, create a new index, then drop the old one. If the worst happens, you are left with two indexes, and perhaps the database could warn about this somehow.

In fact, one could have a system table of things to be deleted at startup. Put the name of the old index into it, and at startup the database could nuke the old index. It's pretty hackish, but it would work reasonably well. It does seem possible to have two indexes on a single column.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
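A sketch of the rename-and-swap idea at the SQL level, assuming a server that supports ALTER INDEX ... RENAME and reusing the made-up names from the earlier sketch:

    -- Build the replacement first, so there is never a window with no index.
    CREATE INDEX city_hits_idx_new ON city_hits (city_id);

    -- Swap the names, then drop the bloated original.  If anything dies in
    -- between, the worst case is two indexes on the column, not zero.
    ALTER INDEX city_hits_idx     RENAME TO city_hits_idx_old;
    ALTER INDEX city_hits_idx_new RENAME TO city_hits_idx;
    DROP INDEX city_hits_idx_old;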
On Thu, 11 May 2000, Tim Perdue wrote:

> This seems like a hack to me personally. Can someone figure out why the
> vacuum runs forever and fix it? Probably a logic flaw somewhere?

I run it on a >9-million-tuple database, and growing, in <10 minutes or so ... it's the search engine for the archives, and I'm finding that if I do a 'vacuum verbose', I'm getting a lot of deletes (updated records) ... not quite the same number that you are reporting, mind you, but ...

What does a 'vacuum verbose' show for you? And you aren't doing a 'vacuum analyze', are you?
> > In 6.4.x and 6.5.x if you delete a large number of rows (say 100,000 -
> > 1,000,000) then hit vacuum, the vacuum will run literally forever.
> > ...before I finally killed the vacuum process, manually removed the
> > pg_vlock, dropped the indexes, then vacuumed again, and re-indexed.
> > Will this be fixed?
>
> Patches? ;)
>
> Just thinking here: could we add an option to vacuum so that it would
> drop and recreate indices "automatically"? We already have the ability
> to chain multiple internal commands together, so that would just
> require snarfing the names and properties of indices in the parser
> backend and then doing the drops and creates on the fly.

We could vacuum the heap table, and conditionally update or recreate the index depending on how many tuples we needed to move during vacuum of the heap.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
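The decision Bruce describes would live inside vacuum itself, in the backend C code, but the heuristic can be approximated from SQL by comparing index pages to heap pages in pg_class (again with the made-up names, and assuming the server has REINDEX):

    -- Rough bloat check: if the index occupies more pages than the heap it
    -- points into, rebuilding it is probably cheaper than vacuuming it.
    SELECT i.relname, i.relpages AS index_pages, h.relpages AS heap_pages
      FROM pg_class i, pg_class h, pg_index x
     WHERE x.indexrelid = i.oid
       AND x.indrelid   = h.oid
       AND h.relname    = 'city_hits';

    -- If index_pages dwarfs heap_pages, rebuild instead of vacuuming:
    REINDEX INDEX city_hits_idx;

Note that relpages is only refreshed by vacuum or analyze, so the page counts are approximate.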
"Marc G. Fournier" wrote: > what does a 'vacuum verbose' show for you? and youa ren't doing a 'vacuum > analyze', are you? I believe I did 'vacuum analyze'. If info from 'vacuum verbose' would be useful to your team, I can try to set up and reproduce this. I would have to create a 3-million row table with an index on it, then delete 832,000 rows which I did last nite, then try again. Tim -- Founder - PHPBuilder.com / Geocrawler.com Lead Developer - SourceForge VA Linux Systems 408-542-5723
On Thu, 11 May 2000, Tim Perdue wrote:

> "Marc G. Fournier" wrote:
> > what does a 'vacuum verbose' show for you? and you aren't doing a 'vacuum
> > analyze', are you?
>
> I believe I did 'vacuum analyze'. If info from 'vacuum verbose' would be
> useful to your team, I can try to set up and reproduce this. I would
> have to create a 3-million row table with an index on it, then delete
> 832,000 rows which I did last nite, then try again.

Okay, vacuum analyze is, from my experience, atrociously slow ... it *feels* faster, at least, if you do a simple vacuum first and then do the analyze, but that might be just perception ...

Can you try just a simple 'vacuum verbose' first, without the analyze, and see if that also takes 12hrs? Also, what are you running this on? Memory? CPU?

Marc G. Fournier                     scrappy@hub.org
Systems Administrator @ hub.org      scrappy@{postgresql|isc}.org
ICQ#7615664
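Spelled out, the two-step sequence Marc suggests might look like this (table name reused from the sketch above; exact option syntax varies across versions):

    VACUUM VERBOSE vac_test;    -- reclaim space first; the verbose output shows
                                -- how many tuples were removed from each index
    VACUUM ANALYZE vac_test;    -- then refresh the planner statistics separately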