On Dec 24, 2007, at 11:15 PM, Greg Smith wrote:
>>
>
> This may be better because it isn't doing the query first.  You may
> discover that you need to aggressively run one of the VACUUM
> processes (I'd guess regular and ANALYZE but not FULL) in order to
> keep performance steady as the number of records grows.  Anytime you
> update a row, that becomes a dead row that's still taking up space,
> and if you do a lot of those they get in the way of finding the rows
> that are still live.  Take a look at http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html
>  to get an idea of the process.
Whoa.  I am going to have to dig into the implementation.  What is
wrong with update in place, concurrency issues?  The dead row
presumably is no longer indexed, right?   Since it is known to be dead
is it automatically removed when there are no live transaction that
reference or may reference it and its data page space marked available
for new rows?  If not, why not?   I'm dredging my mind for stuff from
my RDBMS implementation grad course a very long time ago.    I would
imagine that  vacuuming often in a huge insert update would be a
pretty poor performer depending on implementation.   How is this
implemented?  Why would it be heavy IO if a list of pointers
effectively is being kept to the dead rows to simply be added to the
free list?  What else is vacuum doing?   Lookup implemented removal
from indices, something else?
- samantha