Home > mailing lists

Re: sigh - Mailing list pgsql-hackers

From	Christopher Browne
Subject	Re: sigh
Date	December 3, 2003 02:28:35
Msg-id	m31xrmwim3.fsf@wolfe.cbbrowne.com Whole thread Raw
In response to	sigh (Thomas Zehetbauer <thomasz@hostmaster.org>)
Responses	Re: sigh
List	pgsql-hackers

Tree view

Martha Stewart called it a Good Thing when Randolf Richardson <rr@8x.ca> wrote:
>>> The count(*) information can be revisioned too, am I wrong ? I'm able
>>> to create a trigger that store the count(*) information in a special
>>> table, why not implement the same in a way "builded in" ?
>> 
>> Then every insert or delete would have to lock that count. Nobody else
>> would be able to insert or delete any records until you either commit or
>> roll back. 
>> 
>> That would lead to much lower concurrency, much more contention for
>> locks, and tons of deadlocks.
>
>         What about queueing all these updates for a separate
> low-priority thread?  The thread would be the only one with access
> to update this field.

If updates are "queued," then how do you get to use them if the
"update thread" isn't running because it's not high enough in
priority?

I am not being facetious.

The one way that is expected to be successful would be to have a
trigger that, upon seeing an insert of 5 rows to table "ABC", puts,
into table "count_detail", something like:
 insert into count_detail (table, value) values ('ABC', 5);

You then replace select count(*) from abc;

with select sum(value) from count_detail where table = 'ABC';

The "low priority" thread would be a process that does something akin
to vacuuming, where it would replace the contents of the table every
so often...
for curr_table in (select table from count_detail) do  new_total = select sum(value) from count_detail
wheretable = curr_table;  delete from count_detail where table = curr_table;  insert into count_detail (table, value)
values(curr_table,                                                  new_total);done

The point of this being to try to keep the number of rows to 1 per
table.

Note that this gets _real_ expensive for tables that see lots of
single row inserts and deletes.  There isn't a cheaper way that will
actually account for the true numbers of records that have been
committed.  

For a small table, it will be cheaper to walk through and calculate
count(*) directly from the tuples themselves.

The situation where it may be worthwhile to do this is a table which
is rather large (thus count(*) is expensive) where there is some
special reason to truly care how many rows there are in the table.
For _most_ tables, it seems unlikely that this will be true.  For
_most_ tables, it is absolutely not worth the cost of tracking the
information.
-- 
(format nil "~S@~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/spreadsheets.html
Predestination was doomed from the start.

pgsql-hackers by date:

From: Claudio Natoli
Date: 03 December 2003, 01:31:18
Subject: Re: fork/exec problem: DynaHashCxt

From: Mark Kirkwood
Date: 03 December 2003, 04:35:32
Subject: Re: *sigh*

Re: sigh - Mailing list pgsql-hackers

Previous

Next

Re: *sigh* - Mailing list pgsql-hackers

Previous

Next

Re: sigh - Mailing list pgsql-hackers