Re: Derived columns / denormalization - Mailing list pgsql-sql

From Erik Jones
Subject Re: Derived columns / denormalization
Date
Msg-id 10375BF5-D718-4FAA-8CFD-B3114EF4F48C@engineyard.com
Whole thread Raw
In response to Re: Derived columns / denormalization  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Derived columns / denormalization
List pgsql-sql
On Jan 15, 2009, at 8:06 PM, Tom Lane wrote:

> "Jamie Tufnell" <diesql@googlemail.com> writes:
>> item_count int -- this is derived from (select count(*) from items
>> where group_id = id)
>> ...
>
>> item_count would be updated by insert/update/delete triggers on the
>> items table, hopefully that would ensure it is always correct?
>
>> I'm wondering is what I'm trying to do here pretty standard and are
>> there any gotchas I should be aware of?
>
> Concurrent updates to the items table make this much harder than
> it might first appear.  If you're willing to serialize all your  
> updating
> transactions then you can make it work, but ...

That was exactly the caveat I was about to point out.  That being  
said, keeping COUNT() values and other computed statistics based on  
other data in the database *is* a fairly common "tactic".  On method  
that I've used to great success to avoid the serialization problem is  
to have your triggers actually insert the necessary information for  
the update into a separate "update queue" table.  You then have  
separate process that routinely sweeps that update queue, aggregates  
the updates and then updates your count values in the groups table  
with the total update values for each groups entry with updates.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k







pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Derived columns / denormalization
Next
From: "Jamie Tufnell"
Date:
Subject: Re: Derived columns / denormalization