Re: Frequent Update Project: Design Overview - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: Frequent Update Project: Design Overview |
Date | |
Msg-id | 455D8025.EE98.0025.0@wicourts.gov Whole thread Raw |
In response to | Re: Frequent Update Project: Design Overview ofHOTUpdates (Hannu Krosing <hannu@skype.net>) |
Responses |
Re: Frequent Update Project: Design Overview ofHOTUpdates
|
List | pgsql-hackers |
>>> On Fri, Nov 17, 2006 at 5:30 AM, in message <1163763016.2941.92.camel@localhost.localdomain>, Hannu Krosing <hannu@skype.net> wrote: > Ühel kenal päeval, E, 2006-11-13 kell 13:42, kirjutas Csaba Nagy: >> [snip] >> > IMHO *most* UPDATEs occur on non-indexed fields. [snip] >> > >> > If my assumption is badly wrong on that then perhaps HOT would not be >> > useful after all. If we find that the majority of UPDATEs meet the HOT >> > pre-conditions, then I would continue to advocate it. >> >> Just to confirm that the scenario is valid: our application has almost >> all it's updates affecting only non-indexed columns. There are a few >> exceptions, but the vast majority is non-indexed, and that holds to the >> execution frequency too, not just for the count of tables/queries. > > One interesting case which should also be considered is conditional > indexes: > > create index on payments(payment_id) where status = 'waiting'; > > here the payment_id is not changed when processing the payment, but when > status is changed to 'processed' it still should be removed from the > index. > > How would this interact with HOT ? I would say that at least 80% of our updates (probably higher) do not modify indexed columns. We have a few very small tables(under 100 rows) which have high update rates (often exceeding 100 updates per second) which are not against indexedcolumns. These quickly degraded our performance until we set pretty aggressive autovacuum parameters (20% + 1 rowevery 10 seconds) and added a daily cluster to our maintenance crontab runs. At the other extreme, we have a table which tracks the last modification timestamp of each court case, indexed by timestamp,to support our SOAP subscribers who want to stay up-to-date on all active court cases. Updates in this table areboth high volume and always involve an indexed column. Like Hannu, we do use conditional indexes with high updates on columns in the WHERE clause, although these columns are notpart of the index sequence. For example, we have a receivables table which contains a balance due. For audit trail purposesthese rows remain for many years after the balance hits zero, but they're not something you want to look at whensomeone is standing at the counter with their checkbook. We index by name where the balance is non-zero. The balanceis updated frequently, with most eventually hitting zero. (The reason for the frequent updates is that the receivableis maintained by triggers from the supporting assessment detail, so a receivable will be initially added with azero balance and may immediately be updated dozens of times as the assessment detail is added.) Infrequently, the balancemay hit zero and subsequently become non-zero again. I hope this is helpful. -Kevin
pgsql-hackers by date: