Re: Allow "snapshot too old" error, to prevent bloat - Mailing list pgsql-hackers
From | Jim Nasby |
---|---|
Subject | Re: Allow "snapshot too old" error, to prevent bloat |
Date | |
Msg-id | 54E0EC79.3060009@BlueTreble.com |
In response to | Re: Allow "snapshot too old" error, to prevent bloat (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Allow "snapshot too old" error, to prevent bloat |
List | pgsql-hackers |
On 2/15/15 10:36 AM, Tom Lane wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Kevin Grittner <kgrittn@ymail.com> writes:
>>>> What this patch does is add a GUC called old_snapshot_threshold. It
>>>> defaults to -1, which leaves behavior matching unpatched code.
>>>> Above that it allows tuples to be vacuumed away after the number of
>>>> transaction IDs specified by the GUC have been consumed.
>
>>> TBH, I'm not sure why we'd wish to emulate Oracle's single worst
>>> operational feature.
>
>> I've run into cases where people have suffered horribly bloated
>> databases because of one ill-behaved connection. Some companies
>> don't want to be vulnerable to that and the disruption that
>> recovery from that bloat causes.
>
> No doubt, preventing bloat is a good thing, but that doesn't mean this
> is the best API we could create for the issue. The proposition this
> patch offers to DBAs is: "You can turn this knob to reduce bloat by some
> hard-to-quantify factor. The cost is that some long-running transactions
> might fail. You won't know which ones are at risk, the failures won't be
> the same from time to time, and you won't be able to do anything to spare
> high-value transactions from that fate except by turning that knob back
> again globally (which requires a database restart)." Maybe refugees from
> Oracle will think that sounds good, but nobody else will.
>
> I wonder if we couldn't achieve largely the same positive effects through
> adding a simple transaction-level timeout option. That would be far
> easier for users to understand and manage, it would be trivial to allow
> specific high-value transactions to run with a laxer setting, it does not
> create any implementation-detail-dependent behavior that we'd be having to
> explain to users forevermore, and (not incidentally) it would be a lot
> simpler and more trustworthy to implement. There's no well-defined
> correlation between your setting and the net effect on database bloat,
> but that's true with the "snapshot too old" approach as well.

A common use case is long-running reports hitting relatively stable data in a
database that also has tables with a high churn rate (i.e., queue tables). In
those scenarios your only options right now are to suffer huge amounts of
bloat in the high-churn tables or to not do your reporting at all. A simple
transaction timeout only "solves" this by denying you reporting queries.

An idea that I've had on this would be some way to "lock down" the tables
that a long-running transaction could access. That would allow vacuum to
ignore any snapshots that transaction had for tables it wasn't accessing.
That's something that would be deterministic.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
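
For concreteness, the per-transaction "laxer setting" Tom mentions can be loosely approximated today with SET LOCAL and the existing statement_timeout GUC. This is only a sketch of that idea, not the transaction-level timeout he proposes (which does not exist at the time of this thread), and statement_timeout bounds each individual statement rather than the transaction as a whole. The timeout values and table name below are illustrative.

```sql
-- Sketch only: give a high-value reporting transaction a laxer limit than
-- the session default, using the existing statement_timeout GUC.
-- SET LOCAL scopes the change to the current transaction and reverts
-- automatically at COMMIT/ROLLBACK.

SET statement_timeout = '1min';            -- session default for ordinary work

BEGIN;
SET LOCAL statement_timeout = '30min';     -- laxer limit for this transaction only
SELECT count(*) FROM big_report_table;     -- hypothetical long-running report query
COMMIT;
```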