Thread: AW: Plans for solving the VACUUM problem
> REDO in oracle is done by something known as a 'rollback segment'.

You are not seriously saying that you like the "rollback segments" in Oracle.
They only cause trouble:
1. configuration (for every different workload you need a different config)
2. snapshot too old
3. tx abort because rollback segments are full
4. They use up huge amounts of space (e.g. 20 Gb rollback seg for a 120 Gb SAP)

If I read the papers correctly Version 9 gets rid of Point 1 but the rest ...

Andreas
Actually I don't like the problems with rollback segments in Oracle at
all. I am just concerned that using WAL for UNDO will have all of the
same problems if it isn't designed carefully. At least Oracle has
multiple rollback segments; in WAL there is just one log, so you will
potentially have that 20 Gig all in your single log directory. People
are already reporting the log directory growing to a gig or more when
running vacuum in 7.1.

Of the points you raised about Oracle's rollback segment problems:

1. configuration (for every different workload you need a different config)

Postgres should be able to do a better job here.

2. snapshot too old

Shouldn't be a problem as long as Postgres continues to use a
non-overwriting storage manager. However, under an overwriting storage
manager you need to keep all of the changes in the UNDO records to
satisfy the query snapshot, so if you want to limit the size of UNDO
you may need to kill long running queries.

3. tx abort because rollback segments are full

If you want to limit the size of the UNDO, then this is a corresponding
byproduct. I believe a mail note was sent out yesterday suggesting that
limits like this be added to the todo list.

4. They use up huge amounts of space (e.g. 20 Gb rollback seg for a 120 Gb SAP)

You need to store the UNDO information somewhere. And on active
databases that can amount to a lot of information, especially for bulk
loads or massive updates.

thanks,
--Barry

Zeugswetter Andreas SB wrote:

> >> REDO in oracle is done by something known as a 'rollback segment'.
>
> You are not seriously saying that you like the "rollback segments" in Oracle.
> They only cause trouble:
> 1. configuration (for every different workload you need a different config)
> 2. snapshot too old
> 3. tx abort because rollback segments are full
> 4. They use up huge amounts of space (e.g. 20 Gb rollback seg for a 120 Gb SAP)
>
> If I read the papers correctly Version 9 gets rid of Point 1 but the rest ...
>
> Andreas
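
[Editorial sketch, not part of the original mail.] Barry's point 3 says
that a transaction abort is the natural byproduct of capping UNDO size.
The toy model below illustrates only that trade-off. The names
`UndoArea`, `record`, `release`, and `max_bytes` are hypothetical and do
not correspond to anything in PostgreSQL or Oracle, and the model
assumes UNDO can be freed as soon as its transaction ends, which would
only hold for a non-overwriting storage manager.

```python
class UndoLimitExceeded(Exception):
    """Raised when a transaction would push UNDO usage past the cap."""


class UndoArea:
    """Hypothetical model of a single shared UNDO area with a byte cap."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.held = {}  # xid -> bytes of UNDO currently held by that transaction

    def record(self, xid, nbytes):
        """Account for nbytes of UNDO written by xid; abort xid if over the cap."""
        if self.used + nbytes > self.max_bytes:
            self.release(xid)  # the aborting transaction gives its UNDO back
            raise UndoLimitExceeded(f"xid {xid} aborted: UNDO area full")
        self.used += nbytes
        self.held[xid] = self.held.get(xid, 0) + nbytes

    def release(self, xid):
        """Called at commit or abort: free whatever UNDO xid was holding."""
        self.used -= self.held.pop(xid, 0)
```

With a cap of 100 bytes, a transaction that already holds 60 bytes is
aborted when it asks for 20 more while another transaction holds 30.
The smaller the cap, the sooner bulk loads and massive updates hit it.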
> > 1. Compact log files after checkpoint (save records of uncommitted
> > transactions and remove/archive others).
>
> On the grounds that undo is not guaranteed anyway (concurrent
> heap access), why not simply forget it,

We can set a flag in ItemData and register a callback function in the
buffer header regardless of concurrent heap/index access. So the buffer
will be cleaned before it is thrown out of the buffer pool (little
optimization: if the buffer is not dirty at the time its pin count drops
to 0, then we shouldn't really clean it up, to avoid an unnecessary
write - we can save info in FSM that space is available and clean it up
on first pin/read). So only the ability of *immediate reuse* is not
guaranteed. But this is a general problem of space reuse as long as we
assume that a buffer pin is enough to access data.

> since above sounds rather expensive ?

I'm not sure. For a non-overwriting smgr the required UNDO info is
pretty small because we're not required to keep tuple data, unlike
Oracle & Co. We can even store UNDO info in non-WAL format to avoid log
record header overhead. UNDO files would be kind of like Oracle rollback
segments but muuuuch smaller. But yes - additional log reads.

> The downside would only be, that long running txn's cannot
> [easily] rollback to savepoint.

We should implement savepoints for all or none transactions, no?

> > 2. Abort long running transactions.
>
> This is imho "the" big downside of UNDO, and should not
> simply be put on the TODO without thorough research. I think it
> would be better to forget UNDO for long running transactions
> before aborting them.

Abort could be configurable.

Vadim
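
[Editorial sketch, not part of the original mail.] The deferred cleanup
Vadim describes can be modelled roughly as follows: when the pin count
drops to 0, a dirty buffer is cleaned at once (it will be written
anyway), while a clean buffer is only noted in the FSM and cleaned on
its next pin. `Buffer`, `BufferPool`, `needs_cleanup`, and the `fsm` set
are hypothetical names for illustration and are not the PostgreSQL
buffer manager API.

```python
class Buffer:
    """Hypothetical stand-in for a shared buffer holding one page."""

    def __init__(self, page_id):
        self.page_id = page_id
        self.pin_count = 0
        self.dirty = False
        self.needs_cleanup = False  # set when an aborted tuple sits on the page


class BufferPool:
    def __init__(self):
        self.fsm = set()  # page ids known to have (or soon have) reusable space

    def _cleanup(self, buf):
        # Removing dead tuples modifies the page, so it becomes dirty.
        buf.needs_cleanup = False
        buf.dirty = True
        self.fsm.add(buf.page_id)

    def pin(self, buf):
        # First pin after a deferred cleanup: do the work now.
        if buf.pin_count == 0 and buf.needs_cleanup:
            self._cleanup(buf)
        buf.pin_count += 1

    def unpin(self, buf):
        buf.pin_count -= 1
        if buf.pin_count == 0 and buf.needs_cleanup:
            if buf.dirty:
                self._cleanup(buf)         # page gets written anyway
            else:
                self.fsm.add(buf.page_id)  # defer: don't dirty a clean page
```

In this model the space on a clean page is advertised before it is
actually reclaimed, which is the sense in which only *immediate reuse*
is not guaranteed.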