Proposal for DROP TABLE rollback mechanism - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Proposal for DROP TABLE rollback mechanism |
Date | |
Msg-id | 3069.972681698@sss.pgh.pa.us |
List | pgsql-hackers |
(This is mostly directed at Vadim, but kibitzing is welcome.)

Here's what I plan to do to make DROP TABLE rollbackable and clean up the handling of CREATE TABLE rollback. Comments?

Overview:

1. smgrcreate() will create the file for the relation same as it does now, and will add the rel's RelFileNode information to an smgr-private list of rels created in the current transaction.

2. smgrunlink() normally will NOT immediately delete the file; instead it will perform smgrclose() and then add the rel's RelFileNode information to an smgr-private list of rels to be deleted at commit. However, if the file appears in the list created by smgrcreate() --- ie, the rel is being created and deleted in the same xact --- then we can delete it immediately. In this case we remove the file from the smgrcreate list and do not put it on the unlink list.

3. smgrcommit() will delete all the files mentioned in the list created by smgrunlink, then discard both lists.

4. smgrabort() will delete all the files mentioned in the list created by smgrcreate, then discard both lists.

Points 1 and 4 will replace the existing relcache-based mechanism for deleting files created in the current xact when the xact aborts.

Various details:

To support deleting files at xact commit/abort, we will need something like an "mdblindunlink" entrypoint to md.c. I am inclined to simply redefine mdunlink to take a RelFileNode instead of a complete Relation, rather than supporting two entrypoints --- I don't think there'll be any future use for the existing mdunlink. Objections?

bufmgr.c's ReleaseRelationBuffers drops any dirty buffers for the target rel, and therefore it must NOT be called inside the transaction (else, rollback would mean we'd lost data). I am inclined to let it continue to behave that way, but to call it from smgrcommit/smgrabort, not from anywhere else. This would mean changing its API to take a RelFileNode, but that seems no big problem.
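As a rough illustration of the four overview steps, here is a minimal in-memory sketch of the two smgr-private lists. It is not PostgreSQL source: every name prefixed with demo_ or Demo, the fixed-size arrays, the two-field node struct, and the files_deleted counter are assumptions made for the example. The physical delete (the redefined mdunlink in the proposal) is reduced to a counter, and file creation and smgrclose() are elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for RelFileNode; the field names are assumptions. */
typedef struct
{
    unsigned tblNode;
    unsigned relNode;
} DemoRelFileNode;

#define DEMO_MAX_PENDING 64

typedef struct
{
    DemoRelFileNode items[DEMO_MAX_PENDING];
    size_t          count;
} DemoPendingList;

static DemoPendingList pending_creates; /* rels created in the current xact */
static DemoPendingList pending_unlinks; /* rels to be deleted at commit */
static size_t files_deleted;            /* counts simulated physical deletes */

/* Return the index of node in list, or -1 if it is not present. */
static int
list_find(const DemoPendingList *list, DemoRelFileNode node)
{
    for (size_t i = 0; i < list->count; i++)
    {
        if (list->items[i].tblNode == node.tblNode &&
            list->items[i].relNode == node.relNode)
            return (int) i;
    }
    return -1;
}

static void
list_add(DemoPendingList *list, DemoRelFileNode node)
{
    assert(list->count < DEMO_MAX_PENDING);
    list->items[list->count++] = node;
}

/* Hypothetical stand-in for the physical delete; it only counts calls. */
static void
demo_delete_file(DemoRelFileNode node)
{
    (void) node;
    files_deleted++;
}

/* Step 1: (file creation elided) remember the rel as created in this xact. */
void
demo_smgrcreate(DemoRelFileNode node)
{
    list_add(&pending_creates, node);
}

/* Step 2: defer the delete unless the rel was created in this same xact. */
void
demo_smgrunlink(DemoRelFileNode node)
{
    int idx = list_find(&pending_creates, node);

    if (idx >= 0)
    {
        /* Created and dropped in one xact: forget it and delete right away. */
        pending_creates.items[idx] =
            pending_creates.items[--pending_creates.count];
        demo_delete_file(node);
    }
    else
        list_add(&pending_unlinks, node);
}

/* Delete every file on the doomed list, then discard both lists. */
static void
demo_finish_xact(DemoPendingList *doomed)
{
    for (size_t i = 0; i < doomed->count; i++)
        demo_delete_file(doomed->items[i]);
    pending_creates.count = 0;
    pending_unlinks.count = 0;
}

/* Step 3: commit makes the deferred unlinks real. */
void
demo_smgrcommit(void)
{
    demo_finish_xact(&pending_unlinks);
}

/* Step 4: abort removes the files created by the failed xact. */
void
demo_smgrabort(void)
{
    demo_finish_xact(&pending_creates);
}
```

In this sketch a deferred unlink discards nothing until demo_smgrcommit() runs, which is also the point where the proposal would call ReleaseRelationBuffers for the doomed rel.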
This way, dirty buffers for a doomed relation will be allowed to live until transaction commit, in the hopes that we will be able to discard them unwritten.

Will remove notices in DROP TABLE etc. warning that these operations are not rollbackable. Note that CREATE/DROP DATABASE is still not rollback-able, and so those two ops will continue to elog(ERROR) when called in a transaction block. Ditto for VACUUM; probably also ditto for REINDEX, though I haven't looked closely at that yet.

The temp table name mapper will need to be modified so that it can undo all current-xact changes to its name mapping list at xact abort. Currently I think it only handles undoing additions, not deletions/renames. This does not need to be WAL-aware, does it?

WAL:

AFAICS, things will behave properly if calls to smgrcreate/smgrunlink are logged as WAL events. For redo, they are executed just the same as normal, except they shouldn't complain if the target file already exists (or already doesn't exist, for unlink). Undo of smgrcreate is just immediate mdunlink; undo of smgrunlink is a no-op.

I have not studied the WAL code enough to be prepared to add the logging/undo/redo code, and it looks like you haven't implemented that anyway yet for smgr.c, so I will leave that part to you, OK?

regards, tom lane
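The redo rule in the WAL paragraph (do not complain if the file already exists on create, or is already gone on unlink) can be sketched with plain POSIX calls. This is only an illustration under assumptions: demo_create_file, demo_unlink_file, and the in_redo flag are hypothetical names, not smgr.c or md.c entry points, and real code would report errors through elog rather than a return value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create the file at path. Returns 0 on success and -1 on failure.
 * During redo, finding the file already present counts as success,
 * because the create may have reached disk before the crash.
 */
int
demo_create_file(const char *path, bool in_redo)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
    {
        if (in_redo && errno == EEXIST)
            return 0;
        return -1;
    }
    close(fd);
    return 0;
}

/*
 * Delete the file at path. Returns 0 on success and -1 on failure.
 * During redo, finding the file already gone counts as success.
 */
int
demo_unlink_file(const char *path, bool in_redo)
{
    assert(path != NULL);
    if (unlink(path) < 0)
    {
        if (in_redo && errno == ENOENT)
            return 0;
        return -1;
    }
    return 0;
}
```

With in_redo false both functions keep their normal strictness, so only replay tolerates a file that is already in the target state.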