Re: making an unlogged table logged - Mailing list pgsql-hackers
From | Rob Wultsch |
---|---|
Subject | Re: making an unlogged table logged |
Date | |
Msg-id | AANLkTi=V6F3La5fyqFbmCcU8r4Yi4j7W8=ex5=ucctCC@mail.gmail.com Whole thread Raw |
In response to | making an unlogged table logged (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: making an unlogged table logged
Re: making an unlogged table logged |
List | pgsql-hackers |
On Tue, Jan 4, 2011 at 7:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Somebody asked about this on Depesz's blog today, and I think it's > come up here before too, so I thought it might be worth my writing up > a few comments on this. I don't think I'm going to have time to work > on this any time soon, but if someone else wants to work up a patch, > I'm game to review. I think it'd clearly be a good feature. > > Generally, to do this, it would be necessary to do the following > things (plus anything I'm forgetting): > > 1. Take an AccessExclusiveLock on the target table. You might think > that concurrent selects could be allowed, but I believe that's not the > case. Read on. > > 2. Verify that there are no foreign keys referencing other unlogged > tables, because if that were the case then after the change we'd have > a permanent table referencing an unlogged table, which would violate > referential integrity. (Note that unlogged referencing permanent is > OK, but permanent referencing unlogged is a no-no, so what matters > when upgrading is "outbound" foreign keys.) > > 3. Write out all shared buffers for the target table, and drop them. > This ensures that there are no buffers floating around for the target > relation that are marked BM_UNLOGGED, which would be a bad thing. Or > maybe it's possible to just clear the BM_UNLOGGED flag, instead of > dropping them. This is the step that makes me think we need an access > exclusive lock - otherwise, somebody else might read in a buffer and, > seeing that the relation is unlogged (which is true, since we haven't > committed yet), mark it BM_UNLOGGED. > > 4. fsync() any segments of the target relation - of any fork except > that init fork - that might have dirty pages not on disk. > > 5. Arrange for the appropriate file deletions at commit or abort, by > updating pendingDeletes. On commit, we want to delete the init fork > for the table and all its indexes. On abort, we want to delete > everything else, but only for pretend; that is, the abort record > should reflect the deletions since they'll need to happen on any > standbys, but we shouldn't actually perform them on the master since > we don't want to obliterate the contents of the table for no reason. > There's a subtle problem here I'm not quite sure how to deal with: > what happens if we *crash* without writing an abort record? It seems > like that could leave a stray file around on a standby, because the > current code only cleans things up on the standby at the start of > recovery; to make this bullet-proof, I suppose it'd need to repeat > that every time a crash happens on the master, but I don't know how to > do that. Note also that if wal_level is minimal, then we need only > worry about the commit case; the abort case can be a no-op. > > 6. If wal_level != minimal, XLOG every page of every fork except the > init fork, for both the table and the associated indexes. (Note that > this step also requires an AccessExclusiveLock rather than some weaker > lock, because of the arbitrary rule that only AccessExclusiveLocks are > sent to standbys. If we held only ShareRowExclusiveLock on the > master, for example, a Hot Standby backend might read the table while > it's only been half-copied.) > > 7. Update pg_class.relpersistence from 'u' to 'p', for both the table > and the associated indexes. > > Going the other direction ought to be possible too, although it seems > somewhat less useful. For that, you'd need to flip around the check > in step #2 (i.e. check for a reference FROM a permanent table), > perform step #3, skip step #4, do step #5 backwards (create and log > init forks, arranging for them to be removed on abort - this too has > an issue with crashes that don't leave abort records behind); and > arrange for the rest of the forks to be removed on commit on any > standby without doing it on the master), skip step #6, and do step #7 > backwards. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company A couple thoughts: 1. Could the making a table logged be a non-exclusive lock if the ALTER is allowed to take a full checkpoint? 2. Unlogged to logged has giant use case. 3. In MySQL I have had to ALTER tables to engine BLACKHOLE because they held data that was not vital, but the server was out of IO. Going logged -> unlogged has a significant placed, I think. -- Rob Wultsch wultsch@gmail.com
pgsql-hackers by date: