Re: update on global temporary and unlogged tables - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: update on global temporary and unlogged tables |
Date | |
Msg-id | AANLkTi=jDv=5FE_KpUysNQrCR687h2ei0a-E0MU_ZLrS@mail.gmail.com |
In response to | update on global temporary and unlogged tables (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: update on global temporary and unlogged tables |
List | pgsql-hackers |
On Mon, Sep 6, 2010 at 10:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 3. With respect to unlogged tables, the major obstacle seems to be
> figuring out a way for these to get automatically truncated at startup
> time. As with temporary table cleanup in general, the problem here is
> that you can't do the obvious thing of iterating through pg_class and
> truncating each unlogged table you find without greatly complicating
> the startup sequence. However, I think there's a fairly easy way
> around this problem: truncating a table basically means removing all
> segments and relation forks other than the first segment of the main
> fork, and truncating that one to zero length. So we could do it the
> same way we clean up temporary files - namely, based on the file name
> - if we made the filenames for unlogged tables distinguishable from
> those for regular and temporary tables. What I'm thinking about is
> reserving a backend ID of -2 for this purpose via some suitable
> constant definition, just as -1 (InvalidBackendId) represents a
> permanent table in this context.

I tried this approach and got fairly far with it, but ran into a snag in the buffer manager. It's fairly obvious that the buffer manager has to know whether a particular buffer is from an unlogged relation or not; for example, FlushBuffer() should skip the XLOG flush for an unlogged buffer, and must pass the correct backend ID to smgropen(). So my first thought was just to define a bit BM_IS_UNLOGGED, with the obvious interpretation. That's not quite good enough, though, because GetNewRelFileNode() doesn't guarantee that the OID chosen is absolutely unique; it just guarantees that it's unique within the space defined by the database ID and backend ID. So it's possible that you could have a logged relation and an unlogged relation with the same value for pg_class.relfilenode, which means that the buffer manager can't store the unlogged status as a random bit someplace, but actually needs to have it as part of the buffer tag (otherwise, a buffer descriptor hash table lookup might find the wrong buffer).

You could maybe work around this problem by having GetNewRelFileNode(), when generating an OID for either a permanent or unlogged relation, check that the OID isn't in use for either of those things already, but that requires an extra system call, so it doesn't seem ideal. I'd be willing to go that route if people think it's cheap enough and more desirable for some reason, though.

So I went looking for bit-space in the buffer tag and quickly found some. ForkNumber is an enum which I suppose means a 32-bit integer, but we've only got three forks right now and it's hard to imagine more than a handful of additional ones, so what I'm tempted to do is change this from an enum to a 2-byte integer and replace the enum values with #defines. That frees up 2 bytes in the buffer tag which is more than plenty.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
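For illustration only, here is a rough sketch of what the buffer tag idea above might look like. This is not the actual PostgreSQL definition: every `Sketch`/`SKETCH_` name is made up for this example, the field layout is an assumption, and the OID values used with it are arbitrary. The point is just that once the fork number shrinks to 2 bytes, the 2 bytes freed up can carry the backend ID (-1 for permanent, -2 for unlogged, as proposed), so a logged and an unlogged relation that happen to share a relfilenode no longer produce the same tag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an OID; not PostgreSQL's own typedef. */
typedef uint32_t SketchOid;

/* Backend ID values as described in the message: -1 marks a permanent
 * relation, -2 is the value proposed for unlogged relations. */
#define SKETCH_PERMANENT_BACKEND (-1)
#define SKETCH_UNLOGGED_BACKEND  (-2)

/* Fork numbers as #defines instead of an enum (three forks, per the text). */
#define SKETCH_MAIN_FORKNUM 0
#define SKETCH_FSM_FORKNUM  1
#define SKETCH_VM_FORKNUM   2

/* Assumed layout: three OIDs, a 2-byte fork number, a 2-byte backend ID
 * in the space freed by narrowing the fork number, then the block number. */
typedef struct SketchBufferTag
{
    SketchOid spcNode;   /* tablespace */
    SketchOid dbNode;    /* database */
    SketchOid relNode;   /* relfilenode */
    int16_t   forkNum;   /* narrowed from a 4-byte enum */
    int16_t   backend;   /* distinguishes permanent from unlogged */
    uint32_t  blockNum;  /* block within the fork */
} SketchBufferTag;

/* Build a tag from its components. */
static inline SketchBufferTag
sketch_make_tag(SketchOid spc, SketchOid db, SketchOid rel,
                int16_t backend, int16_t fork, uint32_t block)
{
    SketchBufferTag tag = { spc, db, rel, fork, backend, block };
    return tag;
}

/* Field-by-field comparison, so the backend ID takes part in lookups. */
static inline bool
sketch_tags_equal(const SketchBufferTag *a, const SketchBufferTag *b)
{
    return a->spcNode == b->spcNode &&
           a->dbNode == b->dbNode &&
           a->relNode == b->relNode &&
           a->forkNum == b->forkNum &&
           a->backend == b->backend &&
           a->blockNum == b->blockNum;
}
```

With this layout, two tags built from the same tablespace, database, relfilenode, fork and block but with `SKETCH_PERMANENT_BACKEND` versus `SKETCH_UNLOGGED_BACKEND` compare as different, which is the property the hash table lookup needs. On common ABIs the struct should also stay the same size as a version with a 4-byte fork number, though that depends on the compiler's padding rules.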