Re: Big 7.1 open items - Mailing list pgsql-hackers
From | Don Baccus |
---|---|
Subject | Re: Big 7.1 open items |
Date | |
Msg-id | 3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com |
In response to | Re: Big 7.1 open items (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Big 7.1 open items
Re: Big 7.1 open items |
List | pgsql-hackers |
At 11:46 AM 6/16/00 -0400, Tom Lane wrote:

>OK, to get back to the point here: so in Oracle, tables can't cross
>tablespace boundaries,

Right, the construct AFAIK is "create table/index foo on tablespace ..."

> but a tablespace itself could span multiple
>disks?

Right.

>Not sure if I like that better or worse than equating a tablespace
>with a directory (so, presumably, all the files within it live on
>one filesystem) and then trying to make tables able to span
>tablespaces. We will need to do one or the other though, if we want
>to have any significant improvement over the current state of affairs
>for large tables.

Oracle's way does a reasonable job of isolating the datamodel from the
details of the physical layout.

Take the OpenACS web toolkit, for instance. We could take each module's
tables and indices and assign them appropriately to various tablespaces,
then provide a separate .sql file containing only "create tablespace"
statements. By modifying that one central file, the toolkit installation
could be customized to run anything from a small site (one disk with
everything on it, a la my own personal webserver at birdnotes.net) to a
very large site with many spindles, with various index and table
structures spread out widely hither and thither.

Given that the OpenACS datamodel is nearly 10K lines long (including
many comments, of course), being able to customize an installation to
such a degree by modifying a single file filled with "create tablespace"
statements would be very attractive.

>One way is to play the flip-the-path-ordering game some more,
>and access multiple-segment tables with pathnames like this:
>
>    .../TABLESPACE/RELATION      -- first or only segment
>    .../TABLESPACE/N/RELATION    -- N'th extension segment
>
>This isn't any harder for md.c to deal with than what we do now,
>but by making the /N subdirectories be symlinks, the dbadmin could
>easily arrange for extension segments to go on different filesystems.
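As an illustration of the quoted pathname scheme, here is a minimal sketch of how a block number might be mapped to a segment file. It is not taken from md.c: the function name `segment_path`, the 8 kB page size, and the 1 GB segment size are all assumptions made for the example, and the table offset is expressed as a block number.

```python
from pathlib import PurePosixPath

# Assumed values for illustration only; not taken from the thread.
BLOCK_SIZE = 8192            # bytes per page
BLOCKS_PER_SEGMENT = 131072  # 1 GB segments of 8 kB pages


def segment_path(base, tablespace, relation, block_number):
    """Return (file path, byte offset in that file) for one block.

    Segment 0 lives at   base/TABLESPACE/RELATION
    Segment N >= 1 at    base/TABLESPACE/N/RELATION
    """
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    root = PurePosixPath(base) / str(tablespace)
    if segment == 0:
        path = root / str(relation)
    else:
        # The N directory could be a symlink to another filesystem.
        path = root / str(segment) / str(relation)
    return str(path), block_in_segment * BLOCK_SIZE
```

Only the tablespace, the relation, and the block number are needed to find the file, which is the property the quoted proposal is after.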
I personally dislike depending on symlinks to move stuff around.
Among other things, a pg_dump/restore (and presumably future
backup tools?) can't recreate the disk layout automatically.

>We'd still want to create some tools to help the dbadmin with slinging
>all these symlinks around, of course.

OK, if symlinks are simply an implementation detail hidden from the
dbadmin, and if the physical structure is kept in the db so it can
be rebuilt automatically if necessary, then I don't mind symlinks.

> But I think it's critical to keep
>the low-level file access protocol simple and reliable, which really
>means minimizing the amount of information the backend needs to know to
>figure out which file to write a page in. With something like the above
>you only need to know the tablespace name (or more likely OID), the
>relation OID (+name or not, depending on outcome of other argument),
>and the offset in the table. No worse than now from the software's
>point of view.

Make the code that creates and otherwise manipulates tablespaces do the
work, while keeping the low-level file access protocol simple.

Yes, this approach sounds very good to me.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
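The point about keeping the physical structure in the database so it can be rebuilt could look roughly like the sketch below. Everything here is hypothetical: the `layout` mapping stands in for whatever catalog would record where each extension segment lives, and `plan_symlinks` is an invented name. The function only computes which links a restore tool would need to create; it does not touch the filesystem.

```python
from pathlib import PurePosixPath


def plan_symlinks(base, layout):
    """Return sorted (link path, target directory) pairs.

    layout is a hypothetical record of the physical structure:
    it maps (tablespace, segment_number) to the directory that
    should hold that extension segment.
    """
    plan = []
    for (tablespace, segment), target in layout.items():
        if segment < 1:
            # Segment 0 sits directly in the tablespace directory.
            raise ValueError("only extension segments (N >= 1) are linked")
        link = PurePosixPath(base) / str(tablespace) / str(segment)
        plan.append((str(link), target))
    return sorted(plan)
```

A restore tool could walk this plan and create each link, so the dbadmin never has to manage the symlinks by hand.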