Tablespaces - Mailing list pgsql-hackers
From | Gavin Sherry |
---|---|
Subject | Tablespaces |
Date | |
Msg-id | Pine.LNX.4.58.0402261942440.16702@linuxworld.com.au Whole thread Raw |
Responses |
Re: Tablespaces
Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces Re: Tablespaces |
List | pgsql-hackers |
Hi all, I've been looking at implementing table spaces for 7.5. Some notes and implementation details follow. ------ Type of table space: There are many different table space implementations in relational database management systems. In my implementation, a table space in PostgreSQL will be the location of a directory on the file system in which files backing database objects can be stored. Global tables and non 'data' objects (WAL, CLOG, config files) will all remain in $PGDATA. $PGDATA/base will be the default table space. A given table space will be identified by a unique table space name. I haven't decided if 'unique' should mean database-wide unique or cross-database unique. It seems to me that we might run into problems with CREATE DATABASE ... TEMPLATE = <database with table spaces> if the uniqueness of table spaces is limited to the database level. A table space parameter will be added to DDL commands which create physical database objects (CREATE DATABASE/INDEX/TABLE/SEQUENCE) and to CREATE SCHEMA. The associated routines, as well as the corresponding DROP commands will need to be updated. Adding the ability to ALTER <object> TABLESPACE <name> seems a little painful. Would people use it? Comments? When an object is created the system will resolve the table space the object is stored in as follows: if the table space paramater is passed to the DDL command, then the object is stored in that table space (given validation of the table space, etc). If it is not passed, the object inherits its "parent's" table space where the parent/child hierarchy is as follows: database > schema > table > [index|sequence]. So, if you issued: create table foo.bar (...); We would first not that there is no TABLESPACE <name>, then cascade to the table space for the schema 'foo' (and possibly cascade to the table space for the database). A database which wasn't created with an explicit table space will be created under the default table space. This ensures backward compatibility. Creating a table space: A table space is a directory structure. The directory structure is as follows: [swm@dev /path/to/tblspc]$ ls OID1/ OID2/ OID1 and OID2 are the OIDs of databases which have created a table space against this file system location. In this respect, a table space resembles $PGDATA/base. I thought it useful to keep this kind of namespace mechanism in place so that administrators do not need to create hierarchies of names on different partitions if they want multiple databases to use the same partition. The actual creation of the table space will be done with: CREATE TABLE SPACE <name> LOCATION </path/to/tblspc>; Before creating the table space we must: 1) Check if the directory exists. If it does, create a sub directory as the OID of the current database. 2) Alternatively, if the directory doesn't exist, attempt to create it, then the sub directory. I wonder if a file, such as PG_TBLSPC, should be added to the table space directory so that, in the case of an existing non-empty directory, we can attempt to test if the directory is being used for something else and error out. Seems like: CREATE TABLESPACE tbl1 LOCATION '/var/' which will result in something like '/var/123443' is a bad idea. Then again, the user should know better. Comments? If everything goes well, we add an entry to pg_tablespace with the table space location and name (and and OID). Tying it all together: The catalogs pg_database, pg_namespace, and pg_class will have a 'tblspc' field. This will be the OID of the table space the object resides in, or 0 (default table space). Since we can then resolve relid/relname, schema and database to a tablespace, there aren't too many cases when extra logic needs to be added to the IO framework. In fact, most of it is taken care of because of the abstraction of relpath(). The creation of table spaces will need to be recorded in xlog in the same way that files are in heap_create() with the corresponding delete logic incase of ABORT. Postmaster startup: Ideally, the postmaster at startup should go into each tblspc/databaseoid directory and check for a postmaster.pid file to see if some other instance is touching the files we're interested in. This will require a control file listing tblspc/databaseoid paths and it will need to plug into WAL in case we die during CREATE TABLESPACE. Comments? Creating a database I think that createdb() is going to have to be reworked if pg_tablespace isn't shared (ie, tablespaces are only database unique). The reason being that if we create a database which has a table space, pg_tablespace in the new database will have to be updated and that cannot be done atomically with the `cp` based mechanism we currently use. I think I'm going to have to get my hands dirty before I can tell the extent to which createdb() will need reworking. pg_dump Obviously pg_dump will need to be able to dump table spaces. pg_dump running against <7.5 will DDL commands without a table space parameter and as such the database's physical layout, when loaded into 7.5, will be the same as for <7.5. --- Comments? Questions? Suggestions? Thanks, Gavin
pgsql-hackers by date: