Thread: ALTER TABLE MODIFY COLUMN
I was looking at how hard it would be to support altering column types and it seems to me that it would be trivial to support changing nullability, increasing the maximum length of the VARCHAR data type and increasing the precision or scale of the DECIMAL / NUMERIC data type. Oracle allows you to update a column to null and then modify its data type to any other type. This is easy because it stores all columns in a variable length format and a null looks the same regardless of type. I understand that with the current heap tuple format described in backend/access/common.c that changing the type of any fixed length attribute requires updating every row. Surely if we have an write exclusive table lock we can rewrite tuples in place rather than creating new versions with its corresponding 2x space requirement. We could presumably do the following to change a column data type: Preconditions: 1. Type conversion is possible from old type to new type and either a) old type is unconditionally convertible(e.g. length/precision increasing) b) read locked scan of table reveals that all values are convertible2. Exclusive write lock on table if(new and old types are variable length and are binary compatible){ 1. Change type in catalog 2. Done} else{ 1. Visit all current tuples and rewrite in place, converting attribute value to new type, and shifting all other attributes and null bitmask appropriately 2. Change type in catalog 3. Done} Does this sound reasonable? Also, is anyone working on ALTER TABLE DROP COLUMN right now? Speaking of which, couldn't we make it so that UPDATES and DELETES running under an exclusive table lock do an inline vacuum? - Mark Butler
Mark Butler wrote: > > I was looking at how hard it would be to support altering column types and it > seems to me that it would be trivial to support changing nullability, Yes. The problem is how to formulate 'DROP CONSTRAINT' feature. > increasing the maximum length of the VARCHAR data type and increasing the > precision or scale of the DECIMAL / NUMERIC data type. > Yes. The problem is how PostgreSQL could recognize the fact. [snip] > I understand that with the current heap tuple format described in > backend/access/common.c that changing the type of any fixed length attribute > requires updating every row. > > Surely if we have an write exclusive table lock we can rewrite tuples in place > rather than creating new versions with its corresponding 2x space requirement. > PostgreSQL has a no overwrite storage manager, so this seems to have little advantage. We now have a mechanism to replace existent relation files safely. We could avoid running VACUUM, multiple version of tuples in a file etc ... regards, Hiroshi Inoue
Mark Butler <butlerm@middle.net> writes: > Surely if we have an write exclusive table lock we can rewrite tuples > in place rather than creating new versions with its corresponding 2x > space requirement. Nyet. Consider transaction rollback. regards, tom lane
Tom Lane wrote: > > Mark Butler <butlerm@middle.net> writes: > > Surely if we have an write exclusive table lock we can rewrite tuples > > in place rather than creating new versions with its corresponding 2x > > space requirement. > > Nyet. Consider transaction rollback. Well, the first thing to consider would be to make this type of DDL operation un-abortable. If the database goes down while the table modification is in progress, the recovery process could continue the operation to completion before releasing the table for general access. The problem with the standard alternatives is that they waste space and are slow: Alt 1. create new version of tuples in new format like DROP COLUMN proposal Alt 2. rename column; add new column; copy column; drop (hide) old column Alt 3. rename indices; rename table; copy table; recreate indices; Now this probably only makes a difference in a data warehouse environment, where the speed of mass load / update operations is much more important than being able to roll them back. I suppose there are two really radical alternatives as well: Radical Alt 1: Use format versioning to allow multiple row formats to co-exist, lazy update to latest format Radical Alt 2: Delta compress different versions of the same row on the same page I can see that the conventional alternatives make sense for now, however. - Mark Butler