Re: ALTER TABLE DROP COLUMN - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: ALTER TABLE DROP COLUMN |
Date | |
Msg-id | 9500.960740562@sss.pgh.pa.us Whole thread Raw |
In response to | RE: ALTER TABLE DROP COLUMN ("Hiroshi Inoue" <Inoue@tpf.co.jp>) |
Responses |
Re: ALTER TABLE DROP COLUMN
RE: ALTER TABLE DROP COLUMN |
List | pgsql-hackers |
>> Seems we have 4 DROP COLUMN ideas: >> Method Advantage >> ----------------------------------------------------------------- >> 1 invisible column marked by negative attnum fast >> 2 invisible column marked by is_dropped column fast >> 3 make copy of table without column col removed >> 4 make new tuples in existing table without column col removed Bruce and I talked about this by phone yesterday, and we realized that none of these are very satisfactory. #1 and #2 both have the flaw that applications that examine pg_attribute will probably break: they will see a sequence of attnum values with gaps in it. And what should the rel's relnatts field be set to? #3 and #4 are better on that point, but they leave us with the problem of renumbering references to columns after the dropped one in constraints, rules, PL functions, etc. Furthermore, there is a closely related problem that none of these approaches give us much help on: recursive ALTER TABLE ADD COLUMN. Right now, ADD puts the new column at the end of each table it's added to, which often means that it gets a different column number in child tables than in parent tables. That leads to havoc for pg_dump. I think the only clean solution is to create a clear distinction between physical and logical column numbers. Each pg_attribute tuple would need two attnum fields, and pg_class would need two relnatts fields as well. A column once created would never change its physical column number, but its logical column number might change as a consequence of adding or dropping columns before it. ADD COLUMN would ensure that a column added to child tables receives the same logical column number as it has in the parent table, thus solving the dump/reload problem. DROP COLUMN would assign an invalid logical column number to dropped columns. They could be numbered zero except that we'd probably still want a unique index on attrelid+attnum, and the index would complain. I'd suggest using Hiroshi's idea: give a dropped column a logical attnum equal to -(physical_attnum + offset). With this approach, internal operations on tuples would all use physical column numbers, but operations that interface to the outside world would present a view of only the valid logical columns. For example, the parser would only allow logical columns to be referenced by name; "SELECT *" would expand to valid logical columns in logical- column-number order; COPY would send or receive valid logical columns in logical-column-number order; etc. Stored rules and so forth probably should store physical column numbers so that they need not be modified during column add/drop. This would require looking at all the places in the backend to determine whether they should be working with logical or physical column numbers, but the design is such that most all places would want to be using physical numbers, so I don't think it'd be too painful. Although I'd prefer to give the replacement columns two new names (eg, "attlnum" and "attpnum") to ensure we find all uses, this would surely break applications that examine pg_attribute. For compatibility we'd have to recycle "attnum" and "relnatts" to indicate logical column number and logical column count, while adding new fields (say "attpnum" and "relnpatts") for the physical number and count. Comments? regards, tom lane
pgsql-hackers by date: