Thread: DROP COLUMN Progress
OK, This is the problem I'm having with the DROP COLUMN implementation. Since I've already incorporated all of Hiroshi's changes, I think this may have been an issue in his trial implementation as well. I have attached my current patch, which works fine and compiles properly. Ok, if you drop a column 'b', then all these work properly: select * from tab; select tab.* from tab; select b from tab; update tab set b = 3; select * from tab where b = 3; insert into tab (b) values (3); That's all good. However, the issue is that one of the things that happens when you drop a column is that the column is renamed to 'dropped_%attnum%'. So, say the 'b' column is renamed to 'dropped_2', then you can do this: select dropped_2 from tab; select tab.dropped_2 from tab; update tab set dropped_2 = 3; select * from tab where dropped_2 = 3; Where have I missed the COLUMN_IS_DROPPED checks??? Another thing: I don't want to name dropped columns 'dropped_...' as I think that's unfair on our non-English speaking users. Should I just use 'xxxx' or something? Thanks for any help, Chris
Attachment
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > So, say the 'b' column is renamed to 'dropped_2', then you can do this: > select dropped_2 from tab; > select tab.dropped_2 from tab; > update tab set dropped_2 = 3; > select * from tab where dropped_2 = 3; > Where have I missed the COLUMN_IS_DROPPED checks??? Sounds like you aren't checking in the part of the parser that resolves simple variable references. > Another thing: I don't want to name dropped columns 'dropped_...' as I > think that's unfair on our non-English speaking users. Should I just use > 'xxxx' or something? Don't be silly --- the system catalogs are completely English-centric already. Do you want to change all the catalog and column names to meaningless strings? Since the dropped columns should be invisible to anyone who's not poking at the catalogs, I don't see that we are adding any cognitive load ... regards, tom lane
Christopher Kings-Lynne wrote: > OK, > > This is the problem I'm having with the DROP COLUMN implementation. Since > I've already incorporated all of Hiroshi's changes, I think this may have > been an issue in his trial implementation as well. > > I have attached my current patch, which works fine and compiles properly. > > Ok, if you drop a column 'b', then all these work properly: > > select * from tab; > select tab.* from tab; > select b from tab; > update tab set b = 3; > select * from tab where b = 3; > insert into tab (b) values (3); > > That's all good. However, the issue is that one of the things that happens > when you drop a column is that the column is renamed to 'dropped_%attnum%'. > So, say the 'b' column is renamed to 'dropped_2', then you can do this: > > select dropped_2 from tab; > select tab.dropped_2 from tab; > update tab set dropped_2 = 3; > select * from tab where dropped_2 = 3; > > Where have I missed the COLUMN_IS_DROPPED checks??? OK, my guess is that it is checks in parser/. I would issue each of these queries with a non-existant column name, find the error message in the code, and add an isdropped check in those places. > Another thing: I don't want to name dropped columns 'dropped_...' as I > think that's unfair on our non-English speaking users. Should I just use > 'xxxx' or something? I think "dropped" is OK. The SQL is still English, e.g. DROP. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> OK, my guess is that it is checks in parser/. I would issue each of > these queries with a non-existant column name, find the error message in > the code, and add an isdropped check in those places. OK, I think I've narrowed it down to this block of code in scanRTEForColumn in parse_relation.c: /* * Scan the user column names (or aliases) for a match. Complain if * multiple matches. */ foreach(c, rte->eref->colnames) { /* @@ SKIP DROPPED HERE? @@ */ attnum++; if (strcmp(strVal(lfirst(c)), colname) == 0) { if (result) elog(ERROR, "Column reference \"%s\" is ambiguous", colname); result = (Node *) make_var(pstate, rte, attnum); rte->checkForRead= true; } } I'm thinking that I should put a 'SearchSysCacheCopy' where my @@ comment is to retrieve the attribute by name, and then do a check to see if it's dropped. Is that the best/fastest way of doing things? Seems unfortunate to add a another cache lookup in every parsed query... Comments? Chris
Christopher Kings-Lynne wrote: > /* > * Scan the user column names (or aliases) for a match. Complain if > * multiple matches. > */ > foreach(c, rte->eref->colnames) > { > /* @@ SKIP DROPPED HERE? @@ */ > attnum++; > if (strcmp(strVal(lfirst(c)), colname) == 0) > { > if (result) > elog(ERROR, "Column reference \"%s\" is > ambiguous", colname); > result = (Node *) make_var(pstate, rte, attnum); > rte->checkForRead = true; > } > } > > > I'm thinking that I should put a 'SearchSysCacheCopy' where my @@ comment is > to retrieve the attribute by name, and then do a check to see if it's > dropped. Is that the best/fastest way of doing things? Seems unfortunate > to add a another cache lookup in every parsed query... I am still looking but perhaps you could supress dropped columns from getting into eref in the first place. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> > I'm thinking that I should put a 'SearchSysCacheCopy' where my > @@ comment is > > to retrieve the attribute by name, and then do a check to see if it's > > dropped. Is that the best/fastest way of doing things? Seems > unfortunate > > to add a another cache lookup in every parsed query... > > I am still looking but perhaps you could supress dropped columns from > getting into eref in the first place. OK, that's done. I'm working on not allowing dropped columns in UPDATE targets now. Chris
> > I am still looking but perhaps you could supress dropped columns from > > getting into eref in the first place. > > OK, that's done. I'm working on not allowing dropped columns in UPDATE > targets now. OK, I've fixed it so that dropped columns cannot be targetted in an update statement, however now I'm running into this problem: If you issue an INSERT statement without qualifying any field names, ie: INSERT INTO tab VALUES (3); Although it will automatically insert NULL for the dropped columns, it still does cache lookups for the type of the dropped columns, etc. I noticed that when I tried setting the atttypid of the dropped column to (Oid)-1. Where is the bit of code that figures out the list of columns to insert into in an unqualified INSERT statement? I'm thinking that it would be nice if the dropped columns never even make it into the list of target attributes, for performance reasons... Chris
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Christopher Kings-Lynne wrote: >> I'm thinking that I should put a 'SearchSysCacheCopy' where my @@ comment is >> to retrieve the attribute by name, and then do a check to see if it's >> dropped. Is that the best/fastest way of doing things? Seems unfortunate >> to add a another cache lookup in every parsed query... > I am still looking but perhaps you could supress dropped columns from > getting into eref in the first place. That was my first thought also, but then the wrong attnum would be used in the "make_var". Ugh. I think what Chris needs to do is extend the eref data structure so that there can be placeholders for dropped attributes. Perhaps NULLs could be included in the list, and then the code would become like attnum++;if (lfirst(c) && strcmp(strVal(lfirst(c)), colname) == 0) ... This would require changes in quite a number of places :-( but I'm not sure we have much choice. The eref structure really needs to line up with attnums. Another possibility is to enter the dropped attnames in the eref list normally, and do the drop test *after* hitting a match not before. This is still slow, but not as horrendously O(N^2) slow as Chris's original pseudocode. I'm not sure how much work it'd really save though; you might find yourself hitting all the same places to add tests. regards, tom lane
Christopher Kings-Lynne wrote: > > > I am still looking but perhaps you could supress dropped columns from > > > getting into eref in the first place. > > > > OK, that's done. I'm working on not allowing dropped columns in UPDATE > > targets now. > > OK, I've fixed it so that dropped columns cannot be targetted in an update > statement, however now I'm running into this problem: > > If you issue an INSERT statement without qualifying any field names, ie: > > INSERT INTO tab VALUES (3); > > Although it will automatically insert NULL for the dropped columns, it still > does cache lookups for the type of the dropped columns, etc. I noticed that > when I tried setting the atttypid of the dropped column to (Oid)-1. Where > is the bit of code that figures out the list of columns to insert into in an > unqualified INSERT statement? parse_target.c::checkInsertTargets() > > I'm thinking that it would be nice if the dropped columns never even make it > into the list of target attributes, for performance reasons... Yea, just sloppy to have them there. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> That was my first thought also, but then the wrong attnum would be used > in the "make_var". Ugh. I think what Chris needs to do is extend the > eref data structure so that there can be placeholders for dropped > attributes. Perhaps NULLs could be included in the list, and then the > code would become like Hmmm... I don't get it - at the moment I'm preventing them from even getting into the eref and all regression tests pass and every test I try works as well... Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: >> Are you checking access to columns that're to the right of the one >> dropped? > OK, interesting: > test=# create table test (a int4, b int4, c int4, d int4); > CREATE TABLE > test=# insert into test values (1,2,3,4); > INSERT 16588 1 > test=# alter table test drop b; > ALTER TABLE > test=# select * from test; > a | d | d > ---+---+--- > 1 | 3 | 4 > (1 row) What of SELECT a,c,d FROM test I'll bet that doesn't work at all... regards, tom lane
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: >> What of >> SELECT a,c,d FROM test >> I'll bet that doesn't work at all... > Yeah, broken. Damn. Yup. That loop we were just looking at needs to derive the correct attnum when it matches a column name. If you remove deleted columns from the eref list altogether, you get the wrong answer. regards, tom lane
> > test=# create table test (a int4, b int4, c int4, d int4); > > CREATE TABLE > > test=# insert into test values (1,2,3,4); > > INSERT 16588 1 > > test=# alter table test drop b; > > ALTER TABLE > > test=# select * from test; > > a | d | d > > ---+---+--- > > 1 | 3 | 4 > > (1 row) > > What of > > SELECT a,c,d FROM test > > I'll bet that doesn't work at all... Yeah, broken. Damn. test=# SELECT a,c,d FROM test;a | c | d ---+---+---1 | 2 | 3 (1 row) test=# SELECT a,d FROM test;a | d ---+---1 | 3 (1 row) test=# SELECT d,c,a FROM test;d | c | a ---+---+---3 | 2 | 1 (1 row) Chris
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > >> That was my first thought also, but then the wrong attnum would be used > >> in the "make_var". Ugh. I think what Chris needs to do is extend the > >> eref data structure so that there can be placeholders for dropped > >> attributes. Perhaps NULLs could be included in the list, and then the > >> code would become like > > > Hmmm... I don't get it - at the moment I'm preventing them from even > > getting into the eref and all regression tests pass and every test I try > > works as well... > > Are you checking access to columns that're to the right of the one > dropped? OK, interesting: test=# create table test (a int4, b int4, c int4, d int4); CREATE TABLE test=# insert into test values (1,2,3,4); INSERT 16588 1 test=# alter table test drop b; ALTER TABLE test=# select * from test;a | d | d ---+---+---1 | 3 | 4 (1 row) It half works, half doesn't. Sigh - how come these things always turn out harder than I think!? pg_attribute: test=# select attrelid,attname,attisdropped from pg_attribute where attrelid=16586 and attnum > 0;attrelid | attname | attisdropped ----------+-----------+-------------- 16586 | a | f 16586 | dropped_2 | t 16586 | c | f 16586 | d | f (4 rows) Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: >> That was my first thought also, but then the wrong attnum would be used >> in the "make_var". Ugh. I think what Chris needs to do is extend the >> eref data structure so that there can be placeholders for dropped >> attributes. Perhaps NULLs could be included in the list, and then the >> code would become like > Hmmm... I don't get it - at the moment I'm preventing them from even > getting into the eref and all regression tests pass and every test I try > works as well... Are you checking access to columns that're to the right of the one dropped? regards, tom lane