Thread: Re: [HACKERS] Refactoring identifier checks to consistently use strcmp
Re: [HACKERS] Refactoring identifier checks to consistently use strcmp
From
Heikki Linnakangas
Date:
On 04/04/2017 10:13 AM, Daniel Gustafsson wrote: >> On 04 Apr 2017, at 05:52, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> >> Daniel Gustafsson wrote: >>> Testing DefElem options is done with both strcmp() and pg_strcasecmp() a bit >>> mixed. Since the option defnames are all lowercased, either via IDENT, keyword >>> rules or “by hand” with makeString(), using strcmp() is safe (and assumed to be >>> so in quite a lot of places). >>> >>> While it’s not incorrect per se to use pg_strcasecmp(), it has the potential to >>> hide a DefElem created with a mixed-case defname where it in other places is >>> expected to be in lowercase, which may lead to subtle bugs. >>> >>> The attached patch refactors to use strcmp() consistently for option processing >>> in the command code as a pre-emptive belts+suspenders move against such subtle >>> bugs and to make the code more consistent. Also reorders a few checks to have >>> all in the same “format” and removes a comment related to the above. >>> >>> Tested with randomizing case on options in make check (not included in patch). >> >> Does it work correctly in the Turkish locale? > > Yes, running make check with random case defnames under tr_TR works fine. This no longer works: postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict ( TEMPLATE = pg_catalog.simple, "STOPWORds" = english ); ERROR: unrecognized simple dictionary parameter: "STOPWORds" In hindsight, perhaps we should always have been more strict about that to begin wtih, but let's not break backwards-compatibility without a better reason. I didn't thoroughly check all of the cases here, to see if there are more like this. It'd be nice to have some kind of a rule on when pg_strcasecmp should be used and when strcmp() is enough. Currently, by looking at the code, I can't tell. - Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes: > This no longer works: > postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict ( > TEMPLATE = pg_catalog.simple, > "STOPWORds" = english > ); > ERROR: unrecognized simple dictionary parameter: "STOPWORds" > In hindsight, perhaps we should always have been more strict about that > to begin wtih, but let's not break backwards-compatibility without a > better reason. I didn't thoroughly check all of the cases here, to see > if there are more like this. You have a point, but I'm not sure that this is such a bad compatibility break as to be a reason not to change things to be more consistent. > It'd be nice to have some kind of a rule on when pg_strcasecmp should be > used and when strcmp() is enough. Currently, by looking at the code, I > can't tell. My thought is that if we are looking at words that have been through the parser, then it should *always* be plain strcmp; we should expect that the parser already did the appropriate case-folding. If the user prevented case-folding by double-quoting, I don't have a lot of sympathy for any complaints about it. Generally speaking, what we're dealing with here is things that are logically keywords but we did not wish to make them real parser keywords. But in SQL, once you quote a keyword, it's not a keyword at all anymore. So I think the argument that quoting "stopwords" should be legal at all in this context is pretty weak, and the argument that quoting a weirdly-cased version of it should work is even weaker. pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff that somehow came in without going through the parser. regards, tom lane
On Wed, Aug 16, 2017 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > You have a point, but I'm not sure that this is such a bad compatibility > break as to be a reason not to change things to be more consistent. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On 16 Aug 2017, at 17:51, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Heikki Linnakangas <hlinnaka@iki.fi> writes: >> This no longer works: > >> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict ( >> TEMPLATE = pg_catalog.simple, >> "STOPWORds" = english >> ); >> ERROR: unrecognized simple dictionary parameter: "STOPWORds" > >> In hindsight, perhaps we should always have been more strict about that >> to begin wtih, but let's not break backwards-compatibility without a >> better reason. I didn't thoroughly check all of the cases here, to see >> if there are more like this. > > You have a point, but I'm not sure that this is such a bad compatibility > break as to be a reason not to change things to be more consistent. I agree with this, but I admittedly have no idea how common the above case would be in the wild. >> It'd be nice to have some kind of a rule on when pg_strcasecmp should be >> used and when strcmp() is enough. Currently, by looking at the code, I >> can't tell. > > My thought is that if we are looking at words that have been through the > parser, then it should *always* be plain strcmp; we should expect that > the parser already did the appropriate case-folding. +1 > pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff > that somehow came in without going through the parser. In that case it would be up to the consumer of the data to handle required case-folding for the expected input, so pg_strcasecmp or strcmp depending on situation. cheers ./daniel
> On 17 Aug 2017, at 11:08, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 16 Aug 2017, at 17:51, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Heikki Linnakangas <hlinnaka@iki.fi> writes: >>> This no longer works: >> >>> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict ( >>> TEMPLATE = pg_catalog.simple, >>> "STOPWORds" = english >>> ); >>> ERROR: unrecognized simple dictionary parameter: "STOPWORds" >> >>> In hindsight, perhaps we should always have been more strict about that >>> to begin wtih, but let's not break backwards-compatibility without a >>> better reason. I didn't thoroughly check all of the cases here, to see >>> if there are more like this. >> >> You have a point, but I'm not sure that this is such a bad compatibility >> break as to be a reason not to change things to be more consistent. > > I agree with this, but I admittedly have no idea how common the above case > would be in the wild. > >>> It'd be nice to have some kind of a rule on when pg_strcasecmp should be >>> used and when strcmp() is enough. Currently, by looking at the code, I >>> can't tell. >> >> My thought is that if we are looking at words that have been through the >> parser, then it should *always* be plain strcmp; we should expect that >> the parser already did the appropriate case-folding. > > +1 > >> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff >> that somehow came in without going through the parser. > > In that case it would be up to the consumer of the data to handle required > case-folding for the expected input, so pg_strcasecmp or strcmp depending on > situation. This patch has been marked “Waiting on Author”, but I’m not sure what the concensus of this thread came to with regards to quoted keywords and backwards compatibility. There seems to be a 2-1 vote for allowing a break, and forcing all keywords out of the parser to be casefolded. Any other opinions? cheers ./daniel
On Tue, Sep 5, 2017 at 5:34 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 17 Aug 2017, at 11:08, Daniel Gustafsson <daniel@yesql.se> wrote: >>> On 16 Aug 2017, at 17:51, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> My thought is that if we are looking at words that have been through the >>> parser, then it should *always* be plain strcmp; we should expect that >>> the parser already did the appropriate case-folding. >> >> +1 >> >>> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff >>> that somehow came in without going through the parser. >> >> In that case it would be up to the consumer of the data to handle required >> case-folding for the expected input, so pg_strcasecmp or strcmp depending on >> situation. > > This patch has been marked “Waiting on Author”, but I’m not sure what the > concensus of this thread came to with regards to quoted keywords and backwards > compatibility. There seems to be a 2-1 vote for allowing a break, and forcing > all keywords out of the parser to be casefolded. Any other opinions? This patch impacts the DDL grammar of aggregates, operators, collations, text search, views, etc. Still I agree with the purpose of this thread that it would be nice to get a more consistent behavior even if it breaks some queries, so +1 for the argument with the post-parser comparison which should use strcmp. The patch needs a rebase, and there are a couple of places that need an extra lookup I think: $ git grep defname -- *.c | grep strcasecmp | wc -l 39 Switching to "waiting on author" for now. Thanks, -- Michael
> On 17 Nov 2017, at 03:31, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Tue, Sep 5, 2017 at 5:34 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> On 17 Aug 2017, at 11:08, Daniel Gustafsson <daniel@yesql.se> wrote: >>>> On 16 Aug 2017, at 17:51, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> My thought is that if we are looking at words that have been through the >>>> parser, then it should *always* be plain strcmp; we should expect that >>>> the parser already did the appropriate case-folding. >>> >>> +1 >>> >>>> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff >>>> that somehow came in without going through the parser. >>> >>> In that case it would be up to the consumer of the data to handle required >>> case-folding for the expected input, so pg_strcasecmp or strcmp depending on >>> situation. >> >> This patch has been marked “Waiting on Author”, but I’m not sure what the >> concensus of this thread came to with regards to quoted keywords and backwards >> compatibility. There seems to be a 2-1 vote for allowing a break, and forcing >> all keywords out of the parser to be casefolded. Any other opinions? > > This patch impacts the DDL grammar of aggregates, operators, > collations, text search, views, etc. Still I agree with the purpose of > this thread that it would be nice to get a more consistent behavior > even if it breaks some queries, so +1 for the argument with the > post-parser comparison which should use strcmp. Thanks for reviewing! > The patch needs a rebase, and there are a couple of places that need > an extra lookup I think: > $ git grep defname -- *.c | grep strcasecmp | wc -l > 39 Rebased and handled a few more places which I had either missed in the last round, or that had been added in the meantime. “PARALLEL” in aggregatecmds.c is intentionally using pg_strcasecmp() due to the old-style syntax which is still supported. AFAICS this covers all relevant codepaths from the 39 above. cheers ./daniel
Attachment
On Tue, Nov 28, 2017 at 12:11 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >> The patch needs a rebase, and there are a couple of places that need >> an extra lookup I think: >> $ git grep defname -- *.c | grep strcasecmp | wc -l >> 39 > > Rebased and handled a few more places which I had either missed in the last > round, or that had been added in the meantime. “PARALLEL” in aggregatecmds.c > is intentionally using pg_strcasecmp() due to the old-style syntax which is > still supported. This meritates a comment. Code readers may get confused. > AFAICS this covers all relevant codepaths from the 39 above. I was just looking at the tsearch code which uses pg_strcmpcase, and those are defined with makeDefElem() so you should switch to strcmp in this case as well, no? If I patch the code myself I would get an error when double-quoting, making those command more consistent with the rest of what you are patching here: create extension unaccent; alter text search dictionary unaccent (Rules = 'unaccent'); -- ok alter text search dictionary unaccent (RuLes = 'unaccent'); -- ok alter text search dictionary unaccent ("Rules" = 'unaccent'); -- error -- Michael
On Tue, Nov 28, 2017 at 10:07 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > I was just looking at the tsearch code which uses pg_strcmpcase, and > those are defined with makeDefElem() so you should switch to strcmp in > this case as well, no? If I patch the code myself I would get an error > when double-quoting, making those command more consistent with the > rest of what you are patching here: > create extension unaccent; > alter text search dictionary unaccent (Rules = 'unaccent'); -- ok > alter text search dictionary unaccent (RuLes = 'unaccent'); -- ok > alter text search dictionary unaccent ("Rules" = 'unaccent'); -- error Daniel, I am waiting for your input on this one, and you did not have much time to send an update. So I am moving this patch to next CF. -- Michael
> On 29 Nov 2017, at 04:29, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Tue, Nov 28, 2017 at 10:07 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> I was just looking at the tsearch code which uses pg_strcmpcase, and >> those are defined with makeDefElem() so you should switch to strcmp in >> this case as well, no? If I patch the code myself I would get an error >> when double-quoting, making those command more consistent with the >> rest of what you are patching here: >> create extension unaccent; >> alter text search dictionary unaccent (Rules = 'unaccent'); -- ok >> alter text search dictionary unaccent (RuLes = 'unaccent'); -- ok >> alter text search dictionary unaccent ("Rules" = 'unaccent'); -- error > > Daniel, I am waiting for your input on this one, and you did not have > much time to send an update. Sorry for the slow updates, I managed to catch a nasty fever which turned my brain closer to a mushroom soup than I’d like. > So I am moving this patch to next CF. Good move, thanks! cheers ./daniel
On Thu, Nov 30, 2017 at 7:40 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 29 Nov 2017, at 04:29, Michael Paquier <michael.paquier@gmail.com> wrote: >> >> On Tue, Nov 28, 2017 at 10:07 AM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> I was just looking at the tsearch code which uses pg_strcmpcase, and >>> those are defined with makeDefElem() so you should switch to strcmp in >>> this case as well, no? If I patch the code myself I would get an error >>> when double-quoting, making those command more consistent with the >>> rest of what you are patching here: >>> create extension unaccent; >>> alter text search dictionary unaccent (Rules = 'unaccent'); -- ok >>> alter text search dictionary unaccent (RuLes = 'unaccent'); -- ok >>> alter text search dictionary unaccent ("Rules" = 'unaccent'); -- error >> >> Daniel, I am waiting for your input on this one, and you did not have >> much time to send an update. > > Sorry for the slow updates, I managed to catch a nasty fever which turned my > brain closer to a mushroom soup than I’d like. No problems. Take care. -- Michael
> On 28 Nov 2017, at 02:07, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Tue, Nov 28, 2017 at 12:11 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> The patch needs a rebase, and there are a couple of places that need >>> an extra lookup I think: >>> $ git grep defname -- *.c | grep strcasecmp | wc -l >>> 39 >> >> Rebased and handled a few more places which I had either missed in the last >> round, or that had been added in the meantime. “PARALLEL” in aggregatecmds.c >> is intentionally using pg_strcasecmp() due to the old-style syntax which is >> still supported. > > This meritates a comment. Code readers may get confused. Good point, added. I also sent a separate doc patch for this to -docs the other day. >> AFAICS this covers all relevant codepaths from the 39 above. > > I was just looking at the tsearch code which uses pg_strcmpcase, and > those are defined with makeDefElem() so you should switch to strcmp in > this case as well, no? If I patch the code myself I would get an error > when double-quoting, making those command more consistent with the > rest of what you are patching here: > create extension unaccent; > alter text search dictionary unaccent (Rules = 'unaccent'); -- ok > alter text search dictionary unaccent (RuLes = 'unaccent'); -- ok > alter text search dictionary unaccent ("Rules" = 'unaccent'); — error For reasons unknown to me I had avoided poking in contrib/. Attached patch handles the additional defname comparisons in contrib that are applicable. The remainder of the pg_strcasecmp() calls in the text search code are operating on a defelem list created in deserialize_deflist() rather than in the parser, so I opted for keeping that as is rather than casefolding in the list generation. cheers ./daniel
Attachment
On Thu, Nov 30, 2017 at 6:40 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > For reasons unknown to me I had avoided poking in contrib/. Attached patch > handles the additional defname comparisons in contrib that are applicable. I am having a bit of trouble understanding why the first hunk in DefineAggregate() is taking PARALLEL as a special case. The documentation gives three separate synopses for CREATE AGGREGATE, but parallel appears in all of them, and there are other options that do too, so the comment doesn't really help me understand why it's special as compared to other, similar options. I think the changes in DefineView and ATExecSetRelOptions is wrong, because transformRelOptions() is still using pg_strcasecmp. With the patch: rhaas=# create view v(x) with ("Check_option"="local") as select 1; CREATE VIEW rhaas=# create view v(x) with ("check_option"="local") as select 1; ERROR: WITH CHECK OPTION is supported only on automatically updatable views HINT: Views that do not select from a single table or view are not automatically updatable. Oops. Here are, for the record, examples of SQL that will be generate errors or warnings with the patch, but not presently, with a note about which function got changed at the C level to affect the behavior. transformRelOptions/interpretOidsOption: create table a () with ("OiDs"=true); DefineAggregate, second hunk: CREATE AGGREGATE avg (float8) (sfunc = float8_accum, stype = float8[], finalfunc = float8_avg, initcond = '{0,0,0}', parallel = 'sAfe'); DefineCollation: CREATE COLLATION stunk ("LoCaLeS" = "C"); compute_attributes_with_style: create function one() returns int as $$select 1$$ language sql with ("isStrict" = 'true'); DefineOperator: create operator @ (procedure = pg_catalog.int4eq, leftarg = int4, "RIGHTARG" = int4); DefineType: create type awesome (input = int4in, "OuTpUt" = int4out); validateWithCheckOption: create table t(a int, b text, unique(a)); create view x with (check_option = 'loCal') as select * from t; I have not yet managed to figure out what the impact of the contrib changes, or the text search changes in core, is. This is partly a lack of subject matter expertise, but the fact that the functions being modified in contrib have a grand total of 0 lines of comments between them does not help. That's not this patch's fault, nor this patch's job to fix, but it is a barrier to understanding. I think it would be nice to have a complete list of examples of what syntax this patch is affecting. I am actually pretty dubious that we want to do this. I found one bug already (see above), and I don't think there's any guarantee that it's the only one, because it's pretty hard to be sure that none of the remaining calls to pg_strcasecmp are being applied to any of these values in some other part of the code. I'm not sure that the backward compatibility issue is a huge deal, but from my point of view this carries a significant risk of introducing new bugs, might annoy users who spell any of these keywords in all caps with surrounding quotation marks, and really has no clear benefit that I can see. The originally-offered justification is that making this consistent would help us avoid subtle bugs, but it seems at least as likely to CREATE subtle bugs, and the status quo is AFAICT harming nobody. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greetings Michael, Daniel, all, * Michael Paquier (michael.paquier@gmail.com) wrote: > On Fri, Dec 1, 2017 at 4:14 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > I think the changes in DefineView and ATExecSetRelOptions is wrong, > > because transformRelOptions() is still using pg_strcasecmp. With the > > patch: > > > > rhaas=# create view v(x) with ("Check_option"="local") as select 1; > > CREATE VIEW > > rhaas=# create view v(x) with ("check_option"="local") as select 1; > > ERROR: WITH CHECK OPTION is supported only on automatically updatable views > > HINT: Views that do not select from a single table or view are not > > automatically updatable. > > > > Oops. > > I am getting the feeling that there could be other issues than this > one... If this patch ever gets integrated, I think that this should > actually be shaped as two patches: > - One patch with the set of DDL queries taking advantage of the > current grammar with pg_strcasecmp. > - A second patch that does the switch from pg_strcasecmp to strcmp, > and allows checking which paths of patch 1 are getting changed. > Patch 1 is the hard part to figure out all the possible patterns that > could be changed. > > > I am actually pretty dubious that we want to do this. I found one bug > > already (see above), and I don't think there's any guarantee that it's > > the only one, because it's pretty hard to be sure that none of the > > remaining calls to pg_strcasecmp are being applied to any of these > > values in some other part of the code. I'm not sure that the backward > > compatibility issue is a huge deal, but from my point of view this > > carries a significant risk of introducing new bugs, might annoy users > > who spell any of these keywords in all caps with surrounding quotation > > marks, and really has no clear benefit that I can see. The > > originally-offered justification is that making this consistent would > > help us avoid subtle bugs, but it seems at least as likely to CREATE > > subtle bugs, and the status quo is AFAICT harming nobody. > > Changing opinion here ;) > Yes, I agree that the risk of bugs may not be worth the compatibility > effort at the end. I still see value in this patch for long-term > purposes by making the code more consistent though. Looks like this discussion has progressed to where this patch should really be marked as Rejected. Does anyone else want to voice an opinion regarding it, or perhaps Daniel could post a rebuttal to the concerns raised here? Thinking through it, for my own 2c on this, I tend to agree with Tom that, really, we should be using strcmp() for anything coming out of the parser and that the backward-compatibility break from that is acceptable. I also understand Robert's concern that there may be other bugs hiding and I wonder if there might be a way to check for them, though no great ideas spring to mind offhand. Would be great to hear your thoughts, Daniel, so leaving this in Waiting on Author for now. Thanks! Stephen
Attachment
Stephen Frost <sfrost@snowman.net> writes: > Thinking through it, for my own 2c on this, I tend to agree with Tom > that, really, we should be using strcmp() for anything coming out of the > parser and that the backward-compatibility break from that is > acceptable. I also understand Robert's concern that there may be other > bugs hiding and I wonder if there might be a way to check for them, > though no great ideas spring to mind offhand. Would be great to hear > your thoughts, Daniel, so leaving this in Waiting on Author for now. FWIW, I don't especially agree with Robert's position, because I think he is ignoring the argument that it's a bug that some things are case-sensitive and other seemingly similar things are not. It's definitely concerning that the submitted patch introduced a new bug, but we have seldom taken the position that bugs in an initial submission are sufficient grounds for rejecting the entire concept. ISTM that if this patch gets rid of a large fraction of the uses of pg_strcasecmp in the backend, which is what I expect it should, then it might not be out of reach to go through all the surviving ones to make sure they are not processing strings that originate in the parser. regards, tom lane
On Sun, Jan 7, 2018 at 9:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > ISTM that if this patch gets rid of a large fraction of the uses of > pg_strcasecmp in the backend, which is what I expect it should, then > it might not be out of reach to go through all the surviving ones to > make sure they are not processing strings that originate in the parser. Yeah, that's why I think that it is important for this patch to provide as well tests to make sure that all the code paths are working as they should with this patch. -- Michael
On Sat, Jan 6, 2018 at 7:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's definitely concerning that the submitted patch introduced a new bug, > but we have seldom taken the position that bugs in an initial submission > are sufficient grounds for rejecting the entire concept. Fair point. I withdraw my categorical -1 vote and replace it with the statement that the patch hasn't been sufficiently-carefully checked by the patch author or other reviewers for bugs to consider committing it -- nor has anyone taken the trouble to list with precision all of the places where the behavior will change. I think the latter is absolutely indispensable, which is why I started to compile such a list in my previous post. The former needs to be done as well, of course. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On 08 Jan 2018, at 17:27, Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Jan 6, 2018 at 7:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It's definitely concerning that the submitted patch introduced a new bug, >> but we have seldom taken the position that bugs in an initial submission >> are sufficient grounds for rejecting the entire concept. > > Fair point. I withdraw my categorical -1 vote and replace it with the > statement that the patch hasn't been sufficiently-carefully checked by > the patch author or other reviewers for bugs to consider committing it > -- nor has anyone taken the trouble to list with precision all of the > places where the behavior will change. I think the latter is > absolutely indispensable, which is why I started to compile such a > list in my previous post. Sorry for dropping off the radar, my available time for hacking was severely limited (well, down to zero really). Seeing the surface again I’ve started on the complete list and hope to have something quite soon, and I think (as seems to be the consensus on this thread) that that list is a prerequisite for the review. cheers ./daniel
> On 07 Jan 2018, at 01:17, Stephen Frost <sfrost@snowman.net> wrote: > Would be great to hear your thoughts, Daniel, so leaving this in Waiting on > Author for now. I am still of the opinion that it’s worth going through and ensuring that we are matching against parser output in a consistent way, if only to lower the risk of subtle hard to find bugs. This patch is a first stab, but there are more string comparisons to consider should we decide to ahead with this patch (or the general approach in some other fashion than this implementation). Collating the responses so far with an updated patch, thanks to everyone who has reviewed this patch! Sorry being slow to respond, $life hasn’t allowed for much hacking lately. > On 30 Nov 2017, at 20:14, Robert Haas <robertmhaas@gmail.com> wrote: > I am having a bit of trouble understanding why the first hunk in > DefineAggregate() is taking PARALLEL as a special case. The > documentation gives three separate synopses for CREATE AGGREGATE, but > parallel appears in all of them, and there are other options that do > too, so the comment doesn't really help me understand why it's special > as compared to other, similar options. The reason for the special handling of parallel in the old-style CREATE AGGREGATE syntax is that it’s parsed with IDENT rather than ColLabel. AFAICT that works for all parameters except parallel as it’s an unreserved keyword since 7aea8e4f2da. Extending old_aggr_elem to handle PARALLEL separately seems to work for not requiring the quoting, but I may be missing something as the parser hacking isn’t my speciality. The patch has been updated with this (and the documentation + tests changes to go with it) but it clearly needs a close eye. > I think the changes in DefineView and ATExecSetRelOptions is wrong, > because transformRelOptions() is still using pg_strcasecmp. Correct, I had missed that reloptions case, sorry about that. The hunk in AlterTableGetRelOptionsLockLevel() seems correct to me, and testing didn’t break anything, but I wonder if there is a case where it would need pg_strncasecmp? > On 08 Jan 2018, at 17:27, Robert Haas <robertmhaas@gmail.com> wrote: > nor has anyone taken the trouble to list with precision all of the > places where the behavior will change. I think the latter is > absolutely indispensable, I had a look at the available commands in postgres and compiled a list of them in options.sql based on if they have options, and how those options and matched (case sensitive of insensitive). The queries in the file are nonsensical, they are just meant to show the handling of the options. The idea was to illustrate the impact of this proposal by running examples. Running this file with and without the patches applied shows the following commands being affected: CREATE TABLE CREATE TABLE AS ALTER TABLE CREATE TABLESPACE ALTER TABLESPACE CREATE VIEW ALTER VIEW CREATE INDEX ALTER INDEX CREATE AGGREGATE (new and old syntax) CREATE COLLATION CREATE FUNCTION CREATE OPERATOR ALTER OPERATOR CREATE TYPE CREATE TEXT SEARCH CONFIGURATION CREATE TEXT SEARCH DICTIONARY ALTER TEXT SEARCH DICTIONARY The output with the patch is attached as options_patched.out, and the output from master as options_master.out. Diffing the two files is rather helpful I think. All possible options aren’t excercised, and I hope I didn’t miss any statements that should’ve been covered. The options.sql file makes it quite obvious that we currently have quite a mix of case sensitive and insensitive commands. Is this in line with what you were thinking Robert? I’m definitely up for better ideas. One open question from this excercise is how to write a good test for this. It can either be made part of the already existing test queries or a separate suite. I’m leaning on the latter simply because the case-flipping existing tests seems like something that can be cleaned up years from now accidentally because it looks odd. > On 07 Jan 2018, at 01:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: > ISTM that if this patch gets rid of a large fraction of the uses of > pg_strcasecmp in the backend, which is what I expect it should, then > it might not be out of reach to go through all the surviving ones to > make sure they are not processing strings that originate in the parser. I completely agree, but I have not expanded this patch with this. One case was actually instead put back, since I did see a mention in the documentation for isStrict and isCachable they aren’t case sensitive. Instead I removed that section from the documentation in a hope that we some day can make these case sensitive. Additionally, while poking at the commands I noticed an inconsistency in checking for conflicting options in CREATE FUNCTION. The below statement is correctly erroring on IMMUTABLE and VOLATILE being conflicting: create function int42(cstring) returns int42 AS 'int4in' language internal strict immutable volatile; If you however combine the new and old syntax, the statement works and the WITH option wins by overwriting any previous value. The below statement creates an IMMUTABLE function: create function int42(cstring) returns int42 AS 'int4in' language internal strict volatile with (isstrict, iscachable); It’s arguably a pretty silly statement to write, and may very well never have been seen in production, but I would still expect that query to error out. Attached volatility.patch fixes it in a hacky way to behave like the former statement, there is probably a cleaner way but I didn’t want to spend too much time on it before hearing if others think it’s not worth fixing. Thanks for looking at this! cheers ./daniel
Attachment
On Wed, Jan 10, 2018 at 11:49:47PM +0100, Daniel Gustafsson wrote: >> On 08 Jan 2018, at 17:27, Robert Haas <robertmhaas@gmail.com> wrote: >> nor has anyone taken the trouble to list with precision all of the >> places where the behavior will change. I think the latter is >> absolutely indispensable, > > I had a look at the available commands in postgres and compiled a list > of them in options.sql based on if they have options, and how those > options and matched (case sensitive of insensitive). The queries in > the file are nonsensical, they are just meant to show the handling of > the options. The idea was to illustrate the impact of this proposal > by running examples. Running this file with and without the patches > applied shows the following commands being affected: > > <snip> > > The output with the patch is attached as options_patched.out, and the output > from master as options_master.out. Diffing the two files is rather helpful I > think. Thanks. This is saving me hours of lookups and testing during the review, as now reviewers just have to map you test series with the code modified. I can't help to notice that tests for code paths with contrib modules are missing. This brings up the point: do we want those tests to be in the patch? I would like to think that a special section dedicated to option compatibility for each command would be welcome to track which grammar is supported and which grammar is not supported. > All possible options aren’t excercised, and I hope I didn’t miss any > statements that should’ve been covered. The options.sql file makes it > quite obvious that we currently have quite a mix of case sensitive and > insensitive commands. Is this in line with what you were thinking > Robert? I’m definitely up for better ideas. I would think that one option tested in the series is fine to cover grounds. Most of those code paths are made of a series of if/elif using strcmp so all of them should be consistent.. > One open question from this excercise is how to write a good test for > this. It can either be made part of the already existing test queries > or a separate suite. I’m leaning on the latter simply because the > case-flipping existing tests seems like something that can be cleaned > up years from now accidentally because it looks odd. Adding them into src/test/regress/ sounds like a good plan to me. > If you however combine the new and old syntax, the statement works and the WITH > option wins by overwriting any previous value. The below statement creates an > IMMUTABLE function: > > create function int42(cstring) returns int42 AS 'int4in' > language internal strict volatile with (isstrict, iscachable); Here is another idea: nuking isstrict and iscachable from CREATE FUNCTION syntax and forget about them. I would be tempted of the opinion to do that before the rest. -old_aggr_elem: IDENT '=' def_arg +old_aggr_elem: PARALLEL '=' def_arg + { + $$ = makeDefElem("parallel", (Node *)$3, @1); + } + | IDENT '=' def_arg Nit: alphabetical order. I have spent a couple of hours reviewing all the calls to pg_strcasecmp, and the only thing I have noticed is in transformRelOptions(), where the namespace string should be evaluated as well by strcmp, no? On HEAD: =# create table a1 (a text) with ( "Toast"."Autovacuum_vacuum_cost_limit" = 1); CREATE TABLE =# create table a2 (a text) with ( "Toast.Autovacuum_vacuum_cost_limit" = 1); ERROR: 22023: unrecognized parameter "Toast.Autovacuum_vacuum_cost_limit" =# create table a3 (a text) with ( "Toast".autovacuum_vacuum_cost_limit = 1); CREATE TABLE =# create table a4 (a text) with ( toast."Autovacuum_vacuum_cost_limit" = 1); CREATE TABLE With your patch: =# create table a1 (a text) with ( "Toast"."Autovacuum_vacuum_cost_limit" = 1); ERROR: 22023: unrecognized parameter "Autovacuum_vacuum_cost_limit" =# create table a2 (a text) with ( "Toast.Autovacuum_vacuum_cost_limit" = 1); ERROR: 22023: unrecognized parameter "Toast.Autovacuum_vacuum_cost_limit" =# create table a3 (a text) with ( "Toast".autovacuum_vacuum_cost_limit = 1); CREATE TABLE =# create table a4 (a text) with ( toast."Autovacuum_vacuum_cost_limit" = 1); ERROR: 22023: unrecognized parameter "Autovacuum_vacuum_cost_limit" LOCATION: parseRelOptions, reloptions.c:1103 So per my set of examples, the evaluation of the schema name should fail when creating relation a3 with your patch applied. As the patch does things, the experience for the user is not consistent, and the schema name goes through parser via reloption_elem using makeDefElemExtended, so this should use strcmp. -- Michael
Attachment
> On 11 Jan 2018, at 09:01, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Wed, Jan 10, 2018 at 11:49:47PM +0100, Daniel Gustafsson wrote: >>> On 08 Jan 2018, at 17:27, Robert Haas <robertmhaas@gmail.com> wrote: >>> nor has anyone taken the trouble to list with precision all of the >>> places where the behavior will change. I think the latter is >>> absolutely indispensable, >> >> I had a look at the available commands in postgres and compiled a list >> of them in options.sql based on if they have options, and how those >> options and matched (case sensitive of insensitive). The queries in >> the file are nonsensical, they are just meant to show the handling of >> the options. The idea was to illustrate the impact of this proposal >> by running examples. Running this file with and without the patches >> applied shows the following commands being affected: >> >> <snip> >> >> The output with the patch is attached as options_patched.out, and the output >> from master as options_master.out. Diffing the two files is rather helpful I >> think. > > Thanks. This is saving me hours of lookups and testing during the > review, as now reviewers just have to map you test series with the code > modified. Well, being the one proposing the patch I should be the one spending those hours and not you. Sorry for not including in the original submission. > I can't help to notice that tests for code paths with > contrib modules are missing. This brings up the point: do we want those > tests to be in the patch? I left them out since they are version of ALTER TEXT SEARCH .. rather than new statements. > I would like to think that a special section > dedicated to option compatibility for each command would be welcome to > track which grammar is supported and which grammar is not supported. I’m not sure I follow? >> One open question from this excercise is how to write a good test for >> this. It can either be made part of the already existing test queries >> or a separate suite. I’m leaning on the latter simply because the >> case-flipping existing tests seems like something that can be cleaned >> up years from now accidentally because it looks odd. > > Adding them into src/test/regress/ sounds like a good plan to me. If there is interest in this patch now that the list exists and aids review, I can turn the list into a proper test that makes a little more sense than the current list which is rather aimed at helping reviewers. >> If you however combine the new and old syntax, the statement works and the WITH >> option wins by overwriting any previous value. The below statement creates an >> IMMUTABLE function: >> >> create function int42(cstring) returns int42 AS 'int4in' >> language internal strict volatile with (isstrict, iscachable); > > Here is another idea: nuking isstrict and iscachable from CREATE > FUNCTION syntax and forget about them. I would be tempted of the opinion > to do that before the rest. Thats certainly an option, I have no idea about the prevalence in real life production environments to have much an opinion to offer. > > -old_aggr_elem: IDENT '=' def_arg > +old_aggr_elem: PARALLEL '=' def_arg > + { > + $$ = makeDefElem("parallel", (Node *)$3, @1); > + } > + | IDENT '=' def_arg > Nit: alphabetical order. Fixed. > I have spent a couple of hours reviewing all the calls to pg_strcasecmp, > and the only thing I have noticed is in transformRelOptions(), where the > namespace string should be evaluated as well by strcmp, no? <snip> > So per my set of examples, the evaluation of the schema name should fail > when creating relation a3 with your patch applied. As the patch does > things, the experience for the user is not consistent, and the schema > name goes through parser via reloption_elem using makeDefElemExtended, > so this should use strcmp. I believe you are entirely correct, the attached v5 patch is updated with this behavior. The volatility patch is unchanged by this so didn’t submit a new version of that one. Thanks for the review! cheers ./daniel
Attachment
On Fri, Jan 12, 2018 at 11:35:48PM +0100, Daniel Gustafsson wrote: > On 11 Jan 2018, at 09:01, Michael Paquier <michael.paquier@gmail.com> wrote: >> I would like to think that a special section >> dedicated to option compatibility for each command would be welcome to >> track which grammar is supported and which grammar is not supported. > > I’m not sure I follow? > >>> One open question from this excercise is how to write a good test for >>> this. It can either be made part of the already existing test queries >>> or a separate suite. I’m leaning on the latter simply because the >>> case-flipping existing tests seems like something that can be cleaned >>> up years from now accidentally because it looks odd. >> >> Adding them into src/test/regress/ sounds like a good plan to me. > > If there is interest in this patch now that the list exists and aids review, I > can turn the list into a proper test that makes a little more sense than the > current list which is rather aimed at helping reviewers. Sorry if my words were hard to catch here. What I mean here is to add in each command's test file the set of commands which check the compatibility. There is no need to test all the options in my opinion, as just testing one option is enoughto show the purpose. So for example to cover the grounds of DefineAggregate(), you could add a set of commands in create_aggregate.sql. For DefineCollation(), those can go in collate.sql, etc. >> Here is another idea: nuking isstrict and iscachable from CREATE >> FUNCTION syntax and forget about them. I would be tempted of the opinion >> to do that before the rest. > > Thats certainly an option, I have no idea about the prevalence in real life > production environments to have much an opinion to offer. Please let me raise a new thread about this point with a proper patch. That's rather separate to the work you are doing here, even if those parameters are using pg_strcasecmp(). -- Michael
Attachment
> On 15 Jan 2018, at 02:33, Michael Paquier <michael.paquier@gmail.com> wrote: > On Fri, Jan 12, 2018 at 11:35:48PM +0100, Daniel Gustafsson wrote: >> On 11 Jan 2018, at 09:01, Michael Paquier <michael.paquier@gmail.com> wrote: >>>> One open question from this excercise is how to write a good test for >>>> this. It can either be made part of the already existing test queries >>>> or a separate suite. I’m leaning on the latter simply because the >>>> case-flipping existing tests seems like something that can be cleaned >>>> up years from now accidentally because it looks odd. >>> >>> Adding them into src/test/regress/ sounds like a good plan to me. >> >> If there is interest in this patch now that the list exists and aids review, I >> can turn the list into a proper test that makes a little more sense than the >> current list which is rather aimed at helping reviewers. > > Sorry if my words were hard to catch here. What I mean here is to add in > each command's test file the set of commands which check the > compatibility. There is no need to test all the options in my opinion, > as just testing one option is enoughto show the purpose. So for example > to cover the grounds of DefineAggregate(), you could add a set of > commands in create_aggregate.sql. For DefineCollation(), those can go in > collate.sql, etc. Gotcha. I’ve added some tests to the patch. The test for CREATE FUNCTION was omitted for now awaiting the outcome of the discussion around isStrict isCachable. Not sure how much they’re worth in "make check” though as it’s quite easy to add a new option checked with pg_strcasecmp which then isn’t tested. Still, it might aid review so definitely worth it. cheers ./daniel
Attachment
On Mon, Jan 22, 2018 at 12:48:45PM +0100, Daniel Gustafsson wrote: > Gotcha. I’ve added some tests to the patch. The test for CREATE > FUNCTION was omitted for now awaiting the outcome of the discussion > around isStrict and isCachable. That makes sense. > Not sure how much they’re worth in "make check” though as it’s quite > easy to add a new option checked with pg_strcasecmp which then isn’t > tested. Still, it might aid review so definitely worth it. Thanks for making those, this eases the review lookups. There are a couple of code paths that remained untested: - contrib/unaccent/ - contrib/dict_xsyn/ - contrib/dict_int/ - The combinations of toast reloptions is pretty particular as option namespaces also go through pg_strcasecmp, so I think that those should be tested as well (your patch includes a test for fillfactor, which is a good base by the way). - check_option for CREATE VIEW and ALTER VIEW SET. - ALTER TEXT SEARCH CONFIGURATION for copy/parser options. - CREATE TYPE AS RANGE with DefineRange(). There are as well two things I have spotted on the way: 1) fillRelOptions() can safely use strcmp. 2) indexam.sgml mentions using pg_strcasecmp() for consistency with the core code when defining amproperty for an index AM. Well, with this patch I think that for consistency with the core code that would involve using strcmp instead, extension developers can of course still use pg_strcasecmp. Those are changed as well in the attached, which applies on top of your v6. I have added as well in it the tests I spotted were missing. If this looks right to you, I am fine to switch this patch as ready for committer, without considering the issues with isCachable and isStrict in CREATE FUNCTION of course. -- Michael
Attachment
> On 23 Jan 2018, at 05:52, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Mon, Jan 22, 2018 at 12:48:45PM +0100, Daniel Gustafsson wrote: > >> Not sure how much they’re worth in "make check” though as it’s quite >> easy to add a new option checked with pg_strcasecmp which then isn’t >> tested. Still, it might aid review so definitely worth it. > > Thanks for making those, this eases the review lookups. There are a > couple of code paths that remained untested: > - contrib/unaccent/ > - contrib/dict_xsyn/ > - contrib/dict_int/ > - The combinations of toast reloptions is pretty particular as option > namespaces also go through pg_strcasecmp, so I think that those should > be tested as well (your patch includes a test for fillfactor, which is a > good base by the way). > - check_option for CREATE VIEW and ALTER VIEW SET. > - ALTER TEXT SEARCH CONFIGURATION for copy/parser options. > - CREATE TYPE AS RANGE with DefineRange(). Thanks, those are good points. > There are as well two things I have spotted on the way: > 1) fillRelOptions() can safely use strcmp. Yes, I believe you’re right. > 2) indexam.sgml mentions using pg_strcasecmp() for consistency with the > core code when defining amproperty for an index AM. Well, with this > patch I think that for consistency with the core code that would involve > using strcmp instead, extension developers can of course still use > pg_strcasecmp. That part I’m less sure about, the propname will not be casefolded by the parser so pg_strcasecmp() should still be the recommendation here no? > Those are changed as well in the attached, which applies on top of your > v6. I have added as well in it the tests I spotted were missing. If this > looks right to you, I am fine to switch this patch as ready for > committer, without considering the issues with isCachable and isStrict > in CREATE FUNCTION of course. Apart from the amproperty hunk, I’m definately +1 on adding your patch on top of my v6 one. Thanks for all your help and review! cheers ./daniel
On Tue, Jan 23, 2018 at 04:22:22PM +0100, Daniel Gustafsson wrote: > On 23 Jan 2018, at 05:52, Michael Paquier <michael.paquier@gmail.com> wrote: >> 2) indexam.sgml mentions using pg_strcasecmp() for consistency with the >> core code when defining amproperty for an index AM. Well, with this >> patch I think that for consistency with the core code that would involve >> using strcmp instead, extension developers can of course still use >> pg_strcasecmp. > > That part I’m less sure about, the propname will not be casefolded by the > parser so pg_strcasecmp() should still be the recommendation here no? Yes, you are right. I had a brain fade here as all the option names here go through SQL-callable functions. >> Those are changed as well in the attached, which applies on top of your >> v6. I have added as well in it the tests I spotted were missing. If this >> looks right to you, I am fine to switch this patch as ready for >> committer, without considering the issues with isCachable and isStrict >> in CREATE FUNCTION of course. > > Apart from the amproperty hunk, I’m definately +1 on adding your patch on top > of my v6 one. Thanks for all your help and review! OK. Could you publish a v7? I will switch the entry as ready for committer. -- Michael
Attachment
> On 24 Jan 2018, at 02:37, Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Jan 23, 2018 at 04:22:22PM +0100, Daniel Gustafsson wrote: >> On 23 Jan 2018, at 05:52, Michael Paquier <michael.paquier@gmail.com> wrote: >>> Those are changed as well in the attached, which applies on top of your >>> v6. I have added as well in it the tests I spotted were missing. If this >>> looks right to you, I am fine to switch this patch as ready for >>> committer, without considering the issues with isCachable and isStrict >>> in CREATE FUNCTION of course. >> >> Apart from the amproperty hunk, I’m definately +1 on adding your patch on top >> of my v6 one. Thanks for all your help and review! > > OK. Could you publish a v7? I will switch the entry as ready for > committer. Attached is a rebased v7 patch which has your amendments (minus propname) which passes make check without errors. The volatility patch is also rebased included, but there the discussion whether to keep or drop the deprecated syntax first needs to happen (started in your 20180115022748.GB1724@paquier.xyz mail). cheers ./daniel
Attachment
On Wed, Jan 24, 2018 at 09:47:50AM +0100, Daniel Gustafsson wrote: > Attached is a rebased v7 patch which has your amendments (minus > propname) which passes make check without errors. Confirmed. I am switching the status as ready for committer for volatility-v7.patch then. > The volatility patch is also rebased included, but there the > discussion whether to keep or drop the deprecated syntax first needs > to happen (started in your 20180115022748.GB1724@paquier.xyz mail). Yes, let's see what happens later for this thread. -- Michael
Attachment
Michael Paquier <michael.paquier@gmail.com> writes: > On Wed, Jan 24, 2018 at 09:47:50AM +0100, Daniel Gustafsson wrote: >> Attached is a rebased v7 patch which has your amendments (minus >> propname) which passes make check without errors. > Confirmed. I am switching the status as ready for committer for > volatility-v7.patch then. Poking through this, I notice that there are two reloptions-related "pg_strncasecmp" calls that did not get converted to "strncmp": reloptions.c:804 and reloptions.h:169. Is that an oversight, or intentional, and if intentional why? regards, tom lane
> On 26 Jan 2018, at 22:32, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Michael Paquier <michael.paquier@gmail.com> writes: >> On Wed, Jan 24, 2018 at 09:47:50AM +0100, Daniel Gustafsson wrote: >>> Attached is a rebased v7 patch which has your amendments (minus >>> propname) which passes make check without errors. > >> Confirmed. I am switching the status as ready for committer for >> volatility-v7.patch then. > > Poking through this, Thanks! > I notice that there are two reloptions-related > "pg_strncasecmp" calls that did not get converted to "strncmp": > reloptions.c:804 The way I read transformRelOptions(), oldOptions is not guaranteed to come from the parser (though in reality it probably will be). The namespace isn’t either but passing an uppercase namespace should never be valid AFAICT, hence the patch changing it to case sensitive comparison. > and reloptions.h:169. Oversight, completely missed that one. cheers ./daniel
Daniel Gustafsson <daniel@yesql.se> writes: >> On 26 Jan 2018, at 22:32, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I notice that there are two reloptions-related >> "pg_strncasecmp" calls that did not get converted to "strncmp": >> reloptions.c:804 > The way I read transformRelOptions(), oldOptions is not guaranteed to > come from the parser (though in reality it probably will be). Well, one response to that is that it should contain values that were deemed acceptable at some previous time. If we allowed catalog contents to be migrated physically across major releases, we'd need to worry about having mixed-case reloptions in a v11 catalog ... but we don't, so we won't. The old reloptions should always be ones that got through parseRelOptions before, and those now will always have to be exact case. Another response is that leaving it like this would mean that transformRelOptions and parseRelOptions have different opinions about whether two option names are the same or not, which seems to me to be exactly the same sort of bug hazard that you were on about at the beginning of this thread. > The namespace isn’t either > but passing an uppercase namespace should never be valid AFAICT, hence the > patch changing it to case sensitive comparison. Well, it's no more or less valid than an uppercase option name ... >> and reloptions.h:169. > Oversight, completely missed that one. It looks like no core code uses that macro, so it's got no effect unless some third-party code does ... but I changed it anyway. regards, tom lane
> On 26 Jan 2018, at 23:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >>> On 26 Jan 2018, at 22:32, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I notice that there are two reloptions-related >>> "pg_strncasecmp" calls that did not get converted to "strncmp": >>> reloptions.c:804 > >> The way I read transformRelOptions(), oldOptions is not guaranteed to >> come from the parser (though in reality it probably will be). > > Well, one response to that is that it should contain values that were > deemed acceptable at some previous time. If we allowed catalog contents > to be migrated physically across major releases, we'd need to worry about > having mixed-case reloptions in a v11 catalog ... but we don't, so we > won't. The old reloptions should always be ones that got through > parseRelOptions before, and those now will always have to be exact case. That’s a good point, the reloptions will only ever come from the same major version. > Another response is that leaving it like this would mean that > transformRelOptions and parseRelOptions have different opinions about > whether two option names are the same or not, which seems to me to be > exactly the same sort of bug hazard that you were on about at the > beginning of this thread. Completely agree. >> The namespace isn’t either >> but passing an uppercase namespace should never be valid AFAICT, hence the >> patch changing it to case sensitive comparison. > > Well, it's no more or less valid than an uppercase option name … Agreed. Taking a step back and thinking it’s clear that the comparison in the oldoptions loop should’ve been changed in the patch for it to be consistent with the original objective. cheers ./daniel
I've pushed this mostly as-is. I fixed the missed places in reloptions code as we discussed. I also took out the parser changes related to allowing unquoted PARALLEL in old-style CREATE AGGREGATE, because that is not a goal I consider worthy of adding extra grammar complexity. We don't document that PARALLEL works there, and there has never been any expectation that deprecated legacy syntax would grow new options as needed to have feature parity with the modern syntax. I don't feel a need to go out of our way to prevent new options from working in the old syntax, if they happen to, but I definitely don't want to expend code on making them do so. Accordingly, that regression test that expected PARALLEL to work in the old-style syntax was just ill-considered, and I changed it to use the new-style syntax instead. I also trimmed the new regression test cases a bit, as most of them seemed pretty duplicative. How many times do we need to verify that the lexer doesn't downcase quoted identifiers? I'm not really sure that checking this behavior from SQL is useful at all, actually. The interesting question is whether there are code paths that can reach these string comparisons with strings that *didn't* get processed as identifiers by the lexer, and these tests do basically nothing to prove that there aren't. Still, I think we can push this and wait to see if there are complaints. regards, tom lane
> On 27 Jan 2018, at 00:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I've pushed this mostly as-is. Thanks! > I also took out the parser changes related to > allowing unquoted PARALLEL in old-style CREATE AGGREGATE, because that > is not a goal I consider worthy of adding extra grammar complexity. > We don't document that PARALLEL works there, and there has never been > any expectation that deprecated legacy syntax would grow new options > as needed to have feature parity with the modern syntax. Ok, didn’t know old syntax wasn’t extended to support new options so I went after it having run into the regress tests using it. > I also trimmed the new regression test cases a bit, as most of them seemed > pretty duplicative. Makes sense, they were made verbose to aid review but were too chatty for inclusion. cheers ./daniel
On Fri, Jan 26, 2018 at 05:58:48PM -0500, Tom Lane wrote: > Daniel Gustafsson <daniel@yesql.se> writes: >> Oversight, completely missed that one. > > It looks like no core code uses that macro, so it's got no effect > unless some third-party code does ... but I changed it anyway. Good point. I missed this one as well. I have double-checked the core code and it seems to me that we are clear now. -- Michael
Attachment
On Fri, Jan 26, 2018 at 6:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've pushed this mostly as-is. I fixed the missed places in reloptions > code as we discussed. I also took out the parser changes related to > allowing unquoted PARALLEL in old-style CREATE AGGREGATE, because that > is not a goal I consider worthy of adding extra grammar complexity. > We don't document that PARALLEL works there, and there has never been > any expectation that deprecated legacy syntax would grow new options > as needed to have feature parity with the modern syntax. I don't feel > a need to go out of our way to prevent new options from working in the > old syntax, if they happen to, but I definitely don't want to expend > code on making them do so. > > Accordingly, that regression test that expected PARALLEL to work in > the old-style syntax was just ill-considered, and I changed it to use > the new-style syntax instead. > > I also trimmed the new regression test cases a bit, as most of them seemed > pretty duplicative. How many times do we need to verify that the lexer > doesn't downcase quoted identifiers? I'm not really sure that checking > this behavior from SQL is useful at all, actually. The interesting > question is whether there are code paths that can reach these string > comparisons with strings that *didn't* get processed as identifiers by the > lexer, and these tests do basically nothing to prove that there aren't. > Still, I think we can push this and wait to see if there are complaints. I think it's a shame that the commit message didn't document (for the release notes) exactly which cases just got changed incompatibly. I admit that not many people are likely to get bitten by this, but I still think it's better if we're precise about what might cause a particular user to be in that set. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 01, 2018 at 02:03:23PM -0500, Robert Haas wrote: > I think it's a shame that the commit message didn't document (for the > release notes) exactly which cases just got changed incompatibly. I > admit that not many people are likely to get bitten by this, but I > still think it's better if we're precise about what might cause a > particular user to be in that set. v7 posted in [1] of the patch was doing a pretty good job on this side because it included regression tests for all the code paths involved by the change. The final commit has shaved some of them, but here is a list for reference based on my notes: CREATE/ALTER TEXT SEARCH DICTIONARY CREATE TEXT SEARCH TEMPLATE CREATE TEXT SEARCH PARSER CREATE/ALTER OPERATOR CREATE COLLATION CREATE AGGREGATE CREATE/ALTER OPERATOR CREATE/ALTER TABLE CREATE TYPE CREATE/ALTER VIEW [1]: https://www.postgresql.org/message-id/62991614-9673-4276-99CC-6754E7A0572F%40yesql.se -- Michael