bison location reporting for potentially-empty list productions - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | bison location reporting for potentially-empty list productions |
Date | |
Msg-id | 10999.1349310564@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: bison location reporting for potentially-empty list productions
|
List | pgsql-hackers |
In the just-committed patch for CREATE SCHEMA IF NOT EXISTS, there is an error thrown by the grammar when IF NOT EXISTS is specified together with any schema-element clauses. I thought it would make more sense for the error cursor to point at the schema-element clauses, rather than at the IF NOT EXISTS as submitted. So the code looks like, eg, | CREATE SCHEMA IF_P NOT EXISTS ColId OptSchemaEltList ... if ($7 != NIL) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("CREATE SCHEMA IF NOT EXISTS cannot include schema elements"), parser_errposition(@7))); However, it turns out this actually results in a cursor pointing at the previous word: CREATE SCHEMA IF NOT EXISTS test_schema_1 -- fail, disallowed CREATE TABLE abc ( a serial, bint UNIQUE ); ERROR: CREATE SCHEMA IF NOT EXISTS cannot include schema elements LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1 ^ After some study I think what is happening is that there's a deficiency in the location-tracking macro in gram.y: /* Location tracking support --- simpler than bison's default */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ if (N) \ (Current) = (Rhs)[1]; \ else \ (Current) = (Rhs)[0]; \ } while (0) OptSchemaEltList looks like this: OptSchemaEltList: OptSchemaEltList schema_stmt { $$ = lappend($1, $2); } | /* EMPTY */ { $$ = NIL; } ; When the macro is evaluated for the initial empty production for OptSchemaEltList, the "else" case causes it to index off the beginning of the array and pick up the location for the preceding token. Then, each succeeding reduction of the first part of the production copies that bogus value from the first RHS member item. So this same problem applies not only to OptSchemaEltList but to any case where we've formed a zero-or-more-element-list production in this style. So far as I can tell, there are no other cases in the current grammar where we make use of the position of a list nonterminal for error-reporting purposes, which is why we'd not noticed this before. But it seems likely to come up again. It seems clear to me now that the macro ought at least to be changed to #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ if (N) \ (Current) = (Rhs)[1]; \ else \ (Current) = -1; \ } while (0) so that the parse location attributed to an empty production is -1 (ie, unknown) rather than the current quite bogus behavior of marking it as the preceding token's location. (For one thing, what if there *isn't* a preceding token? That's not possible I think in the current grammar, but if it were possible this code would be indexing off the beginning of the locations array.) But that change wouldn't really fix the issue for CREATE SCHEMA --- the -1 would be propagated up to all instances of OptSchemaEltList even after we'd reduced the first production a few times, so that we'd end up printing no error cursor for this case. To produce a really useful cursor for this type of situation I think we would have to do something like this: #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ int i; (Current) = -1; \ for (i = 1; i <= (N); i++) { (Current) = (Rhs)[i]; \ if ((Current) >= 0) break; } } while (0) This is pretty ugly and seems likely to create a visible hit on the parser's speed (which is already not that good). I'm not sure it's worth it for something that seems to be a corner case --- anyway we've not hit it before in six years since the location tracking support was added. At this point I'm inclined to change the macro to the second case (with the -1) and accept a less good error cursor position for the CREATE SCHEMA error. Has anybody got a better idea? regards, tom lane
pgsql-hackers by date: