Re: Some surprising precedence behavior in PG's grammar - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: Some surprising precedence behavior in PG's grammar |
Date | |
Msg-id | 4DC1F227.707@dunslane.net Whole thread Raw |
In response to | Some surprising precedence behavior in PG's grammar (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Some surprising precedence behavior in PG's grammar
|
List | pgsql-hackers |
On 05/04/2011 07:39 PM, Tom Lane wrote: > While looking at the grammar's operator-precedence declarations in > connection with a recent pgsql-docs question, it struck me that this > declaration is a foot-gun waiting to go off: > > %nonassoc IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS NULL, etc */ > > The only terminal that we actually need to attach precedence to for > IS NULL and related productions is "IS". The others are just listed > there to save attaching explicit %prec declarations to those productions. > This seems like a bad idea, because attaching a precedence to a terminal > symbol that doesn't absolutely have to have one is just asking for > trouble: it can cause bison to accept grammars that would better have > provoked a shift/reduce error, and to silently resolve the ambiguity in > a way that you maybe didn't expect. > > So I thought to myself that it'd be better to remove the unnecessary > precedence markings, and tried, with the attached patch. And behold, > I got a shift/reduce conflict: > > state 2788 > > 1569 b_expr: b_expr qual_Op . b_expr > 1571 | b_expr qual_Op . > > NULL_P shift, and go to state 1010 > NULL_P [reduce using rule 1571 (b_expr)] > > So the god of unintended consequences has been here already. What this > is telling us is that in examples such as > > CREATE TABLE foo (f1 int DEFAULT 10 %% NULL); > > it is not immediately clear to bison whether to shift upon seeing the > NULL (which leads to a parse tree that says %% is an infix operator with > arguments 10 and NULL), or to reduce (which leads to a parse tree that > says that %% is a postfix operator with argument 10, and NULL is a > column declaration constraint separate from the DEFAULT expression). > > If you try the experiment, you find out that the first interpretation is > preferred by the current grammar: > > ERROR: operator does not exist: integer %% unknown Yeah, IIRC the default action for a shift/reduce conflict is to shift, as it's usually the right thing to do. > Now, this is probably a good thing, because NULL as a column declaration > constraint is not SQL standard (only NOT NULL is), so we're resolving > the ambiguity in a way that's more likely to be SQL-compatible. But it > could be surprising to somebody who expected the other behavior, > especially since this seemingly-closely-related command is parsed the > other way: > > CREATE TABLE foo (f1 int DEFAULT 10 %% NOT NULL); > ERROR: operator does not exist: integer %% > > And the reason for that difference in behavior is that NOT has a > declared precedence lower than POSTFIXOP, whereas NULL has a declared > precedence that's higher. That comparison determines how bison resolves > the shift/reduce conflict. > > The fact that this behavior falls out of a precedence declaration that's > seemingly quite unrelated, and was surely not designed with this case in > mind, is a perfect example of why I say that precedence declarations are > hazardous. > > So I'd still like to get rid of the precedence markings for TRUE_P, > FALSE_P, and UNKNOWN, and will do so unless somebody has a really good > reason not to. (It looks like we could avoid marking ZONE, too.) But > I would be happier if we could also not mark NULL, because that's surely > used in a lot of other places, and could easily bite us a lot harder > than this. Can anyone think of an alternative way to resolve this > particular conflict without the blunt instrument of a precedence marking? > > My bison-fu is a bit rusty, but years ago I could do this stuff in my sleep. I'll be surprised if there isn't a way. If we do need a precedence setting for NULL_P, then I think it should probably be on its own and not sharing one with IS. If you don't solve it soon I'll take a look after I clear a couple of higher priority items from my list. cheers andrew
pgsql-hackers by date: