út 9. 4. 2024 v 0:55 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Erik Wienhold <ewie@ewie.name> writes: > On 2024-04-07 06:33 +0200, Tom Lane wrote: >> I suspect it'd be much more robust if we could remove the comment from >> the expr->query string. No idea how hard that is.
> I slept on it and I think this can be fixed by tracking the end of the > last token before THEN and use that instead of yylloc in the call to > plpgsql_append_source_text. We already already track the token length > in plpgsql_yyleng but don't make it available outside pl_scanner.c yet. > Attached v2 tries to do that. But it breaks other test cases, probably > because the calculation of endlocation is off. I'm missing something > here.
I poked at this and found that the failures occur when the patched code decides to trim an expression like "_r.v" to just "_r", naturally breaking the semantics completely. That happens because when plpgsql_yylex recognizes a compound token, it doesn't bother to adjust the token length to include the additional word(s). I vaguely remember having thought about that when writing the lookahead logic, and deciding that it wasn't worth the trouble -- but now it is. Up to now, the only thing we did with plpgsql_yyleng was to set the cutoff point for text reported by plpgsql_yyerror. Extending the token length changes reports like this:
regression=# do $$ declare r record; r.x$$; ERROR: syntax error at or near "r" LINE 1: do $$ declare r record; r.x$$; ^
to this:
regression=# do $$ declare r record; r.x$$; ERROR: syntax error at or near "r.x" LINE 1: do $$ declare r record; r.x$$; ^
which seems like strictly an improvement to me (the syntax error is premature EOF, which is after the "x"); but in any case it's minor enough to not be worth worrying about.
Looking around, I noticed that we *have* had a similar case in the past, which 4adead1d2 noticed and worked around by suppressing the whitespace-trimming action in read_sql_construct. We could probably reach a near-one-line fix for the current problem by passing trim=false in the CASE calls, but TBH that discovery just reinforces my feeling that we need a cleaner fix. The attached v3 reverts the make-trim-optional hack that 4adead1d2 added, since we don't need or want the manual trimming anymore.
With this in mind, I find the other manual whitespace trimming logic, in make_execsql_stmt(), quite scary; but it looks somewhat nontrivial to get rid of it. (The problem is that parsing of an INTO clause will leave us with a pushed-back token as next, and then we don't know where the end of the token before that is.) Since we don't currently do anything as crazy as combining execsql statements, I think the problem is only latent, but still...