Thread: Re: pure parsers and reentrant scanners

Re: pure parsers and reentrant scanners

From
Peter Eisentraut
Date:
On 17.12.24 01:46, Andreas Karlsson wrote:
> On 12/16/24 8:39 AM, Peter Eisentraut wrote:
>> I'll leave it at this for now and wait for some reviews.
> 
> I really like this work since it makes the code cleaner to read on top 
> of paving the way for threading.
> 
> Reviewed the patches and found a couple of issues.
> 
> - Shouldn't yyext in syncrep_scanner_init() be allocated on the heap? Or 
> at least on the stack but by the caller?

I think it's correct the way it is.  It's only a temporary space for the 
scanner, so we can allocate it in the innermost scope.

> - I think you have flipped the parameters of replication_yyerror(), see 
> attached fixup patch.

Good catch.  There was also a similar issue with syncrep_yyerror().

> - Some white space issues fixed in an attached fixup patch.

committed

> - Also fixed the static remaining variables in the replication parser in 
> an attached patch.

Thanks, I'll take a look at that.

> - There seems to be a lot left to do to make the plpgsql scanner 
> actually re-entrant so I do not think it would makes sense to commit the 
> patch which sets the re-entrant option before that is done.

Ok, we can hold that one back until the full stack including the parser 
is done.




Re: pure parsers and reentrant scanners

From
Tom Lane
Date:
I noticed that lapwing is bleating about

ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla
-Wendif-labels-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g
-O2-fPIC -fvisibility=hidden -I. -I. -I../../src/include  -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS -D_GNU_SOURCE
-I/usr/include/libxml2 -I/usr/include/et  -c -o cubescan.o cubescan.c 
cubescan.c:1689:5: warning: no previous prototype for 'cube_yyget_column' [-Wmissing-prototypes]
cubescan.c:1765:6: warning: no previous prototype for 'cube_yyset_column' [-Wmissing-prototypes]

and likewise in segscan.c.  lapwing is using flex 2.5.35, so probably
this is the same bug worked around in parser/scan.l:

/*
 * Work around a bug in flex 2.5.35: it emits a couple of functions that
 * it forgets to emit declarations for.  Since we use -Wmissing-prototypes,
 * this would cause warnings.  Providing our own declarations should be
 * harmless even when the bug gets fixed.
 */
extern int    core_yyget_column(yyscan_t yyscanner);
extern void core_yyset_column(int column_no, yyscan_t yyscanner);

            regards, tom lane



Re: pure parsers and reentrant scanners

From
Peter Eisentraut
Date:
On 20.12.24 02:07, Tom Lane wrote:
> I noticed that lapwing is bleating about
> 
> ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla
-Wendif-labels-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g
-O2-fPIC -fvisibility=hidden -I. -I. -I../../src/include  -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS -D_GNU_SOURCE
-I/usr/include/libxml2 -I/usr/include/et  -c -o cubescan.o cubescan.c
 
> cubescan.c:1689:5: warning: no previous prototype for 'cube_yyget_column' [-Wmissing-prototypes]
> cubescan.c:1765:6: warning: no previous prototype for 'cube_yyset_column' [-Wmissing-prototypes]
> 
> and likewise in segscan.c.  lapwing is using flex 2.5.35, so probably
> this is the same bug worked around in parser/scan.l:

Ok, we can fix that, but maybe this is also a good moment to think about 
whether that is useful.  I could not reproduce the issue with flex 
2.5.39.  I could find no download of flex 2.5.35.  The github site only 
offers back to 2.5.39, the sourceforce site back to 2.5.36.  lapwing 
says it's Debian 7.0, which went out of support in 2016 and out of 
super-duper-extended support in 2020.  It also doesn't have a supported 
OpenSSL version anymore, and IIRC, it has a weird old compiler that 
occasionally gives bogus warnings.  I think it's time to stop supporting 
this.




Re: pure parsers and reentrant scanners

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 20.12.24 02:07, Tom Lane wrote:
>> I noticed that lapwing is bleating about
>> cubescan.c:1689:5: warning: no previous prototype for 'cube_yyget_column' [-Wmissing-prototypes]
>> cubescan.c:1765:6: warning: no previous prototype for 'cube_yyset_column' [-Wmissing-prototypes]
>> and likewise in segscan.c.  lapwing is using flex 2.5.35, so probably
>> this is the same bug worked around in parser/scan.l:

> Ok, we can fix that, but maybe this is also a good moment to think about
> whether that is useful.  I could not reproduce the issue with flex
> 2.5.39.  I could find no download of flex 2.5.35.  The github site only
> offers back to 2.5.39, the sourceforce site back to 2.5.36.  lapwing
> says it's Debian 7.0, which went out of support in 2016 and out of
> super-duper-extended support in 2020.  It also doesn't have a supported
> OpenSSL version anymore, and IIRC, it has a weird old compiler that
> occasionally gives bogus warnings.  I think it's time to stop supporting
> this.

OK, that's fair.  I do see lapwing called out a lot in the commit log,
though it's not clear how much of that is about 32-bitness and how
much about old tools.  It's surely still valuable to have i386
machines in the buildfarm, but I agree that supporting unobtainable
tool versions is a bit much.  Could we get that animal updated to
some newer OS version?

Presumably, we should also rip out the existing yyget_column and
yyset_column kluges in

src/backend/parser/scan.l: extern int      core_yyget_column(yyscan_t yyscanner);
src/bin/psql/psqlscanslash.l: extern int   slash_yyget_column(yyscan_t yyscanner);
src/bin/pgbench/exprscan.l: extern int     expr_yyget_column(yyscan_t yyscanner);
src/fe_utils/psqlscan.l: extern int        psql_yyget_column(yyscan_t yyscanner);

            regards, tom lane



Re: pure parsers and reentrant scanners

From
Julien Rouhaud
Date:
On Fri, Dec 20, 2024 at 11:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Could we get that animal updated to
> some newer OS version?

There is already adder animal that is running debian sid on i386.  The
only remaining interest in lapwing is to have older versions of
everything, so if that's useless I can just trash that vm.



Re: pure parsers and reentrant scanners

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Fri, Dec 20, 2024 at 11:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Could we get that animal updated to
>> some newer OS version?

> There is already adder animal that is running debian sid on i386.  The
> only remaining interest in lapwing is to have older versions of
> everything, so if that's useless I can just trash that vm.

Hmm, sid is the opposite extreme no?  Maybe switching lapwing to
whatever is currently the oldest supported Debian release would
be a good answer.

            regards, tom lane



Re: pure parsers and reentrant scanners

From
Peter Eisentraut
Date:
On 20.12.24 16:35, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
>> On Fri, Dec 20, 2024 at 11:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Could we get that animal updated to
>>> some newer OS version?
> 
>> There is already adder animal that is running debian sid on i386.  The
>> only remaining interest in lapwing is to have older versions of
>> everything, so if that's useless I can just trash that vm.
> 
> Hmm, sid is the opposite extreme no?  Maybe switching lapwing to
> whatever is currently the oldest supported Debian release would
> be a good answer.

Yeah, Debian stable or oldstable on i386 could certainly be useful.




Re: pure parsers and reentrant scanners

From
Andreas Karlsson
Date:
On 12/19/24 9:57 PM, Peter Eisentraut wrote:
> Here is an updated patch set on top of what has been committed so far, 
> with all the issues you pointed out addressed.

Other than the discussion of how old versions of flex we should support 
I think this set of patches is ready to be committed. I looked at it 
again and everything looks good.

Andreas




Re: pure parsers and reentrant scanners

From
Heikki Linnakangas
Date:
On 26/12/2024 20:27, Peter Eisentraut wrote:
> On 22.12.24 22:43, Andreas Karlsson wrote:
>> On 12/19/24 9:57 PM, Peter Eisentraut wrote:
>>> Here is an updated patch set on top of what has been committed so 
>>> far, with all the issues you pointed out addressed.
>>
>> Other than the discussion of how old versions of flex we should 
>> support I think this set of patches is ready to be committed. I looked 
>> at it again and everything looks good.
> 
> I have committed these except the plpgsql one, which was still work in 
> progress.  But I have progressed on this now and also converted the 
> parser and put the local state into yyextra.  This gets rid of all 
> internal global state now.  The patches for this are attached.  It's a 
> lot of churn, but otherwise pretty standard stuff.

Looks good to me.

> Along the way I noticed that the flex documentation now recommends a 
> different way to set the yyextra type.  So I have changed the ones we 
> already have to that newer style.  I inspected the generated C code and 
> there wasn't any significant difference, so I'm not sure, but I figure 
> if we're making changes in this area we might as well use the modern style.

+1. According to the flex NEWS file, this syntax was added in flex 
2.5.34, and we already require 2.5.35.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: pure parsers and reentrant scanners

From
Peter Eisentraut
Date:
On 27.12.24 10:19, Heikki Linnakangas wrote:
> On 26/12/2024 20:27, Peter Eisentraut wrote:
>> On 22.12.24 22:43, Andreas Karlsson wrote:
>>> On 12/19/24 9:57 PM, Peter Eisentraut wrote:
>>>> Here is an updated patch set on top of what has been committed so 
>>>> far, with all the issues you pointed out addressed.
>>>
>>> Other than the discussion of how old versions of flex we should 
>>> support I think this set of patches is ready to be committed. I 
>>> looked at it again and everything looks good.
>>
>> I have committed these except the plpgsql one, which was still work in 
>> progress.  But I have progressed on this now and also converted the 
>> parser and put the local state into yyextra.  This gets rid of all 
>> internal global state now.  The patches for this are attached.  It's a 
>> lot of churn, but otherwise pretty standard stuff.
> 
> Looks good to me.
> 
>> Along the way I noticed that the flex documentation now recommends a 
>> different way to set the yyextra type.  So I have changed the ones we 
>> already have to that newer style.  I inspected the generated C code 
>> and there wasn't any significant difference, so I'm not sure, but I 
>> figure if we're making changes in this area we might as well use the 
>> modern style.
> 
> +1. According to the flex NEWS file, this syntax was added in flex 
> 2.5.34, and we already require 2.5.35.

These have been committed.  This concludes the main body of this work.

I'll let it take a lap around the buildfarm and will follow up in a few 
days on what if anything lapwing has to say about it in terms of 
warnings and what we want to do about it.




Re: pure parsers and reentrant scanners

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> The second patch contemplates raising the minimum required flex version, 
> but what to?

Meh, let's just rip out the version check.  It's no longer very
relevant.  Nobody is going to be using anything older than 2.5.35.
While 2.5.35 produces compile warnings, it does still work, so
rejecting it with a changed version check seems unnecessary.

            regards, tom lane