Thread: pgbench client-side performance issue on large scripts

pgbench client-side performance issue on large scripts

From
"Daniel Verite"
Date:
Hi,

On large scripts, pgbench happens to consume a lot of CPU time.
For instance, with a script consisting of 50000 "SELECT 1;"
I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
out of a total time of 6.7 secs. When run with perf, this profile shows up:

  81,10%  pgbench  pgbench         [.] expr_scanner_get_lineno
   0,36%  pgbench  [unknown]         [k] 0xffffffffac90410b
   0,33%  pgbench  [unknown]         [k] 0xffffffffac904109
   0,23%  pgbench  libpq.so.5.18     [.] pqParseInput3
   0,21%  pgbench  [unknown]         [k] 0xffffffffac800080
   0,17%  pgbench  pgbench         [.] advanceConnectionState
   0,15%  pgbench  [unknown]         [k] 0xffffffffac904104
   0,14%  pgbench  libpq.so.5.18     [.] PQmakeEmptyPGresult

In ParseScript(), expr_scanner_get_lineno() is called for each line
with its current offset, and it scans the script from the beginning
up to the current line. I think that on the whole, parsing this script
ends up looking at (N*(N+1))/2 lines, which is 1.275 billion lines
if N=50000.
Since it only need the current line number in case of certain errors
with \gset, I've made the trivial attached fix calling
expr_scanner_get_lineno() only in these cases.
This moves the CPU consumption to a more reasonable 0.1s in the
above test case (with the drawback of having the line number pointing
one line after).

However, there is another caller, process_backslash_command()
which is not amenable to the same kind of easy fix. A large script
having backslash commands near the end is also penalized, and it
would be nice to fix that as well.

I wonder whether pgbench should materialize the current line number
in a variable, as psql does in pset.lineno. But given that there are
two different parsers in pgbench, maybe it's not the simplest.
Flex has yylineno but neither pgbench nor psql make use of it.
I thought I would ask before going further into this, as there might
be a better method that I don't see, being unfamiliar with that code
and with flex/bison.
WDYT?


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/

Attachment

Re: pgbench client-side performance issue on large scripts

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> On large scripts, pgbench happens to consume a lot of CPU time.
> For instance, with a script consisting of 50000 "SELECT 1;"
> I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
> out of a total time of 6.7 secs. When run with perf, this profile shows up:

You ran only a single execution of a 50K-line script?  This test
case feels a little bit artificial.  Having said that ...

> In ParseScript(), expr_scanner_get_lineno() is called for each line
> with its current offset, and it scans the script from the beginning
> up to the current line. I think that on the whole, parsing this script
> ends up looking at (N*(N+1))/2 lines, which is 1.275 billion lines
> if N=50000.

... yes, O(N^2) is not nice.  It has to be possible to do better.

> I wonder whether pgbench should materialize the current line number
> in a variable, as psql does in pset.lineno. But given that there are
> two different parsers in pgbench, maybe it's not the simplest.
> Flex has yylineno but neither pgbench nor psql make use of it.

Yeah, we do rely on yylineno in bootscanner.l and ecpg, but not
elsewhere; not sure if there's a performance reason for that.
I see that plpgsql has a hand-rolled version (look for cur_line_num)
that perhaps could be stolen.

            regards, tom lane



Re: pgbench client-side performance issue on large scripts

From
Tom Lane
Date:
I wrote:
> Yeah, we do rely on yylineno in bootscanner.l and ecpg, but not
> elsewhere; not sure if there's a performance reason for that.

Ah: the flex manual specifically calls out "%option yylineno"
as something that has a moderate performance cost.  So that's
why we don't use it except in non-performance-critical scanners.

Now, it could be argued that pgbench's script scanner doesn't
rise to that level of performance-criticalness, especially not
if we're paying the cost of counting newlines some other way.
I'm not excited about doing a lot of performance analysis here
to decide that.  I think we could steal plpgsql's idea to
keep the code structure basically the same while avoiding the
O(N^2) repeat scans, and that should be enough.

            regards, tom lane



Re: pgbench client-side performance issue on large scripts

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> > I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
> > out of a total time of 6.7 secs. When run with perf, this profile shows up:
>
> You ran only a single execution of a 50K-line script?  This test
> case feels a little bit artificial.  Having said that ...

I guess my use case is unusual, otherwise the O(N^2) parse
time would have been noticed sooner, but it's genuine.

I want to see how much a long sequence of statement inside a
pipeline differs between pgbench and psql with
the proposed patch at [1] that implements pipelining in psql.
psql would not actually send statements until \endpipeline is
reached, whereas pgbench does.
In fact I'd be inclined to push much more statements in the pipeline
than 50k, but then the parse time issue really kicks in.

For the moment I'll stay with my quick fix, then l'll try
to come up with something to replace expr_scanner_get_lineno() .


[1] https://commitfest.postgresql.org/patch/5407/

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: pgbench client-side performance issue on large scripts

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> For the moment I'll stay with my quick fix, then l'll try
> to come up with something to replace expr_scanner_get_lineno() .

I got nerd-sniped by this question and spent some time looking into
it.  ParseScript has got worse problems than just being slow: it's
actively buggy.  Notice that start_offset is set only once before
entering the loop, and doesn't change thereafter.  How is it that
we're getting sane line numbers at all?  The reason is that (1) if
we've not called yylex() at all yet, expr_scanner_offset() gives the
distance to the end of the string, since the yytext-ending NUL it's
looking for isn't there yet; and (2) expr_scanner_get_lineno() treats
the given start_offset as an upper bound, and won't complain if it
finds the NUL earlier than that.  So it gave the desired
line-number-of-the-current-token on all iterations after the first,
but on the first time through we get the line number of the script
end.  You can only see that in the case of \gset as the first command,
and I guess nobody noticed it yet.

Furthermore, it's not only ParseScript that's got O(N^2) problems;
so does process_backslash_command.  Your test case didn't show that
up, but a test with 50K backslash commands would.  We were actually
doing a strlen() of the whole string for each word of a backslash
command.  strlen() is likely faster than expr_scanner_get_lineno(),
but it's not so fast that O(N^2) effects don't matter.

The attached patch gets rid of both expr_scanner_offset() and
expr_scanner_get_lineno() altogether, in favor of using a new
function I added to psqlscan.l.  That uses the idea from plpgsql
of tracking the last-detected line end so that we don't have to
rescan prior lines over and over.  On my machine, parsing 50K-line
scripts goes from more than 10 seconds to perhaps 50 ms.

            regards, tom lane

From 542c081fd2fe2b60713b6fabbc59598c449f4c9e Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 25 Feb 2025 16:30:48 -0500
Subject: [PATCH v1] Get rid of O(N^2) script-parsing overhead in pgbench.

pgbench wants to record the starting line number of each command
in its scripts.  It was computing that by scanning from the script
start and counting newlines, so that O(N^2) work had to be done
for an N-command script.  In a script with 50K lines, this adds
up to about 10 seconds on my machine.

To add insult to injury, the results were subtly wrong, because
expr_scanner_offset() scanned to find the NUL that flex inserts
at the end of the current token --- and before the first yylex
call, no such NUL has been inserted.  So we ended by computing the
script's last line number not its first one.  This was visible only
in case of \gset at the start of a script, which perhaps accounts
for the lack of complaints.

To fix, steal an idea from plpgsql and track the most recently
identified line start as we advance through the script.  This
is potentially still O(N^2) with sufficiently long lines, but
that seems unlikely to be a problem in practice.  Also adjust
a couple of other places that were invoking scans from script
start when they didn't really need to.  I made a new psqlscan
function psql_scan_get_location() that replaces both
expr_scanner_offset() and expr_scanner_get_lineno(), since
in practice expr_scanner_get_lineno() was only being invoked
to find the line number of the current parsing offset.

Reported-by: Daniel Verite <daniel@manitou-mail.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
---
 src/bin/pgbench/exprscan.l          | 74 ++++++++++-------------------
 src/bin/pgbench/pgbench.c           | 14 +++---
 src/bin/pgbench/pgbench.h           |  4 +-
 src/fe_utils/psqlscan.l             | 54 +++++++++++++++++++++
 src/include/fe_utils/psqlscan.h     |  3 ++
 src/include/fe_utils/psqlscan_int.h |  4 ++
 6 files changed, 93 insertions(+), 60 deletions(-)

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 8943a52e9f0..b4800937691 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -271,10 +271,14 @@ void
 expr_yyerror_more(yyscan_t yyscanner, const char *message, const char *more)
 {
     PsqlScanState state = yyget_extra(yyscanner);
-    int            error_detection_offset = expr_scanner_offset(state) - 1;
+    int            lineno;
+    int            error_detection_offset;
     YYSTYPE        lval;
     char       *full_line;

+    psql_scan_get_location(state, &lineno, &error_detection_offset);
+    error_detection_offset--;
+
     /*
      * While parsing an expression, we may not have collected the whole line
      * yet from the input source.  Lex till EOL so we can report whole line.
@@ -289,7 +293,6 @@ expr_yyerror_more(yyscan_t yyscanner, const char *message, const char *more)
     /* Extract the line, trimming trailing newline if any */
     full_line = expr_scanner_get_substring(state,
                                            expr_start_offset,
-                                           expr_scanner_offset(state),
                                            true);

     syntax_error(expr_source, expr_lineno, full_line, expr_command,
@@ -336,12 +339,15 @@ expr_lex_one_word(PsqlScanState state, PQExpBuffer word_buf, int *offset)
     /* And lex. */
     lexresult = yylex(&lval, state->scanner);

-    /*
-     * Save start offset of word, if any.  We could do this more efficiently,
-     * but for now this seems fine.
-     */
+    /* Save start offset of word, if any. */
     if (lexresult)
-        *offset = expr_scanner_offset(state) - word_buf->len;
+    {
+        int            lineno;
+        int            end_offset;
+
+        psql_scan_get_location(state, &lineno, &end_offset);
+        *offset = end_offset - word_buf->len;
+    }
     else
         *offset = -1;

@@ -404,65 +410,35 @@ expr_scanner_finish(yyscan_t yyscanner)
 }

 /*
- * Get offset from start of string to end of current lexer token.
+ * Get a malloc'd copy of the lexer input string from start_offset
+ * to end of current lexer token.  If chomp is true, drop any trailing
+ * newline(s).
  *
  * We rely on the knowledge that flex modifies the scan buffer by storing
  * a NUL at the end of the current token (yytext).  Note that this might
  * not work quite right if we were parsing a sub-buffer, but since pgbench
- * never invokes that functionality, it doesn't matter.
- */
-int
-expr_scanner_offset(PsqlScanState state)
-{
-    return strlen(state->scanbuf);
-}
-
-/*
- * Get a malloc'd copy of the lexer input string from start_offset
- * to just before end_offset.  If chomp is true, drop any trailing
- * newline(s).
+ * never invokes that functionality, it doesn't matter.  Also, this will
+ * give the wrong answer (the whole remainder of the input) if called
+ * before any yylex() call has been done.
  */
 char *
 expr_scanner_get_substring(PsqlScanState state,
-                           int start_offset, int end_offset,
+                           int start_offset,
                            bool chomp)
 {
     char       *result;
-    const char *scanptr = state->scanbuf + start_offset;
-    int            slen = end_offset - start_offset;

-    Assert(slen >= 0);
-    Assert(end_offset <= strlen(state->scanbuf));
+    result = pg_strdup(state->scanbuf + start_offset);

     if (chomp)
     {
+        size_t        slen = strlen(result);
+
         while (slen > 0 &&
-               (scanptr[slen - 1] == '\n' || scanptr[slen - 1] == '\r'))
+               (result[slen - 1] == '\n' || result[slen - 1] == '\r'))
             slen--;
+        result[slen] = '\0';
     }

-    result = (char *) pg_malloc(slen + 1);
-    memcpy(result, scanptr, slen);
-    result[slen] = '\0';
-
     return result;
 }
-
-/*
- * Get the line number associated with the given string offset
- * (which must not be past the end of where we've lexed to).
- */
-int
-expr_scanner_get_lineno(PsqlScanState state, int offset)
-{
-    int            lineno = 1;
-    const char *p = state->scanbuf;
-
-    while (*p && offset > 0)
-    {
-        if (*p == '\n')
-            lineno++;
-        p++, offset--;
-    }
-    return lineno;
-}
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index fdc957fa34d..38f8bc11bcd 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5690,8 +5690,8 @@ process_backslash_command(PsqlScanState sstate, const char *source)
     initPQExpBuffer(&word_buf);

     /* Remember location of the backslash */
-    start_offset = expr_scanner_offset(sstate) - 1;
-    lineno = expr_scanner_get_lineno(sstate, start_offset);
+    psql_scan_get_location(sstate, &lineno, &start_offset);
+    start_offset--;

     /* Collect first word of command */
     if (!expr_lex_one_word(sstate, &word_buf, &word_offset))
@@ -5747,7 +5747,6 @@ process_backslash_command(PsqlScanState sstate, const char *source)
         my_command->first_line =
             expr_scanner_get_substring(sstate,
                                        start_offset,
-                                       expr_scanner_offset(sstate),
                                        true);

         expr_scanner_finish(yyscanner);
@@ -5777,7 +5776,6 @@ process_backslash_command(PsqlScanState sstate, const char *source)
     my_command->first_line =
         expr_scanner_get_substring(sstate,
                                    start_offset,
-                                   expr_scanner_offset(sstate),
                                    true);

     if (my_command->meta == META_SLEEP)
@@ -5952,8 +5950,6 @@ ParseScript(const char *script, const char *desc, int weight)
     PQExpBufferData line_buf;
     int            alloc_num;
     int            index;
-    int            lineno;
-    int            start_offset;

 #define COMMANDS_ALLOC_NUM 128
     alloc_num = COMMANDS_ALLOC_NUM;
@@ -5977,7 +5973,6 @@ ParseScript(const char *script, const char *desc, int weight)
      * stdstrings should be true, which is a bit riskier.
      */
     psql_scan_setup(sstate, script, strlen(script), 0, true);
-    start_offset = expr_scanner_offset(sstate) - 1;

     initPQExpBuffer(&line_buf);

@@ -5985,12 +5980,15 @@ ParseScript(const char *script, const char *desc, int weight)

     for (;;)
     {
+        int            lineno;
+        int            start_offset;
         PsqlScanResult sr;
         promptStatus_t prompt;
         Command    *command = NULL;

         resetPQExpBuffer(&line_buf);
-        lineno = expr_scanner_get_lineno(sstate, start_offset);
+
+        psql_scan_get_location(sstate, &lineno, &start_offset);

         sr = psql_scan(sstate, &line_buf, &prompt);

diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index f6a883611c5..0ba216e5f72 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -149,11 +149,9 @@ extern yyscan_t expr_scanner_init(PsqlScanState state,
                                   const char *source, int lineno, int start_offset,
                                   const char *command);
 extern void expr_scanner_finish(yyscan_t yyscanner);
-extern int    expr_scanner_offset(PsqlScanState state);
 extern char *expr_scanner_get_substring(PsqlScanState state,
-                                        int start_offset, int end_offset,
+                                        int start_offset,
                                         bool chomp);
-extern int    expr_scanner_get_lineno(PsqlScanState state, int offset);

 extern void syntax_error(const char *source, int lineno, const char *line,
                          const char *command, const char *msg,
diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index b3c6b88e9ca..f2bcaa4cc3a 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
@@ -1079,6 +1079,10 @@ psql_scan_setup(PsqlScanState state,
     /* Set lookaside data in case we have to map unsafe encoding */
     state->curline = state->scanbuf;
     state->refline = state->scanline;
+
+    /* Initialize state for psql_scan_get_location() */
+    state->cur_line_no = 0;        /* yylex not called yet */
+    state->cur_line_start = state->scanbuf;
 }

 /*
@@ -1136,6 +1140,10 @@ psql_scan(PsqlScanState state,
     /* And lex. */
     lexresult = yylex(NULL, state->scanner);

+    /* Notify psql_scan_get_location() that a yylex call has been made. */
+    if (state->cur_line_no == 0)
+        state->cur_line_no = 1;
+
     /*
      * Check termination state and return appropriate result info.
      */
@@ -1311,6 +1319,52 @@ psql_scan_in_quote(PsqlScanState state)
         state->start_state != xqs;
 }

+/*
+ * Return the current scanning location (end+1 of last scanned token),
+ * as a line number counted from 1 and an offset from string start.
+ *
+ * This considers only the outermost input string, and therefore is of
+ * limited use for programs that use psqlscan_push_new_buffer().
+ *
+ * It would be a bit easier probably to use "%option yylineno" to count
+ * lines, but the flex manual says that has a performance cost, and only
+ * a minority of programs using psqlscan have need for this functionality.
+ * So we implement it ourselves without adding overhead to the lexer itself.
+ */
+void
+psql_scan_get_location(PsqlScanState state,
+                       int *lineno, int *offset)
+{
+    const char *line_end;
+
+    /*
+     * We rely on flex's having stored a NUL after the current token in
+     * scanbuf.  Therefore we must specially handle the state before yylex()
+     * has been called, when obviously that won't have happened yet.
+     */
+    if (state->cur_line_no == 0)
+    {
+        *lineno = 1;
+        *offset = 0;
+        return;
+    }
+
+    /*
+     * Advance cur_line_no/cur_line_start if any full lines have been read.
+     * Doing this prevents the operation from being O(N^2) for long inputs.
+     */
+    while ((line_end = strchr(state->cur_line_start, '\n')) != NULL)
+    {
+        state->cur_line_no++;
+        state->cur_line_start = line_end + 1;
+    }
+
+    /* Report current location. */
+    *lineno = state->cur_line_no;
+    *offset = (state->cur_line_start - state->scanbuf) +
+        strlen(state->cur_line_start);
+}
+
 /*
  * Push the given string onto the stack of stuff to scan.
  *
diff --git a/src/include/fe_utils/psqlscan.h b/src/include/fe_utils/psqlscan.h
index 81f792b1733..39d2065fe98 100644
--- a/src/include/fe_utils/psqlscan.h
+++ b/src/include/fe_utils/psqlscan.h
@@ -87,4 +87,7 @@ extern void psql_scan_reselect_sql_lexer(PsqlScanState state);

 extern bool psql_scan_in_quote(PsqlScanState state);

+extern void psql_scan_get_location(PsqlScanState state,
+                                   int *lineno, int *offset);
+
 #endif                            /* PSQLSCAN_H */
diff --git a/src/include/fe_utils/psqlscan_int.h b/src/include/fe_utils/psqlscan_int.h
index 37798114873..453735c9dc8 100644
--- a/src/include/fe_utils/psqlscan_int.h
+++ b/src/include/fe_utils/psqlscan_int.h
@@ -104,6 +104,10 @@ typedef struct PsqlScanStateData
     const char *curline;        /* actual flex input string for cur buf */
     const char *refline;        /* original data for cur buffer */

+    /* status for psql_scan_get_location() */
+    int            cur_line_no;    /* current line#, or 0 if no yylex done */
+    const char *cur_line_start; /* start of cur_line_no'th line in scanbuf */
+
     /*
      * All this state lives across successive input lines, until explicitly
      * reset by psql_scan_reset.  start_state is adopted by yylex() on entry,
--
2.43.5


Re: pgbench client-side performance issue on large scripts

From
Tom Lane
Date:
I wrote:
> I got nerd-sniped by this question and spent some time looking into
> it.

On second look, I'd failed to absorb your point about how the main
loop of ParseScript doesn't need the line number at all; only if
it's a backslash command are we going to use that.  So we can
move the calculation to be done only after we see a backslash.

I'd spent a little time worrying about how the calculation was
really giving a wrong line number: typically, it'd return the
line number of the previous semicolon, since we haven't lexed
any further than that.  That could be fixed with more code,
but it's pretty pointless if we don't need the value in the
first place.

Also, I did a tiny bit of micro-optimization in the first
patch to remove the hazard that it'd still be O(N^2) with
very long input lines.

            regards, tom lane

From 116fe4191f81736f55d851de9115218d3e3f132c Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 25 Feb 2025 18:13:51 -0500
Subject: [PATCH v2 1/2] Get rid of O(N^2) script-parsing overhead in pgbench.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

pgbench wants to record the starting line number of each command
in its scripts.  It was computing that by scanning from the script
start and counting newlines, so that O(N^2) work had to be done
for an N-command script.  In a script with 50K lines, this adds
up to about 10 seconds on my machine.

To add insult to injury, the results were subtly wrong, because
expr_scanner_offset() scanned to find the NUL that flex inserts
at the end of the current token --- and before the first yylex
call, no such NUL has been inserted.  So we ended by computing the
script's last line number not its first one.  This was visible only
in case of \gset at the start of a script, which perhaps accounts
for the lack of complaints.

To fix, steal an idea from plpgsql and track the current lexer
ending position and line count as we advance through the script.
(It's a bit simpler than plpgsql since we can't need to back up.)
Also adjust a couple of other places that were invoking scans
from script start when they didn't really need to.  I made a new
psqlscan function psql_scan_get_location() that replaces both
expr_scanner_offset() and expr_scanner_get_lineno(), since in
practice expr_scanner_get_lineno() was only being invoked to find
the line number of the current lexer end position.

Reported-by: Daniel Vérité <daniel@manitou-mail.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
---
 src/bin/pgbench/exprscan.l          | 74 ++++++++++-------------------
 src/bin/pgbench/pgbench.c           | 14 +++---
 src/bin/pgbench/pgbench.h           |  4 +-
 src/fe_utils/psqlscan.l             | 54 +++++++++++++++++++++
 src/include/fe_utils/psqlscan.h     |  3 ++
 src/include/fe_utils/psqlscan_int.h |  4 ++
 6 files changed, 93 insertions(+), 60 deletions(-)

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 8943a52e9f0..b4800937691 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -271,10 +271,14 @@ void
 expr_yyerror_more(yyscan_t yyscanner, const char *message, const char *more)
 {
     PsqlScanState state = yyget_extra(yyscanner);
-    int            error_detection_offset = expr_scanner_offset(state) - 1;
+    int            lineno;
+    int            error_detection_offset;
     YYSTYPE        lval;
     char       *full_line;

+    psql_scan_get_location(state, &lineno, &error_detection_offset);
+    error_detection_offset--;
+
     /*
      * While parsing an expression, we may not have collected the whole line
      * yet from the input source.  Lex till EOL so we can report whole line.
@@ -289,7 +293,6 @@ expr_yyerror_more(yyscan_t yyscanner, const char *message, const char *more)
     /* Extract the line, trimming trailing newline if any */
     full_line = expr_scanner_get_substring(state,
                                            expr_start_offset,
-                                           expr_scanner_offset(state),
                                            true);

     syntax_error(expr_source, expr_lineno, full_line, expr_command,
@@ -336,12 +339,15 @@ expr_lex_one_word(PsqlScanState state, PQExpBuffer word_buf, int *offset)
     /* And lex. */
     lexresult = yylex(&lval, state->scanner);

-    /*
-     * Save start offset of word, if any.  We could do this more efficiently,
-     * but for now this seems fine.
-     */
+    /* Save start offset of word, if any. */
     if (lexresult)
-        *offset = expr_scanner_offset(state) - word_buf->len;
+    {
+        int            lineno;
+        int            end_offset;
+
+        psql_scan_get_location(state, &lineno, &end_offset);
+        *offset = end_offset - word_buf->len;
+    }
     else
         *offset = -1;

@@ -404,65 +410,35 @@ expr_scanner_finish(yyscan_t yyscanner)
 }

 /*
- * Get offset from start of string to end of current lexer token.
+ * Get a malloc'd copy of the lexer input string from start_offset
+ * to end of current lexer token.  If chomp is true, drop any trailing
+ * newline(s).
  *
  * We rely on the knowledge that flex modifies the scan buffer by storing
  * a NUL at the end of the current token (yytext).  Note that this might
  * not work quite right if we were parsing a sub-buffer, but since pgbench
- * never invokes that functionality, it doesn't matter.
- */
-int
-expr_scanner_offset(PsqlScanState state)
-{
-    return strlen(state->scanbuf);
-}
-
-/*
- * Get a malloc'd copy of the lexer input string from start_offset
- * to just before end_offset.  If chomp is true, drop any trailing
- * newline(s).
+ * never invokes that functionality, it doesn't matter.  Also, this will
+ * give the wrong answer (the whole remainder of the input) if called
+ * before any yylex() call has been done.
  */
 char *
 expr_scanner_get_substring(PsqlScanState state,
-                           int start_offset, int end_offset,
+                           int start_offset,
                            bool chomp)
 {
     char       *result;
-    const char *scanptr = state->scanbuf + start_offset;
-    int            slen = end_offset - start_offset;

-    Assert(slen >= 0);
-    Assert(end_offset <= strlen(state->scanbuf));
+    result = pg_strdup(state->scanbuf + start_offset);

     if (chomp)
     {
+        size_t        slen = strlen(result);
+
         while (slen > 0 &&
-               (scanptr[slen - 1] == '\n' || scanptr[slen - 1] == '\r'))
+               (result[slen - 1] == '\n' || result[slen - 1] == '\r'))
             slen--;
+        result[slen] = '\0';
     }

-    result = (char *) pg_malloc(slen + 1);
-    memcpy(result, scanptr, slen);
-    result[slen] = '\0';
-
     return result;
 }
-
-/*
- * Get the line number associated with the given string offset
- * (which must not be past the end of where we've lexed to).
- */
-int
-expr_scanner_get_lineno(PsqlScanState state, int offset)
-{
-    int            lineno = 1;
-    const char *p = state->scanbuf;
-
-    while (*p && offset > 0)
-    {
-        if (*p == '\n')
-            lineno++;
-        p++, offset--;
-    }
-    return lineno;
-}
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index fdc957fa34d..38f8bc11bcd 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5690,8 +5690,8 @@ process_backslash_command(PsqlScanState sstate, const char *source)
     initPQExpBuffer(&word_buf);

     /* Remember location of the backslash */
-    start_offset = expr_scanner_offset(sstate) - 1;
-    lineno = expr_scanner_get_lineno(sstate, start_offset);
+    psql_scan_get_location(sstate, &lineno, &start_offset);
+    start_offset--;

     /* Collect first word of command */
     if (!expr_lex_one_word(sstate, &word_buf, &word_offset))
@@ -5747,7 +5747,6 @@ process_backslash_command(PsqlScanState sstate, const char *source)
         my_command->first_line =
             expr_scanner_get_substring(sstate,
                                        start_offset,
-                                       expr_scanner_offset(sstate),
                                        true);

         expr_scanner_finish(yyscanner);
@@ -5777,7 +5776,6 @@ process_backslash_command(PsqlScanState sstate, const char *source)
     my_command->first_line =
         expr_scanner_get_substring(sstate,
                                    start_offset,
-                                   expr_scanner_offset(sstate),
                                    true);

     if (my_command->meta == META_SLEEP)
@@ -5952,8 +5950,6 @@ ParseScript(const char *script, const char *desc, int weight)
     PQExpBufferData line_buf;
     int            alloc_num;
     int            index;
-    int            lineno;
-    int            start_offset;

 #define COMMANDS_ALLOC_NUM 128
     alloc_num = COMMANDS_ALLOC_NUM;
@@ -5977,7 +5973,6 @@ ParseScript(const char *script, const char *desc, int weight)
      * stdstrings should be true, which is a bit riskier.
      */
     psql_scan_setup(sstate, script, strlen(script), 0, true);
-    start_offset = expr_scanner_offset(sstate) - 1;

     initPQExpBuffer(&line_buf);

@@ -5985,12 +5980,15 @@ ParseScript(const char *script, const char *desc, int weight)

     for (;;)
     {
+        int            lineno;
+        int            start_offset;
         PsqlScanResult sr;
         promptStatus_t prompt;
         Command    *command = NULL;

         resetPQExpBuffer(&line_buf);
-        lineno = expr_scanner_get_lineno(sstate, start_offset);
+
+        psql_scan_get_location(sstate, &lineno, &start_offset);

         sr = psql_scan(sstate, &line_buf, &prompt);

diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index f6a883611c5..0ba216e5f72 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -149,11 +149,9 @@ extern yyscan_t expr_scanner_init(PsqlScanState state,
                                   const char *source, int lineno, int start_offset,
                                   const char *command);
 extern void expr_scanner_finish(yyscan_t yyscanner);
-extern int    expr_scanner_offset(PsqlScanState state);
 extern char *expr_scanner_get_substring(PsqlScanState state,
-                                        int start_offset, int end_offset,
+                                        int start_offset,
                                         bool chomp);
-extern int    expr_scanner_get_lineno(PsqlScanState state, int offset);

 extern void syntax_error(const char *source, int lineno, const char *line,
                          const char *command, const char *msg,
diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index b3c6b88e9ca..0978a515b7a 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
@@ -1079,6 +1079,10 @@ psql_scan_setup(PsqlScanState state,
     /* Set lookaside data in case we have to map unsafe encoding */
     state->curline = state->scanbuf;
     state->refline = state->scanline;
+
+    /* Initialize state for psql_scan_get_location() */
+    state->cur_line_no = 0;        /* yylex not called yet */
+    state->cur_line_ptr = state->scanbuf;
 }

 /*
@@ -1136,6 +1140,10 @@ psql_scan(PsqlScanState state,
     /* And lex. */
     lexresult = yylex(NULL, state->scanner);

+    /* Notify psql_scan_get_location() that a yylex call has been made. */
+    if (state->cur_line_no == 0)
+        state->cur_line_no = 1;
+
     /*
      * Check termination state and return appropriate result info.
      */
@@ -1311,6 +1319,52 @@ psql_scan_in_quote(PsqlScanState state)
         state->start_state != xqs;
 }

+/*
+ * Return the current scanning location (end+1 of last scanned token),
+ * as a line number counted from 1 and an offset from string start.
+ *
+ * This considers only the outermost input string, and therefore is of
+ * limited use for programs that use psqlscan_push_new_buffer().
+ *
+ * It would be a bit easier probably to use "%option yylineno" to count
+ * lines, but the flex manual says that has a performance cost, and only
+ * a minority of programs using psqlscan have need for this functionality.
+ * So we implement it ourselves without adding overhead to the lexer itself.
+ */
+void
+psql_scan_get_location(PsqlScanState state,
+                       int *lineno, int *offset)
+{
+    const char *line_end;
+
+    /*
+     * We rely on flex's having stored a NUL after the current token in
+     * scanbuf.  Therefore we must specially handle the state before yylex()
+     * has been called, when obviously that won't have happened yet.
+     */
+    if (state->cur_line_no == 0)
+    {
+        *lineno = 1;
+        *offset = 0;
+        return;
+    }
+
+    /*
+     * Advance cur_line_no/cur_line_ptr past whatever has been lexed so far.
+     * Doing this prevents the operation from being O(N^2) for long inputs.
+     */
+    while ((line_end = strchr(state->cur_line_ptr, '\n')) != NULL)
+    {
+        state->cur_line_no++;
+        state->cur_line_ptr = line_end + 1;
+    }
+    state->cur_line_ptr += strlen(state->cur_line_ptr);
+
+    /* Report current location. */
+    *lineno = state->cur_line_no;
+    *offset = state->cur_line_ptr - state->scanbuf;
+}
+
 /*
  * Push the given string onto the stack of stuff to scan.
  *
diff --git a/src/include/fe_utils/psqlscan.h b/src/include/fe_utils/psqlscan.h
index 81f792b1733..39d2065fe98 100644
--- a/src/include/fe_utils/psqlscan.h
+++ b/src/include/fe_utils/psqlscan.h
@@ -87,4 +87,7 @@ extern void psql_scan_reselect_sql_lexer(PsqlScanState state);

 extern bool psql_scan_in_quote(PsqlScanState state);

+extern void psql_scan_get_location(PsqlScanState state,
+                                   int *lineno, int *offset);
+
 #endif                            /* PSQLSCAN_H */
diff --git a/src/include/fe_utils/psqlscan_int.h b/src/include/fe_utils/psqlscan_int.h
index 37798114873..2a3a9d7c82a 100644
--- a/src/include/fe_utils/psqlscan_int.h
+++ b/src/include/fe_utils/psqlscan_int.h
@@ -104,6 +104,10 @@ typedef struct PsqlScanStateData
     const char *curline;        /* actual flex input string for cur buf */
     const char *refline;        /* original data for cur buffer */

+    /* status for psql_scan_get_location() */
+    int            cur_line_no;    /* current line#, or 0 if no yylex done */
+    const char *cur_line_ptr;    /* points into cur_line_no'th line in scanbuf */
+
     /*
      * All this state lives across successive input lines, until explicitly
      * reset by psql_scan_reset.  start_state is adopted by yylex() on entry,
--
2.43.5

From cea342c2cf2a3db299cac520299cb83ad8016ec6 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 25 Feb 2025 19:10:12 -0500
Subject: [PATCH v2 2/2] Avoid unnecessary computation of pgbench's script line
 number.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ParseScript only needs the lineno for meta-commands, so let's not
bother computing it otherwise.  While this doesn't save much given
the previous patch, there's no point in doing unnecessary work.
While we're at it, avoid calling psql_scan_get_location() twice for
a meta-command.

One reason for making this change is that the line number computed
in ParseScript's main loop was actually wrong in most cases: it
would point just past the semicolon of the previous SQL command,
not at what the user thinks the current command's line number is.
We could add some code to skip whitespace before capturing the line
number, but it would be pretty pointless at present.  Just move the
call to avoid the temptation to rely on that value.  (Once we've
lexed the backslash, the line number will be right.)

This change also means that pgbench never inquires about the
location before it's lexed something, so that the care taken in
the previous patch to behave sanely in that case is unnecessary.
It seems best to keep that logic, though, as future callers
might depend on it.

Author: Daniel Vérité <daniel@manitou-mail.org>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
---
 src/bin/pgbench/pgbench.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 38f8bc11bcd..bf099aab278 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5677,22 +5677,17 @@ postprocess_sql_command(Command *my_command)
  * At call, we have scanned only the initial backslash.
  */
 static Command *
-process_backslash_command(PsqlScanState sstate, const char *source)
+process_backslash_command(PsqlScanState sstate, const char *source,
+                          int lineno, int start_offset)
 {
     Command    *my_command;
     PQExpBufferData word_buf;
     int            word_offset;
     int            offsets[MAX_ARGS];    /* offsets of argument words */
-    int            start_offset;
-    int            lineno;
     int            j;

     initPQExpBuffer(&word_buf);

-    /* Remember location of the backslash */
-    psql_scan_get_location(sstate, &lineno, &start_offset);
-    start_offset--;
-
     /* Collect first word of command */
     if (!expr_lex_one_word(sstate, &word_buf, &word_offset))
     {
@@ -5980,16 +5975,12 @@ ParseScript(const char *script, const char *desc, int weight)

     for (;;)
     {
-        int            lineno;
-        int            start_offset;
         PsqlScanResult sr;
         promptStatus_t prompt;
         Command    *command = NULL;

         resetPQExpBuffer(&line_buf);

-        psql_scan_get_location(sstate, &lineno, &start_offset);
-
         sr = psql_scan(sstate, &line_buf, &prompt);

         /* If we collected a new SQL command, process that */
@@ -6002,7 +5993,15 @@ ParseScript(const char *script, const char *desc, int weight)
         /* If we reached a backslash, process that */
         if (sr == PSCAN_BACKSLASH)
         {
-            command = process_backslash_command(sstate, desc);
+            int            lineno;
+            int            start_offset;
+
+            /* Capture location of the backslash */
+            psql_scan_get_location(sstate, &lineno, &start_offset);
+            start_offset--;
+
+            command = process_backslash_command(sstate, desc,
+                                                lineno, start_offset);

             if (command)
             {
--
2.43.5


Re: pgbench client-side performance issue on large scripts

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> > I got nerd-sniped by this question and spent some time looking into
> > it.

Thank you for the patch! LGTM.


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: pgbench client-side performance issue on large scripts

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Tom Lane wrote:
>> I got nerd-sniped by this question and spent some time looking into
>> it.

> Thank you for the patch! LGTM.

Thanks for reviewing!  Pushed after a tiny bit more polishing.

            regards, tom lane