Thread: Performance issues with v18 SQL-language-function changes

Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

13 April, 22:23:24

In the thread that led up to commit 0dca5d68d [1], we'd convinced
ourselves that the new implementation was faster than the old.
So I was sad to discover that there are common cases where it's
a good bit slower.  We'd focused too much on test methods like

do $$
begin
  for i in 1..10000000 loop
    perform fx((random()*100)::int);
  end loop;
end;
$$;

The thing about this test case is that the SQL function under
test is executed only once per calling query (i.e., per PERFORM).
That doesn't allow the old implementation to amortize anything.
If you instead test cases like

create function fx(p_summa bigint) returns text immutable strict
return ltrim(to_char(p_summa, '999 999 999 999 999 999 999 999'));

explain analyze select fx(i) from generate_series(1,1000000) as i(i);

you arrive at the rude discovery that 0dca5d68d is about 50% slower
than 0dca5d68d^, because the old implementation builds a plan for fx()
only once and then re-uses it throughout the query.  So does the new
implementation, but it has added GetCachedPlan overhead.  Moreover,
I made the unforced error of deciding that we could tear down and
rebuild the SQLFunctionCache and subsidiary data structures for
each call.  That overhead is relatively minor in comparison to the
cost of parsing and planning the function; but when comparing cases
where there's no repeated parsing and planning, it becomes
significant.

Hence, the attached patch series reverts that decision and goes back
to the old method of having the SQLFunctionCache and some associated
objects live as long as the FmgrInfo does (without, however, the
poorly-controlled memory leaks of the old code).  In my testing
this gets us from a 50% penalty down to about 5%, which I think is
acceptable given the other benefits 0dca5d68d brought us.

I'm inclined to argue that, seeing that 0dca5d68d was mainly intended
as a performance feature, this performance loss is a bug that we
need to do something about even though we're post-feature-freeze.
We could either revert 0dca5d68d or apply the attached.  I'd prefer
the latter.

(There are other things we could do to try to reduce the overhead
further, such as trying to not build a Tuplestore or JunkFilter in
simple cases.  But that seems like new work not a fix for a bad
decision in existing work, so I think it's out of scope for v18.)

Comments?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/8216639.NyiUUSuA9g%40aivenlaptop

From c87a37d00b847e86e5286c9c46668410019b0043 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 13 Apr 2025 13:37:46 -0400
Subject: [PATCH v1 1/5] Minor performance improvement for SQL-language
 functions.

Late in the development of commit 0dca5d68d, I added a step to copy
the result tlist we extract from the cached final query, because
I was afraid that that might not last as long as the JunkFilter that
we're passing it off to.  However, that turns out to cost a noticeable
number of cycles, and it's really quite unnecessary because the
JunkFilter will not examine that tlist after it's been created.
(ExecFindJunkAttribute would use it, but we don't use that function
on this JunkFilter.)  Hence, remove the copy step.  For safety,
reset the might-become-dangling jf_targetList pointer to NIL.

In passing, remove DR_sqlfunction.cxt, which we don't use anymore;
it's confusing because it's not entirely clear which context it
ought to point at.
---
 src/backend/executor/functions.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 53ff614d87b..d3f05c7d2c7 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -45,7 +45,6 @@ typedef struct
 {
     DestReceiver pub;            /* publicly-known function pointers */
     Tuplestorestate *tstore;    /* where to put result tuples */
-    MemoryContext cxt;            /* context containing tstore */
     JunkFilter *filter;            /* filter to convert tuple type */
 } DR_sqlfunction;

@@ -787,12 +786,6 @@ init_execution_state(SQLFunctionCachePtr fcache)
          */
         resulttlist = get_sql_fn_result_tlist(plansource->query_list);

-        /*
-         * We need to make a copy to ensure that it doesn't disappear
-         * underneath us due to plancache invalidation.
-         */
-        resulttlist = copyObject(resulttlist);
-
         /*
          * If the result is composite, *and* we are returning the whole tuple
          * result, we need to insert nulls for any dropped columns.  In the
@@ -807,6 +800,17 @@ init_execution_state(SQLFunctionCachePtr fcache)
                                                               slot);
         else
             fcache->junkFilter = ExecInitJunkFilter(resulttlist, slot);
+
+        /*
+         * The resulttlist tree belongs to the plancache and might disappear
+         * underneath us due to plancache invalidation.  While we could
+         * forestall that by copying it, that'd just be a waste of cycles,
+         * because the junkfilter doesn't need it anymore.  (It'd only be used
+         * by ExecFindJunkAttribute(), which we don't use here.)  To ensure
+         * there's not a dangling pointer laying about, clear the junkFilter's
+         * pointer.
+         */
+        fcache->junkFilter->jf_targetList = NIL;
     }

     if (fcache->func->returnsTuple)
@@ -1245,7 +1249,6 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
         myState = (DR_sqlfunction *) dest;
         Assert(myState->pub.mydest == DestSQLFunction);
         myState->tstore = fcache->tstore;
-        myState->cxt = CurrentMemoryContext;
         myState->filter = fcache->junkFilter;
     }
     else
--
2.43.5

From 7d427b796978a45ca668aede83e0705a81df7f3f Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 13 Apr 2025 13:57:16 -0400
Subject: [PATCH v1 2/5] Make functions.c mostly run in a short-lived memory
 context.

Previously, much of this code ran with CurrentMemoryContext set
to be the function's fcontext, so that we tended to leak a lot of
stuff there.  Commit 0dca5d68d dealt with that by releasing the
fcontext at the completion of each SQL function call, but we'd
like to go back to the previous approach of allowing the fcontext
to be query-lifespan.  To control the leakage problem, rearrange
the code so that we mostly run in the memory context that fmgr_sql
is called in (which we expect to be short-lived).  Notably, this
means that parsing/planning is all done in the short-lived context
and doesn't leak cruft into fcontext.

This patch also fixes the allocation of execution_state records
so that we don't leak them across executions.  I set that up
with a re-usable array that contains at least as many
execution_state structs as we need for the current querytree.
The chain structure is still there, but it's not really doing
much for us, and maybe somebody will be motivated to get rid
of it.  I'm not though.

This incidentally also moves the call of BlessTupleDesc to be
with the code that creates the JunkFilter.  That doesn't make
much difference now, but a later patch will reduce the number
of times the JunkFilter gets made, and we needn't bless the
results any more often than that.

We still leak a fair amount in fcontext, particularly when
executing utility statements, but that's material for a
separate patch step; the point here is only to get rid of
unintentional allocations in fcontext.
---
 src/backend/executor/functions.c | 153 +++++++++++++++++++------------
 1 file changed, 95 insertions(+), 58 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index d3f05c7d2c7..b264060d33d 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -55,7 +55,9 @@ typedef struct
  *
  * The "next" fields chain together all the execution_state records generated
  * from a single original parsetree.  (There will only be more than one in
- * case of rule expansion of the original parsetree.)
+ * case of rule expansion of the original parsetree.)  The chain structure is
+ * quite vestigial at this point, because we allocate the records in an array
+ * for ease of memory management.  But we'll get rid of it some other day.
  */
 typedef enum
 {
@@ -158,17 +160,20 @@ typedef struct SQLFunctionCache

     /*
      * While executing a particular query within the function, cplan is the
-     * CachedPlan we've obtained for that query, and eslist is a list of
+     * CachedPlan we've obtained for that query, and eslist is a chain of
      * execution_state records for the individual plans within the CachedPlan.
      * next_query_index is the 0-based index of the next CachedPlanSource to
      * get a CachedPlan from.
      */
     CachedPlan *cplan;            /* Plan for current query, if any */
     ResourceOwner cowner;        /* CachedPlan is registered with this owner */
-    execution_state *eslist;    /* execution_state records */
     int            next_query_index;    /* index of next CachedPlanSource to run */

-    /* if positive, this is the index of the query we're processing */
+    execution_state *eslist;    /* chain of execution_state records */
+    execution_state *esarray;    /* storage for eslist */
+    int            esarray_len;    /* allocated length of esarray[] */
+
+    /* if positive, this is the 1-based index of the query we're processing */
     int            error_query_index;

     MemoryContext fcontext;        /* memory context holding this struct and all
@@ -212,13 +217,12 @@ static void sql_delete_callback(CachedFunction *cfunc);
 static void sql_postrewrite_callback(List *querytree_list, void *arg);
 static void postquel_start(execution_state *es, SQLFunctionCachePtr fcache);
 static bool postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache);
-static void postquel_end(execution_state *es);
+static void postquel_end(execution_state *es, SQLFunctionCachePtr fcache);
 static void postquel_sub_params(SQLFunctionCachePtr fcache,
                                 FunctionCallInfo fcinfo);
 static Datum postquel_get_single_result(TupleTableSlot *slot,
                                         FunctionCallInfo fcinfo,
-                                        SQLFunctionCachePtr fcache,
-                                        MemoryContext resultcontext);
+                                        SQLFunctionCachePtr fcache);
 static void sql_compile_error_callback(void *arg);
 static void sql_exec_error_callback(void *arg);
 static void ShutdownSQLFunction(Datum arg);
@@ -658,12 +662,11 @@ init_execution_state(SQLFunctionCachePtr fcache)
     CachedPlanSource *plansource;
     execution_state *preves = NULL;
     execution_state *lasttages = NULL;
+    int            nstmts;
     ListCell   *lc;

     /*
-     * Clean up after previous query, if there was one.  Note that we just
-     * leak the old execution_state records until end of function execution;
-     * there aren't likely to be enough of them to matter.
+     * Clean up after previous query, if there was one.
      */
     if (fcache->cplan)
     {
@@ -704,6 +707,22 @@ init_execution_state(SQLFunctionCachePtr fcache)
                                   fcache->cowner,
                                   NULL);

+    /*
+     * If necessary, make esarray[] bigger to hold the needed state.
+     */
+    nstmts = list_length(fcache->cplan->stmt_list);
+    if (nstmts > fcache->esarray_len)
+    {
+        if (fcache->esarray == NULL)
+            fcache->esarray = (execution_state *)
+                MemoryContextAlloc(fcache->fcontext,
+                                   sizeof(execution_state) * nstmts);
+        else
+            fcache->esarray = repalloc_array(fcache->esarray,
+                                             execution_state, nstmts);
+        fcache->esarray_len = nstmts;
+    }
+
     /*
      * Build execution_state list to match the number of contained plans.
      */
@@ -740,7 +759,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
                             CreateCommandName((Node *) stmt))));

         /* OK, build the execution_state for this query */
-        newes = (execution_state *) palloc(sizeof(execution_state));
+        newes = &fcache->esarray[foreach_current_index(lc)];
         if (preves)
             preves->next = newes;
         else
@@ -775,9 +794,14 @@ init_execution_state(SQLFunctionCachePtr fcache)
      */
     if (fcache->func->rettype != VOIDOID)
     {
-        TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL,
-                                                        &TTSOpsMinimalTuple);
+        TupleTableSlot *slot;
         List       *resulttlist;
+        MemoryContext oldcontext;
+
+        /* The result slot and JunkFilter must be in the fcontext */
+        oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+
+        slot = MakeSingleTupleTableSlot(NULL, &TTSOpsMinimalTuple);

         /*
          * Re-fetch the (possibly modified) output tlist of the final
@@ -811,14 +835,17 @@ init_execution_state(SQLFunctionCachePtr fcache)
          * pointer.
          */
         fcache->junkFilter->jf_targetList = NIL;
-    }

-    if (fcache->func->returnsTuple)
-    {
         /* Make sure output rowtype is properly blessed */
-        BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor);
+        if (fcache->func->returnsTuple)
+            BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor);
+
+        MemoryContextSwitchTo(oldcontext);
     }
-    else if (fcache->func->returnsSet && type_is_rowtype(fcache->func->rettype))
+
+    if (fcache->func->returnsSet &&
+        !fcache->func->returnsTuple &&
+        type_is_rowtype(fcache->func->rettype))
     {
         /*
          * Returning rowtype as if it were scalar --- materialize won't work.
@@ -1230,12 +1257,23 @@ static void
 postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 {
     DestReceiver *dest;
+    MemoryContext oldcontext;

     Assert(es->qd == NULL);

     /* Caller should have ensured a suitable snapshot is active */
     Assert(ActiveSnapshotSet());

+    /*
+     * Run the sub-executor in a child of fcontext.  The sub-executor is
+     * responsible for deleting per-tuple information.  (XXX in the case of a
+     * long-lived FmgrInfo, this policy potentially causes memory leakage, but
+     * it's not very clear where we could keep stuff instead.  Fortunately,
+     * there are few if any cases where set-returning functions are invoked
+     * via FmgrInfos that would outlive the calling query.)
+     */
+    oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+
     /*
      * If this query produces the function result, send its output to the
      * tuplestore; else discard any output.
@@ -1285,6 +1323,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
     }

     es->status = F_EXEC_RUN;
+
+    MemoryContextSwitchTo(oldcontext);
 }

 /* Run one execution_state; either to completion or to first result row */
@@ -1293,6 +1333,10 @@ static bool
 postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache)
 {
     bool        result;
+    MemoryContext oldcontext;
+
+    /* Run the sub-executor in a child of fcontext */
+    oldcontext = MemoryContextSwitchTo(fcache->fcontext);

     if (es->qd->operation == CMD_UTILITY)
     {
@@ -1320,13 +1364,20 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache)
         result = (count == 0 || es->qd->estate->es_processed == 0);
     }

+    MemoryContextSwitchTo(oldcontext);
+
     return result;
 }

 /* Shut down execution of one execution_state node */
 static void
-postquel_end(execution_state *es)
+postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 {
+    MemoryContext oldcontext;
+
+    /* Run the sub-executor in a child of fcontext */
+    oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+
     /* mark status done to ensure we don't do ExecutorEnd twice */
     es->status = F_EXEC_DONE;

@@ -1341,9 +1392,16 @@ postquel_end(execution_state *es)

     FreeQueryDesc(es->qd);
     es->qd = NULL;
+
+    MemoryContextSwitchTo(oldcontext);
 }

-/* Build ParamListInfo array representing current arguments */
+/*
+ * Build ParamListInfo array representing current arguments.
+ *
+ * This must be called in the fcontext so that the results have adequate
+ * lifespan.
+ */
 static void
 postquel_sub_params(SQLFunctionCachePtr fcache,
                     FunctionCallInfo fcinfo)
@@ -1398,25 +1456,22 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 /*
  * Extract the SQL function's value from a single result row.  This is used
  * both for scalar (non-set) functions and for each row of a lazy-eval set
- * result.
+ * result.  We expect the current memory context to be that of the caller
+ * of fmgr_sql.
  */
 static Datum
 postquel_get_single_result(TupleTableSlot *slot,
                            FunctionCallInfo fcinfo,
-                           SQLFunctionCachePtr fcache,
-                           MemoryContext resultcontext)
+                           SQLFunctionCachePtr fcache)
 {
     Datum        value;
-    MemoryContext oldcontext;

     /*
      * Set up to return the function value.  For pass-by-reference datatypes,
-     * be sure to allocate the result in resultcontext, not the current memory
-     * context (which has query lifespan).  We can't leave the data in the
-     * TupleTableSlot because we intend to clear the slot before returning.
+     * be sure to copy the result into the current context.  We can't leave
+     * the data in the TupleTableSlot because we intend to clear the slot
+     * before returning.
      */
-    oldcontext = MemoryContextSwitchTo(resultcontext);
-
     if (fcache->func->returnsTuple)
     {
         /* We must return the whole tuple as a Datum. */
@@ -1427,7 +1482,7 @@ postquel_get_single_result(TupleTableSlot *slot,
     {
         /*
          * Returning a scalar, which we have to extract from the first column
-         * of the SELECT result, and then copy into result context if needed.
+         * of the SELECT result, and then copy into current context if needed.
          */
         value = slot_getattr(slot, 1, &(fcinfo->isnull));

@@ -1435,8 +1490,6 @@ postquel_get_single_result(TupleTableSlot *slot,
             value = datumCopy(value, fcache->func->typbyval, fcache->func->typlen);
     }

-    MemoryContextSwitchTo(oldcontext);
-
     return value;
 }

@@ -1450,7 +1503,6 @@ fmgr_sql(PG_FUNCTION_ARGS)
     SQLFunctionLink *flink;
     ErrorContextCallback sqlerrcontext;
     MemoryContext tscontext;
-    MemoryContext oldcontext;
     bool        randomAccess;
     bool        lazyEvalOK;
     bool        pushed_snapshot;
@@ -1507,20 +1559,14 @@ fmgr_sql(PG_FUNCTION_ARGS)
      * Build tuplestore to hold results, if we don't have one already.  Make
      * sure it's in a suitable context.
      */
-    oldcontext = MemoryContextSwitchTo(tscontext);
-
     if (!fcache->tstore)
-        fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+    {
+        MemoryContext oldcontext;

-    /*
-     * Switch to context in which the fcache lives.  The sub-executor is
-     * responsible for deleting per-tuple information.  (XXX in the case of a
-     * long-lived FmgrInfo, this policy potentially causes memory leakage, but
-     * it's not very clear where we could keep stuff instead.  Fortunately,
-     * there are few if any cases where set-returning functions are invoked
-     * via FmgrInfos that would outlive the calling query.)
-     */
-    MemoryContextSwitchTo(fcache->fcontext);
+        oldcontext = MemoryContextSwitchTo(tscontext);
+        fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+        MemoryContextSwitchTo(oldcontext);
+    }

     /*
      * Find first unfinished execution_state.  If none, advance to the next
@@ -1598,7 +1644,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
          * don't care about fetching any more result rows.
          */
         if (completed || !fcache->func->returnsSet)
-            postquel_end(es);
+            postquel_end(es, fcache);

         /*
          * Break from loop if we didn't shut down (implying we got a
@@ -1657,8 +1703,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             if (!tuplestore_gettupleslot(fcache->tstore, true, false, slot))
                 elog(ERROR, "failed to fetch lazy-eval tuple");
             /* Extract the result as a datum, and copy out from the slot */
-            result = postquel_get_single_result(slot, fcinfo,
-                                                fcache, oldcontext);
+            result = postquel_get_single_result(slot, fcinfo, fcache);
             /* Clear the tuplestore, but keep it for next time */
             /* NB: this might delete the slot's content, but we don't care */
             tuplestore_clear(fcache->tstore);
@@ -1716,12 +1761,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             fcache->tstore = NULL;
             /* must copy desc because execSRF.c will free it */
             if (fcache->junkFilter)
-            {
-                /* setDesc must be allocated in suitable context */
-                MemoryContextSwitchTo(tscontext);
                 rsi->setDesc = CreateTupleDescCopy(fcache->junkFilter->jf_cleanTupType);
-                MemoryContextSwitchTo(fcache->fcontext);
-            }

             fcinfo->isnull = true;
             result = (Datum) 0;
@@ -1746,8 +1786,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             /* Re-use the junkfilter's output slot to fetch back the tuple */
             slot = fcache->junkFilter->jf_resultSlot;
             if (tuplestore_gettupleslot(fcache->tstore, true, false, slot))
-                result = postquel_get_single_result(slot, fcinfo,
-                                                    fcache, oldcontext);
+                result = postquel_get_single_result(slot, fcinfo, fcache);
             else
             {
                 fcinfo->isnull = true;
@@ -1770,8 +1809,6 @@ fmgr_sql(PG_FUNCTION_ARGS)
     if (pushed_snapshot)
         PopActiveSnapshot();

-    MemoryContextSwitchTo(oldcontext);
-
     /*
      * If we've gone through every command in the function, we are done.
      * Release the cache to start over again on next call.
@@ -1889,7 +1926,7 @@ ShutdownSQLFunction(Datum arg)
                 if (!fcache->func->readonly_func)
                     PushActiveSnapshot(es->qd->snapshot);

-                postquel_end(es);
+                postquel_end(es, fcache);

                 if (!fcache->func->readonly_func)
                     PopActiveSnapshot();
--
2.43.5

From 68fe63f11e10a6ececd1056747b6a7c8d6969c7b Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 13 Apr 2025 13:59:19 -0400
Subject: [PATCH v1 3/5] Split some storage out to separate subcontexts of
 fcontext.

Put the JunkFilter and its result slot (and thence also
some subsidiary data such as the result tupledesc) into a
separate subcontext "jfcontext".  This doesn't accomplish
a lot at this point, because we make a new JunkFilter each
time through the SQL function.  However, the plan is to make
the fcontext live longer, and that raises the possibility
that we'll need a new JunkFilter because the plan for the
result-generating query changes.  A separate context makes
it easy to free the obsoleted data when that happens.

Also, instead of always running the sub-executor in fcontext,
make a separate context for it if we're doing lazy eval of
a SRF, and otherwise just run it inside CurrentMemoryContext.

The combination of these steps reduces the expected size of
fcontext enough that we can use ALLOCSET_SMALL_SIZES.
---
 src/backend/executor/functions.c | 82 ++++++++++++++++++++++++++------
 1 file changed, 67 insertions(+), 15 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index b264060d33d..12121ad74e7 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -107,6 +107,8 @@ typedef struct execution_state
  * which we free at completion.  In non-returnsSet mode, this is just a child
  * of the call-time context.  In returnsSet mode, it is made a child of the
  * FmgrInfo's fn_mcxt so that it will survive between fmgr_sql calls.
+ * The fcontext may have subsidiary contexts jfcontext and/or subcontext,
+ * which have somewhat shorter lifespans.
  *
  * 3. SQLFunctionLink is a tiny struct that just holds pointers to
  * the SQLFunctionHashEntry and the current SQLFunctionCache (if any).
@@ -151,6 +153,7 @@ typedef struct SQLFunctionCache
     bool        lazyEvalOK;        /* true if lazyEval is safe */
     bool        shutdown_reg;    /* true if registered shutdown callback */
     bool        lazyEval;        /* true if using lazyEval for result query */
+    bool        ownSubcontext;    /* is subcontext really a separate context? */

     ParamListInfo paramLI;        /* Param list representing current args */

@@ -178,6 +181,9 @@ typedef struct SQLFunctionCache

     MemoryContext fcontext;        /* memory context holding this struct and all
                                  * subsidiary data */
+    MemoryContext jfcontext;    /* subsidiary memory context holding
+                                 * junkFilter, result slot, and related data */
+    MemoryContext subcontext;    /* subsidiary memory context for sub-executor */
 } SQLFunctionCache;

 typedef SQLFunctionCache *SQLFunctionCachePtr;
@@ -617,8 +623,8 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
      */
     pcontext = func->returnsSet ? finfo->fn_mcxt : CurrentMemoryContext;
     fcontext = AllocSetContextCreate(pcontext,
-                                     "SQL function execution",
-                                     ALLOCSET_DEFAULT_SIZES);
+                                     "SQL function cache",
+                                     ALLOCSET_SMALL_SIZES);

     oldcontext = MemoryContextSwitchTo(fcontext);

@@ -791,6 +797,9 @@ init_execution_state(SQLFunctionCachePtr fcache)
      * nothing to coerce to.  (XXX Frequently, the JunkFilter isn't doing
      * anything very interesting, but much of this module expects it to be
      * there anyway.)
+     *
+     * The JunkFilter, its result slot, and its tupledesc are kept in a
+     * subsidiary memory context so that we can free them easily when needed.
      */
     if (fcache->func->rettype != VOIDOID)
     {
@@ -798,8 +807,14 @@ init_execution_state(SQLFunctionCachePtr fcache)
         List       *resulttlist;
         MemoryContext oldcontext;

-        /* The result slot and JunkFilter must be in the fcontext */
-        oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+        /* Create or reset the jfcontext */
+        if (fcache->jfcontext == NULL)
+            fcache->jfcontext = AllocSetContextCreate(fcache->fcontext,
+                                                      "SQL function junkfilter",
+                                                      ALLOCSET_SMALL_SIZES);
+        else
+            MemoryContextReset(fcache->jfcontext);
+        oldcontext = MemoryContextSwitchTo(fcache->jfcontext);

         slot = MakeSingleTupleTableSlot(NULL, &TTSOpsMinimalTuple);

@@ -1265,14 +1280,46 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
     Assert(ActiveSnapshotSet());

     /*
-     * Run the sub-executor in a child of fcontext.  The sub-executor is
-     * responsible for deleting per-tuple information.  (XXX in the case of a
-     * long-lived FmgrInfo, this policy potentially causes memory leakage, but
-     * it's not very clear where we could keep stuff instead.  Fortunately,
-     * there are few if any cases where set-returning functions are invoked
-     * via FmgrInfos that would outlive the calling query.)
+     * In lazyEval mode for a SRF, we must run the sub-executor in a child of
+     * fcontext, so that it can survive across multiple calls to fmgr_sql.
+     * (XXX in the case of a long-lived FmgrInfo, this policy potentially
+     * causes memory leakage, but it's not very clear where we could keep
+     * stuff instead.  Fortunately, there are few if any cases where
+     * set-returning functions are invoked via FmgrInfos that would outlive
+     * the calling query.)  Otherwise, we're going to run it to completion
+     * before exiting fmgr_sql, so it can perfectly well run in the caller's
+     * context.
      */
-    oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+    if (es->lazyEval && fcache->func->returnsSet)
+    {
+        fcache->subcontext = AllocSetContextCreate(fcache->fcontext,
+                                                   "SQL function execution",
+                                                   ALLOCSET_DEFAULT_SIZES);
+        fcache->ownSubcontext = true;
+    }
+    else if (es->stmt->commandType == CMD_UTILITY)
+    {
+        /*
+         * The code path using a sub-executor is pretty good about cleaning up
+         * cruft, since the executor will make its own sub-context.  We don't
+         * really need an additional layer of sub-context in that case.
+         * However, if this is a utility statement, it won't make its own
+         * sub-context, so it seems advisable to make one that we can free on
+         * completion.
+         */
+        fcache->subcontext = AllocSetContextCreate(CurrentMemoryContext,
+                                                   "SQL function execution",
+                                                   ALLOCSET_DEFAULT_SIZES);
+        fcache->ownSubcontext = true;
+    }
+    else
+    {
+        fcache->subcontext = CurrentMemoryContext;
+        fcache->ownSubcontext = false;
+    }
+
+    /* Switch into the selected subcontext (might be a no-op) */
+    oldcontext = MemoryContextSwitchTo(fcache->subcontext);

     /*
      * If this query produces the function result, send its output to the
@@ -1335,8 +1382,8 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache)
     bool        result;
     MemoryContext oldcontext;

-    /* Run the sub-executor in a child of fcontext */
-    oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+    /* Run the sub-executor in subcontext */
+    oldcontext = MemoryContextSwitchTo(fcache->subcontext);

     if (es->qd->operation == CMD_UTILITY)
     {
@@ -1375,8 +1422,8 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 {
     MemoryContext oldcontext;

-    /* Run the sub-executor in a child of fcontext */
-    oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+    /* Run the sub-executor in subcontext */
+    oldcontext = MemoryContextSwitchTo(fcache->subcontext);

     /* mark status done to ensure we don't do ExecutorEnd twice */
     es->status = F_EXEC_DONE;
@@ -1394,6 +1441,11 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
     es->qd = NULL;

     MemoryContextSwitchTo(oldcontext);
+
+    /* Delete the subcontext, if it's actually a separate context */
+    if (fcache->ownSubcontext)
+        MemoryContextDelete(fcache->subcontext);
+    fcache->subcontext = NULL;
 }

 /*
--
2.43.5

From 55fff2b9ec8358897dd777b315fae15ecc0d7dbe Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 13 Apr 2025 14:09:34 -0400
Subject: [PATCH v1 4/5] Make SQLFunctionCache long-lived again.

At this point, the only data structures we allocate directly in
fcontext are the SQLFunctionCache struct itself, the ParamListInfo
struct, and the execution_state array, all of which are small and
perfectly capable of being re-used across executions of the same
FmgrInfo.  Hence, let's give them the same lifespan as the FmgrInfo.
This step gets rid of the separate SQLFunctionLink struct and makes
fn_extra point to SQLFunctionCache again.  We also get rid of the
separate fcontext memory context and allocate these items directly
in fn_mcxt.

For notational simplicity, SQLFunctionCache still has an fcontext
field, but it's just a copy of fn_mcxt.

The motivation for this is to allow these structures to live as
long as the FmgrInfo and be re-used across calls, restoring the
original design without its propensity for memory leaks.  This
gets rid of some per-call overhead that we added in 0dca5d68d.

We also make an effort to re-use the JunkFilter and result slot.
Those might need to change if the function definition changes,
so we compromise by rebuilding them if the cached plan changes.

This also moves the tuplestore into fn_mcxt so that it can be
re-used across calls, again undoing a change made in 0dca5d68d.
---
 src/backend/executor/functions.c | 260 +++++++++++++------------------
 1 file changed, 105 insertions(+), 155 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 12121ad74e7..135fddda3fc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -76,7 +76,7 @@ typedef struct execution_state


 /*
- * Data associated with a SQL-language function is kept in three main
+ * Data associated with a SQL-language function is kept in two main
  * data structures:
  *
  * 1. SQLFunctionHashEntry is a long-lived (potentially session-lifespan)
@@ -85,7 +85,7 @@ typedef struct execution_state
  * of plans for the query(s) within the function.  A SQLFunctionHashEntry is
  * potentially shared across multiple concurrent executions of the function,
  * so it must contain no execution-specific state; but its use_count must
- * reflect the number of SQLFunctionLink structs pointing at it.
+ * reflect the number of SQLFunctionCache structs pointing at it.
  * If the function's pg_proc row is updated, we throw away and regenerate
  * the SQLFunctionHashEntry and subsidiary data.  (Also note that if the
  * function is polymorphic or used as a trigger, there is a separate
@@ -99,22 +99,13 @@ typedef struct execution_state
  *    * hcontext ("hash context") holds everything else belonging to the
  *      SQLFunctionHashEntry.
  *
- * 2. SQLFunctionCache lasts for the duration of a single execution of
- * the SQL function.  (In "lazyEval" mode, this might span multiple calls of
- * fmgr_sql.)  It holds a reference to the CachedPlan for the current query,
- * and other data that is execution-specific.  The SQLFunctionCache itself
- * as well as its subsidiary data are kept in fcontext ("function context"),
- * which we free at completion.  In non-returnsSet mode, this is just a child
- * of the call-time context.  In returnsSet mode, it is made a child of the
- * FmgrInfo's fn_mcxt so that it will survive between fmgr_sql calls.
- * The fcontext may have subsidiary contexts jfcontext and/or subcontext,
- * which have somewhat shorter lifespans.
- *
- * 3. SQLFunctionLink is a tiny struct that just holds pointers to
- * the SQLFunctionHashEntry and the current SQLFunctionCache (if any).
+ * 2. SQLFunctionCache is subsidiary data for a single FmgrInfo struct.
  * It is pointed to by the fn_extra field of the FmgrInfo struct, and is
- * always allocated in the FmgrInfo's fn_mcxt.  Its purpose is to reduce
- * the cost of repeat lookups of the SQLFunctionHashEntry.
+ * always allocated in the FmgrInfo's fn_mcxt.  It holds a reference to
+ * the CachedPlan for the current query, and other execution-specific data.
+ * A few subsidiary items such as the ParamListInfo object are also kept
+ * directly in fn_mcxt (which is also called fcontext here).  But most
+ * subsidiary data is in jfcontext or subcontext.
  */

 typedef struct SQLFunctionHashEntry
@@ -160,11 +151,15 @@ typedef struct SQLFunctionCache
     Tuplestorestate *tstore;    /* where we accumulate result tuples */

     JunkFilter *junkFilter;        /* will be NULL if function returns VOID */
+    int            jf_generation;    /* tracks whether junkFilter is up-to-date */

     /*
      * While executing a particular query within the function, cplan is the
      * CachedPlan we've obtained for that query, and eslist is a chain of
      * execution_state records for the individual plans within the CachedPlan.
+     * If eslist is not NULL at entry to fmgr_sql, then we are resuming
+     * execution of a lazyEval-mode set-returning function.
+     *
      * next_query_index is the 0-based index of the next CachedPlanSource to
      * get a CachedPlan from.
      */
@@ -184,22 +179,12 @@ typedef struct SQLFunctionCache
     MemoryContext jfcontext;    /* subsidiary memory context holding
                                  * junkFilter, result slot, and related data */
     MemoryContext subcontext;    /* subsidiary memory context for sub-executor */
-} SQLFunctionCache;
-
-typedef SQLFunctionCache *SQLFunctionCachePtr;
-
-/* Struct pointed to by FmgrInfo.fn_extra for a SQL function */
-typedef struct SQLFunctionLink
-{
-    /* Permanent pointer to associated SQLFunctionHashEntry */
-    SQLFunctionHashEntry *func;
-
-    /* Transient pointer to SQLFunctionCache, used only if returnsSet */
-    SQLFunctionCache *fcache;

     /* Callback to release our use-count on the SQLFunctionHashEntry */
     MemoryContextCallback mcb;
-} SQLFunctionLink;
+} SQLFunctionCache;
+
+typedef SQLFunctionCache *SQLFunctionCachePtr;


 /* non-export function prototypes */
@@ -232,7 +217,7 @@ static Datum postquel_get_single_result(TupleTableSlot *slot,
 static void sql_compile_error_callback(void *arg);
 static void sql_exec_error_callback(void *arg);
 static void ShutdownSQLFunction(Datum arg);
-static void RemoveSQLFunctionLink(void *arg);
+static void RemoveSQLFunctionCache(void *arg);
 static void check_sql_fn_statement(List *queryTreeList);
 static bool check_sql_stmt_retval(List *queryTreeList,
                                   Oid rettype, TupleDesc rettupdesc,
@@ -548,26 +533,23 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
     FmgrInfo   *finfo = fcinfo->flinfo;
     SQLFunctionHashEntry *func;
     SQLFunctionCache *fcache;
-    SQLFunctionLink *flink;
-    MemoryContext pcontext;
-    MemoryContext fcontext;
-    MemoryContext oldcontext;

     /*
-     * If this is the first execution for this FmgrInfo, set up a link struct
-     * (initially containing null pointers).  The link must live as long as
+     * If this is the first execution for this FmgrInfo, set up a cache struct
+     * (initially containing null pointers).  The cache must live as long as
      * the FmgrInfo, so it goes in fn_mcxt.  Also set up a memory context
      * callback that will be invoked when fn_mcxt is deleted.
      */
-    flink = finfo->fn_extra;
-    if (flink == NULL)
+    fcache = finfo->fn_extra;
+    if (fcache == NULL)
     {
-        flink = (SQLFunctionLink *)
-            MemoryContextAllocZero(finfo->fn_mcxt, sizeof(SQLFunctionLink));
-        flink->mcb.func = RemoveSQLFunctionLink;
-        flink->mcb.arg = flink;
-        MemoryContextRegisterResetCallback(finfo->fn_mcxt, &flink->mcb);
-        finfo->fn_extra = flink;
+        fcache = (SQLFunctionCache *)
+            MemoryContextAllocZero(finfo->fn_mcxt, sizeof(SQLFunctionCache));
+        fcache->fcontext = finfo->fn_mcxt;
+        fcache->mcb.func = RemoveSQLFunctionCache;
+        fcache->mcb.arg = fcache;
+        MemoryContextRegisterResetCallback(finfo->fn_mcxt, &fcache->mcb);
+        finfo->fn_extra = fcache;
     }

     /*
@@ -576,10 +558,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
      * SQLFunctionHashEntry: we want to run to completion using the function's
      * initial definition.
      */
-    if (flink->fcache != NULL)
+    if (fcache->eslist != NULL)
     {
-        Assert(flink->fcache->func == flink->func);
-        return flink->fcache;
+        Assert(fcache->func != NULL);
+        return fcache;
     }

     /*
@@ -590,7 +572,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
      */
     func = (SQLFunctionHashEntry *)
         cached_function_compile(fcinfo,
-                                (CachedFunction *) flink->func,
+                                (CachedFunction *) fcache->func,
                                 sql_compile_callback,
                                 sql_delete_callback,
                                 sizeof(SQLFunctionHashEntry),
@@ -598,60 +580,38 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
                                 false);

     /*
-     * Install the hash pointer in the SQLFunctionLink, and increment its use
+     * Install the hash pointer in the SQLFunctionCache, and increment its use
      * count to reflect that.  If cached_function_compile gave us back a
      * different hash entry than we were using before, we must decrement that
      * one's use count.
      */
-    if (func != flink->func)
+    if (func != fcache->func)
     {
-        if (flink->func != NULL)
+        if (fcache->func != NULL)
         {
-            Assert(flink->func->cfunc.use_count > 0);
-            flink->func->cfunc.use_count--;
+            Assert(fcache->func->cfunc.use_count > 0);
+            fcache->func->cfunc.use_count--;
         }
-        flink->func = func;
+        fcache->func = func;
         func->cfunc.use_count++;
+        /* Assume we need to rebuild the junkFilter */
+        fcache->junkFilter = NULL;
     }

-    /*
-     * Create memory context that holds all the SQLFunctionCache data.  If we
-     * return a set, we must keep this in whatever context holds the FmgrInfo
-     * (anything shorter-lived risks leaving a dangling pointer in flink). But
-     * in a non-SRF we'll delete it before returning, and there's no need for
-     * it to outlive the caller's context.
-     */
-    pcontext = func->returnsSet ? finfo->fn_mcxt : CurrentMemoryContext;
-    fcontext = AllocSetContextCreate(pcontext,
-                                     "SQL function cache",
-                                     ALLOCSET_SMALL_SIZES);
-
-    oldcontext = MemoryContextSwitchTo(fcontext);
-
-    /*
-     * Create the struct proper, link it to func and fcontext.
-     */
-    fcache = (SQLFunctionCache *) palloc0(sizeof(SQLFunctionCache));
-    fcache->func = func;
-    fcache->fcontext = fcontext;
-    fcache->lazyEvalOK = lazyEvalOK;
-
-    /*
-     * If we return a set, we must link the fcache into fn_extra so that we
-     * can find it again during future calls.  But in a non-SRF there is no
-     * need to link it into fn_extra at all.  Not doing so removes the risk of
-     * having a dangling pointer in a long-lived FmgrInfo.
-     */
-    if (func->returnsSet)
-        flink->fcache = fcache;
-
     /*
      * We're beginning a new execution of the function, so convert params to
      * appropriate format.
      */
     postquel_sub_params(fcache, fcinfo);

-    MemoryContextSwitchTo(oldcontext);
+    /* Also reset lazyEval state for the new execution. */
+    fcache->lazyEvalOK = lazyEvalOK;
+    fcache->lazyEval = false;
+
+    /* Also reset data about where we are in the function. */
+    fcache->eslist = NULL;
+    fcache->next_query_index = 0;
+    fcache->error_query_index = 0;

     return fcache;
 }
@@ -798,10 +758,15 @@ init_execution_state(SQLFunctionCachePtr fcache)
      * anything very interesting, but much of this module expects it to be
      * there anyway.)
      *
+     * Normally we can re-use the JunkFilter across executions, but if the
+     * plan for the last CachedPlanSource changed, we'd better rebuild it.
+     *
      * The JunkFilter, its result slot, and its tupledesc are kept in a
      * subsidiary memory context so that we can free them easily when needed.
      */
-    if (fcache->func->rettype != VOIDOID)
+    if (fcache->func->rettype != VOIDOID &&
+        (fcache->junkFilter == NULL ||
+         fcache->jf_generation != fcache->cplan->generation))
     {
         TupleTableSlot *slot;
         List       *resulttlist;
@@ -855,6 +820,9 @@ init_execution_state(SQLFunctionCachePtr fcache)
         if (fcache->func->returnsTuple)
             BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor);

+        /* Mark the JunkFilter as up-to-date */
+        fcache->jf_generation = fcache->cplan->generation;
+
         MemoryContextSwitchTo(oldcontext);
     }

@@ -1448,12 +1416,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
     fcache->subcontext = NULL;
 }

-/*
- * Build ParamListInfo array representing current arguments.
- *
- * This must be called in the fcontext so that the results have adequate
- * lifespan.
- */
+/* Build ParamListInfo array representing current arguments */
 static void
 postquel_sub_params(SQLFunctionCachePtr fcache,
                     FunctionCallInfo fcinfo)
@@ -1467,8 +1430,13 @@ postquel_sub_params(SQLFunctionCachePtr fcache,

         if (fcache->paramLI == NULL)
         {
+            /* First time through: build a persistent ParamListInfo struct */
+            MemoryContext oldcontext;
+
+            oldcontext = MemoryContextSwitchTo(fcache->fcontext);
             paramLI = makeParamList(nargs);
             fcache->paramLI = paramLI;
+            MemoryContextSwitchTo(oldcontext);
         }
         else
         {
@@ -1552,9 +1520,7 @@ Datum
 fmgr_sql(PG_FUNCTION_ARGS)
 {
     SQLFunctionCachePtr fcache;
-    SQLFunctionLink *flink;
     ErrorContextCallback sqlerrcontext;
-    MemoryContext tscontext;
     bool        randomAccess;
     bool        lazyEvalOK;
     bool        pushed_snapshot;
@@ -1581,23 +1547,17 @@ fmgr_sql(PG_FUNCTION_ARGS)
                      errmsg("set-valued function called in context that cannot accept a set")));
         randomAccess = rsi->allowedModes & SFRM_Materialize_Random;
         lazyEvalOK = !(rsi->allowedModes & SFRM_Materialize_Preferred);
-        /* tuplestore must have query lifespan */
-        tscontext = rsi->econtext->ecxt_per_query_memory;
     }
     else
     {
         randomAccess = false;
         lazyEvalOK = true;
-        /* tuplestore needn't outlive caller context */
-        tscontext = CurrentMemoryContext;
     }

     /*
      * Initialize fcache if starting a fresh execution.
      */
     fcache = init_sql_fcache(fcinfo, lazyEvalOK);
-    /* init_sql_fcache also ensures we have a SQLFunctionLink */
-    flink = fcinfo->flinfo->fn_extra;

     /*
      * Now we can set up error traceback support for ereport()
@@ -1608,14 +1568,15 @@ fmgr_sql(PG_FUNCTION_ARGS)
     error_context_stack = &sqlerrcontext;

     /*
-     * Build tuplestore to hold results, if we don't have one already.  Make
-     * sure it's in a suitable context.
+     * Build tuplestore to hold results, if we don't have one already.  We
+     * want to re-use the tuplestore across calls, so it needs to live in
+     * fcontext.
      */
     if (!fcache->tstore)
     {
         MemoryContext oldcontext;

-        oldcontext = MemoryContextSwitchTo(tscontext);
+        oldcontext = MemoryContextSwitchTo(fcache->fcontext);
         fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem);
         MemoryContextSwitchTo(oldcontext);
     }
@@ -1773,7 +1734,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             {
                 RegisterExprContextCallback(rsi->econtext,
                                             ShutdownSQLFunction,
-                                            PointerGetDatum(flink));
+                                            PointerGetDatum(fcache));
                 fcache->shutdown_reg = true;
             }
         }
@@ -1797,7 +1758,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             {
                 UnregisterExprContextCallback(rsi->econtext,
                                               ShutdownSQLFunction,
-                                              PointerGetDatum(flink));
+                                              PointerGetDatum(fcache));
                 fcache->shutdown_reg = false;
             }
         }
@@ -1823,7 +1784,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
             {
                 UnregisterExprContextCallback(rsi->econtext,
                                               ShutdownSQLFunction,
-                                              PointerGetDatum(flink));
+                                              PointerGetDatum(fcache));
                 fcache->shutdown_reg = false;
             }
         }
@@ -1862,17 +1823,11 @@ fmgr_sql(PG_FUNCTION_ARGS)
         PopActiveSnapshot();

     /*
-     * If we've gone through every command in the function, we are done.
-     * Release the cache to start over again on next call.
+     * If we've gone through every command in the function, we are done. Reset
+     * state to start over again on next call.
      */
     if (es == NULL)
-    {
-        if (fcache->tstore)
-            tuplestore_end(fcache->tstore);
-        Assert(fcache->cplan == NULL);
-        flink->fcache = NULL;
-        MemoryContextDelete(fcache->fcontext);
-    }
+        fcache->eslist = NULL;

     error_context_stack = sqlerrcontext.previous;

@@ -1958,67 +1913,62 @@ sql_exec_error_callback(void *arg)
 static void
 ShutdownSQLFunction(Datum arg)
 {
-    SQLFunctionLink *flink = (SQLFunctionLink *) DatumGetPointer(arg);
-    SQLFunctionCachePtr fcache = flink->fcache;
+    SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) DatumGetPointer(arg);
+    execution_state *es;

-    if (fcache != NULL)
+    es = fcache->eslist;
+    while (es)
     {
-        execution_state *es;
-
-        /* Make sure we don't somehow try to do this twice */
-        flink->fcache = NULL;
-
-        es = fcache->eslist;
-        while (es)
+        /* Shut down anything still running */
+        if (es->status == F_EXEC_RUN)
         {
-            /* Shut down anything still running */
-            if (es->status == F_EXEC_RUN)
-            {
-                /* Re-establish active snapshot for any called functions */
-                if (!fcache->func->readonly_func)
-                    PushActiveSnapshot(es->qd->snapshot);
+            /* Re-establish active snapshot for any called functions */
+            if (!fcache->func->readonly_func)
+                PushActiveSnapshot(es->qd->snapshot);

-                postquel_end(es, fcache);
+            postquel_end(es, fcache);

-                if (!fcache->func->readonly_func)
-                    PopActiveSnapshot();
-            }
-            es = es->next;
+            if (!fcache->func->readonly_func)
+                PopActiveSnapshot();
         }
+        es = es->next;
+    }
+    fcache->eslist = NULL;

-        /* Release tuplestore if we have one */
-        if (fcache->tstore)
-            tuplestore_end(fcache->tstore);
+    /* Release tuplestore if we have one */
+    if (fcache->tstore)
+        tuplestore_end(fcache->tstore);
+    fcache->tstore = NULL;

-        /* Release CachedPlan if we have one */
-        if (fcache->cplan)
-            ReleaseCachedPlan(fcache->cplan, fcache->cowner);
+    /* Release CachedPlan if we have one */
+    if (fcache->cplan)
+        ReleaseCachedPlan(fcache->cplan, fcache->cowner);
+    fcache->cplan = NULL;

-        /* Release the cache */
-        MemoryContextDelete(fcache->fcontext);
-    }
     /* execUtils will deregister the callback... */
+    fcache->shutdown_reg = false;
 }

 /*
  * MemoryContext callback function
  *
- * We register this in the memory context that contains a SQLFunctionLink
+ * We register this in the memory context that contains a SQLFunctionCache
  * struct.  When the memory context is reset or deleted, we release the
- * reference count (if any) that the link holds on the long-lived hash entry.
+ * reference count (if any) that the cache holds on the long-lived hash entry.
  * Note that this will happen even during error aborts.
  */
 static void
-RemoveSQLFunctionLink(void *arg)
+RemoveSQLFunctionCache(void *arg)
 {
-    SQLFunctionLink *flink = (SQLFunctionLink *) arg;
+    SQLFunctionCache *fcache = (SQLFunctionCache *) arg;

-    if (flink->func != NULL)
+    /* Release reference count on SQLFunctionHashEntry */
+    if (fcache->func != NULL)
     {
-        Assert(flink->func->cfunc.use_count > 0);
-        flink->func->cfunc.use_count--;
+        Assert(fcache->func->cfunc.use_count > 0);
+        fcache->func->cfunc.use_count--;
         /* This should be unnecessary, but let's just be sure: */
-        flink->func = NULL;
+        fcache->func = NULL;
     }
 }

--
2.43.5

From 5704b126123a71a41062b9ab502564842fc3ddc9 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 13 Apr 2025 14:10:28 -0400
Subject: [PATCH v1 5/5] Cache typlens of a SQL function's input arguments.

This gets rid of repetitive get_typlen() calls in postquel_sub_params,
which show up as costing 1% or so of the runtime in simple test cases.
---
 src/backend/executor/functions.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 135fddda3fc..e0bca7cb81c 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -116,6 +116,7 @@ typedef struct SQLFunctionHashEntry
     char       *src;            /* function body text (for error msgs) */

     SQLFunctionParseInfoPtr pinfo;    /* data for parser callback hooks */
+    int16       *argtyplen;        /* lengths of the input argument types */

     Oid            rettype;        /* actual return type */
     int16        typlen;            /* length of the return type */
@@ -1100,6 +1101,15 @@ sql_compile_callback(FunctionCallInfo fcinfo,
                                             PG_GET_COLLATION());
     MemoryContextSwitchTo(oldcontext);

+    /*
+     * Now that we have the resolved argument types, collect their typlens for
+     * use in postquel_sub_params.
+     */
+    func->argtyplen = (int16 *)
+        MemoryContextAlloc(hcontext, func->pinfo->nargs * sizeof(int16));
+    for (int i = 0; i < func->pinfo->nargs; i++)
+        func->argtyplen[i] = get_typlen(func->pinfo->argtypes[i]);
+
     /*
      * And of course we need the function body text.
      */
@@ -1427,6 +1437,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
     {
         ParamListInfo paramLI;
         Oid           *argtypes = fcache->func->pinfo->argtypes;
+        int16       *argtyplen = fcache->func->argtyplen;

         if (fcache->paramLI == NULL)
         {
@@ -1463,7 +1474,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
             prm->isnull = fcinfo->args[i].isnull;
             prm->value = MakeExpandedObjectReadOnly(fcinfo->args[i].value,
                                                     prm->isnull,
-                                                    get_typlen(argtypes[i]));
+                                                    argtyplen[i]);
             /* Allow the value to be substituted into custom plans */
             prm->pflags = PARAM_FLAG_CONST;
             prm->ptype = argtypes[i];
--
2.43.5

Re: Performance issues with v18 SQL-language-function changes

From

Robert Haas

Date:

14 April, 17:38:29

On Sun, Apr 13, 2025 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> create function fx(p_summa bigint) returns text immutable strict
> return ltrim(to_char(p_summa, '999 999 999 999 999 999 999 999'));
>
> explain analyze select fx(i) from generate_series(1,1000000) as i(i);
>
> you arrive at the rude discovery that 0dca5d68d is about 50% slower
> than 0dca5d68d^, because the old implementation builds a plan for fx()
> only once and then re-uses it throughout the query.

I agree that we should do something about this. I haven't reviewed
your patches but the approach sounds broadly reasonable.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Performance issues with v18 SQL-language-function changes

From

Pavel Stehule

Date:

14 April, 18:39:56

po 14. 4. 2025 v 16:38 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:

On Sun, Apr 13, 2025 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> create function fx(p_summa bigint) returns text immutable strict
> return ltrim(to_char(p_summa, '999 999 999 999 999 999 999 999'));
>
> explain analyze select fx(i) from generate_series(1,1000000) as i(i);
>
> you arrive at the rude discovery that 0dca5d68d is about 50% slower
> than 0dca5d68d^, because the old implementation builds a plan for fx()
> only once and then re-uses it throughout the query.

I agree that we should do something about this. I haven't reviewed
your patches but the approach sounds broadly reasonable.

I can confirm that all tests passed, and patched code is about 5% faster than the current master (tested on my slower notebook). So it should to fix performance regression where it was it against pg17 (it was about 2%) (tested without assertions)

Regards

Pavel

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Performance issues with v18 SQL-language-function changes

From

Bruce Momjian

Date:

16 April, 20:27:17

On Mon, Apr 14, 2025 at 10:38:29AM -0400, Robert Haas wrote:
> On Sun, Apr 13, 2025 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > create function fx(p_summa bigint) returns text immutable strict
> > return ltrim(to_char(p_summa, '999 999 999 999 999 999 999 999'));
> >
> > explain analyze select fx(i) from generate_series(1,1000000) as i(i);
> >
> > you arrive at the rude discovery that 0dca5d68d is about 50% slower
> > than 0dca5d68d^, because the old implementation builds a plan for fx()
> > only once and then re-uses it throughout the query.
> 
> I agree that we should do something about this. I haven't reviewed
> your patches but the approach sounds broadly reasonable.

Yep, we went down the road in PG 18 to convert syntax, and now we have
to fix this, or we have to revert all the PG 18 syntax changes, which
seems like a step backward.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

16 April, 20:43:46

Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Apr 14, 2025 at 10:38:29AM -0400, Robert Haas wrote:
>> I agree that we should do something about this. I haven't reviewed
>> your patches but the approach sounds broadly reasonable.

> Yep, we went down the road in PG 18 to convert syntax, and now we have
> to fix this, or we have to revert all the PG 18 syntax changes, which
> seems like a step backward.

I'm confused?  0dca5d68d didn't have anything to do with
syntax changes, just with when planning happens.

            regards, tom lane

Re: Performance issues with v18 SQL-language-function changes

From

Pavel Stehule

Date:

16 April, 20:50:44

st 16. 4. 2025 v 19:43 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Apr 14, 2025 at 10:38:29AM -0400, Robert Haas wrote:
>> I agree that we should do something about this. I haven't reviewed
>> your patches but the approach sounds broadly reasonable.

> Yep, we went down the road in PG 18 to convert syntax, and now we have
> to fix this, or we have to revert all the PG 18 syntax changes, which
> seems like a step backward.

I'm confused? 0dca5d68d didn't have anything to do with
syntax changes, just with when planning happens.

yes, and for most cases it has significant performance benefits. For a few corner cases, there can be some slowdown that can be fixed by last Tom patches.

Regards

Pavel

regards, tom lane

Re: Performance issues with v18 SQL-language-function changes

From

Bruce Momjian

Date:

16 April, 21:53:38

On Wed, Apr 16, 2025 at 01:43:46PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Apr 14, 2025 at 10:38:29AM -0400, Robert Haas wrote:
> >> I agree that we should do something about this. I haven't reviewed
> >> your patches but the approach sounds broadly reasonable.
> 
> > Yep, we went down the road in PG 18 to convert syntax, and now we have
> > to fix this, or we have to revert all the PG 18 syntax changes, which
> > seems like a step backward.
> 
> I'm confused?  0dca5d68d didn't have anything to do with
> syntax changes, just with when planning happens.

I was referencing the contrib initialization functions you converted to
use SQL-standard function bodies:

    commit 68ff25eef12
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Date:   Sun Dec 29 14:58:05 2024 -0500
    
        contrib/pageinspect: Use SQL-standard function bodies.
    
        In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8.
    
        Tom Lane and Ronan Dunklau
    
        Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop

I thought that was what you were saying were now slower;  maybe I was
confused.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

16 April, 22:13:20

Bruce Momjian <bruce@momjian.us> writes:
> On Wed, Apr 16, 2025 at 01:43:46PM -0400, Tom Lane wrote:
>> I'm confused?  0dca5d68d didn't have anything to do with
>> syntax changes, just with when planning happens.

> I was referencing the contrib initialization functions you converted to
> use SQL-standard function bodies:

Nah, that's not really relevant.  The speed concerns I have here
are mostly independent of whether the SQL function is written
in string or SQL-standard form.  Also, I think all of those
contrib functions that are at all performance-relevant are
capable of being inlined, and so wouldn't reach this code anyway.

            regards, tom lane

Re: Performance issues with v18 SQL-language-function changes

From

Alexander Lakhin

Date:

30 April, 23:00:01

Hello Tom,

Sorry if I've chosen the wrong thread, but starting from 0313c5dc6,
the following script:
CREATE TYPE aggtype AS (a int, b text);
CREATE FUNCTION aggfns_trans(aggtype[], integer, text) RETURNS aggtype[] LANGUAGE sql AS 'SELECT 
array_append($1,ROW($2,$3)::aggtype)';
CREATE AGGREGATE aggfns(integer, text) (SFUNC = public.aggfns_trans, STYPE = public.aggtype[], INITCOND = '{}');

CREATE TABLE t(i int,  k int);
INSERT INTO t SELECT 1, 2 FROM generate_series(1, 4000);

SET statement_timeout = '10s';
SELECT aggfns(i, repeat('x', 8192)) OVER (PARTITION BY i) FROM t;

crashes the server for me like this:
corrupted size vs. prev_size while consolidating
2025-04-30 19:40:04.209 UTC [286426] LOG:  client backend (PID 286441) was terminated by signal 6: Aborted
2025-04-30 19:40:04.209 UTC [286426] DETAIL:  Failed process was running: SELECT aggfns(i, repeat('x', 8192)) OVER 
(PARTITION BY i) FROM t;

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x000073cc15c4526e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x000073cc15c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x000073cc15c297b6 in __libc_message_impl (fmt=fmt@entry=0x73cc15dce8d7 "%s\n") at
../sysdeps/posix/libc_fatal.c:132
#6  0x000073cc15ca8fe5 in malloc_printerr (str=str@entry=0x73cc15dd1b30 "corrupted size vs. prev_size while 
consolidating") at ./malloc/malloc.c:5772
#7  0x000073cc15cab144 in _int_free_merge_chunk (av=0x73cc15e03ac0 <main_arena>, p=0x5cb3ac57b2c0, size=12541904) at 
./malloc/malloc.c:4695
#8  0x000073cc15cadd9e in __GI___libc_free (mem=mem@entry=0x5cb3acd73290) at ./malloc/malloc.c:3398
#9  0x00005cb3a0c2db4c in AllocSetFree (pointer=0x5cb3acd732c8) at aset.c:1107
#10 0x00005cb3a0c381f8 in pfree (pointer=<optimized out>) at mcxt.c:241
#11 0x00005cb3a067de98 in heap_free_minimal_tuple (mtup=<optimized out>) at heaptuple.c:1532
#12 0x00005cb3a08b86a1 in tts_minimal_clear (slot=0x5cb3ac576fb0) at execTuples.c:532
#13 0x00005cb3a08ab16e in ExecClearTuple (slot=0x5cb3ac576fb0) at ../../../src/include/executor/tuptable.h:460
#14 ExecFilterJunk (junkfilter=<optimized out>, slot=<optimized out>) at execJunk.c:277
#15 0x00005cb3a08bdb03 in sqlfunction_receive (slot=<optimized out>, self=0x5cb3ac525ce0) at functions.c:2597
#16 0x00005cb3a08ab4e7 in ExecutePlan (dest=0x5cb3ac525ce0, direction=<optimized out>, numberTuples=1, sendTuples=true,

operation=CMD_SELECT,
     queryDesc=0x5cb3ac525d30) at execMain.c:1814
...

With some script modifications, I observed also other memory-context-
related crashes.

(Probably this effect is achieved just because of the performance
optimization -- I haven't look deeper yet.)

This issue is discovered with SQLsmith.

Best regards,
Alexander Lakhin
Neon (https://neon.tech)

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

30 April, 23:37:16

Alexander Lakhin <exclusion@gmail.com> writes:
> Sorry if I've chosen the wrong thread, but starting from 0313c5dc6,
> the following script:
> ...
> crashes the server for me like this:
> corrupted size vs. prev_size while consolidating

Yup, duplicated here.  Thanks for the report!

            regards, tom lane

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

01 May, 03:18:01

Alexander Lakhin <exclusion@gmail.com> writes:
> Sorry if I've chosen the wrong thread, but starting from 0313c5dc6,
> the following script:
> CREATE TYPE aggtype AS (a int, b text);
> CREATE FUNCTION aggfns_trans(aggtype[], integer, text) RETURNS aggtype[] LANGUAGE sql AS 'SELECT
> array_append($1,ROW($2,$3)::aggtype)';
> CREATE AGGREGATE aggfns(integer, text) (SFUNC = public.aggfns_trans, STYPE = public.aggtype[], INITCOND = '{}');

> CREATE TABLE t(i int,  k int);
> INSERT INTO t SELECT 1, 2 FROM generate_series(1, 4000);

> SET statement_timeout = '10s';
> SELECT aggfns(i, repeat('x', 8192)) OVER (PARTITION BY i) FROM t;

> crashes the server for me like this:
> corrupted size vs. prev_size while consolidating

Hmm.  What seems to be going on here is that once the aggfns_trans()
result gets large enough that the SQL-function-result tuplestore
decides to spill to disk, when we pull the result tuple back out
of the tuplestore with tuplestore_gettupleslot we end up with the
jf_resultSlot holding a should-free tuple pointer that points into
the tuplestore's storage.  After tuplestore_clear that is a dangling
pointer, and the next use of the jf_resultSlot fails while trying to
free the tuple.

So the attached fixes it for me, but I'm still mighty confused
because I don't understand why it didn't fail in the old code.
This logic doesn't seem noticeably different from before, and
there's even a very old comment (in the SRF path) alleging that

    /* NB: this might delete the slot's content, but we don't care */

Now we *do* care, but what changed?

(As an aside, seems like tuplestore is leaving money on the table,
because there's hardly any point in spilling to disk when it never
holds more than one tuple.  But that's not something to change now.)

            regards, tom lane

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e0bca7cb81c..37f616280c6 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1728,8 +1728,9 @@ fmgr_sql(PG_FUNCTION_ARGS)
                 elog(ERROR, "failed to fetch lazy-eval tuple");
             /* Extract the result as a datum, and copy out from the slot */
             result = postquel_get_single_result(slot, fcinfo, fcache);
+            /* Clear the slot, in case it points into the tuplestore */
+            ExecClearTuple(slot);
             /* Clear the tuplestore, but keep it for next time */
-            /* NB: this might delete the slot's content, but we don't care */
             tuplestore_clear(fcache->tstore);

             /*
@@ -1810,7 +1811,11 @@ fmgr_sql(PG_FUNCTION_ARGS)
             /* Re-use the junkfilter's output slot to fetch back the tuple */
             slot = fcache->junkFilter->jf_resultSlot;
             if (tuplestore_gettupleslot(fcache->tstore, true, false, slot))
+            {
                 result = postquel_get_single_result(slot, fcinfo, fcache);
+                /* Clear the slot, in case it points into the tuplestore */
+                ExecClearTuple(slot);
+            }
             else
             {
                 fcinfo->isnull = true;

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

01 May, 17:19:06

I wrote:
> Hmm.  What seems to be going on here is that once the aggfns_trans()
> result gets large enough that the SQL-function-result tuplestore
> decides to spill to disk, when we pull the result tuple back out
> of the tuplestore with tuplestore_gettupleslot we end up with the
> jf_resultSlot holding a should-free tuple pointer that points into
> the tuplestore's storage.  After tuplestore_clear that is a dangling
> pointer, and the next use of the jf_resultSlot fails while trying to
> free the tuple.

I still haven't figured out why this wasn't a problem in the old
version of functions.c.  However, I did realize that my pending
patch at [1] gets rid of the problem in another way, by removing
functions.c's use of tuplestore_gettupleslot altogether.  Now
I'm tempted to just push that, instead of applying a band-aid that
will leave v18 doing this differently from both earlier and later
branches.

            regards, tom lane

[1] https://www.postgresql.org/message-id/2443532.1744919968%40sss.pgh.pa.us

Re: Performance issues with v18 SQL-language-function changes

From

Tom Lane

Date:

01 May, 22:50:36

I wrote:
> I still haven't figured out why this wasn't a problem in the old
> version of functions.c.

Oh, got it: I was misremembering the API contract for
tuplestore_gettupleslot.  There are two possible ways for it to
load the slot with a tuple:

* tuple points into the tuplestore's memory, and the slot's
should_free flag is false.  In this case, tuplestore_clear
leaves the slot pointing to garbage, but it doesn't matter
because we won't try to access nor free the garbage.  (This
is the case that comment was talking about.)

* tuple points to a tuple palloc'd in *the caller's memory
context* --- not the tuplestore's memory as I was thinking.
In this case the slot's should_free flag is true.

Thus the bug arises because the caller's memory context is now the
relatively short-lived per-tuple context that fmgr_sql is called in,
where in prior versions it was the SQL function's long-lived fcontext.
The slot we're putting it in is also long-lived.  So what happens
is that on the next fmgr_sql call, after the per-tuple context has
been reset and thus discarded the palloc'd tuple, the slot still
remembers that it's supposed to free the tuple and does so.  Kaboom.
But in released branches, the slot and the palloc'd tuple are in the
same context and there's no such hazard.

The quick-hack patch I posted upthread fixes this by ensuring we
clear the slot (and thus let it pfree the tuple) before context
reset of the per-tuple context wipes the tuple from underneath it.

I'm still inclined to fix this by using the patch from [1] though.
Aside from the argument that it'd be better if v18 were like later
branches here, I now realize that there's at least one false statement
in my argument at [1]:

>> Given the lack of field complaints over the many years it's been
>> like this, there doesn't seem to be a live problem.  I think the
>> explanation for that is
>> (1) those mechanisms are never used to call set-returning functions,
>> (2) therefore, the tuplestore will not be called on to hold more
>> than one result row,
>> (3) therefore, it won't get large enough that it wants to spill
>> to disk,
>> (4) therefore, its potentially dangling resowner pointer is never
>> used.
>> However, this is an uncomfortably shaky chain of assumptions.

This example has shown that tuplestore.c is capable of spilling to
disk even if it is storing only a single tuple, which means (3) is
wrong, which means the tuplestore *can* touch its resowner pointer
if the function result is wide enough.  I wonder if this means we
have a reachable bug in released branches.

            regards, tom lane

[1] https://www.postgresql.org/message-id/2443532.1744919968%40sss.pgh.pa.us