Running contrib/postgres_fdw's regression tests under Valgrind shows
two different sources of memory leaks. One is a basically-cosmetic
issue in InitPgFdwOptions, but the other is real and troublesome.
The DirectModify code path relies on PG_TRY blocks to ensure that it
releases the PGresult for the foreign modify command, but that can't
work because (at least in cases with RETURNING data) the PGresult has
to survive across successive calls to postgresIterateDirectModify.
If an error occurs in the query in between those steps, we have no
opportunity to clean up.
I thought of fixing this by using a memory context reset callback
to ensure that the PGresult is cleaned up when the executor's context
goes away, and that seems to work nicely (see 0001 attached).
However, I feel like this is just a POC, because now that we have that
concept we might be able to use it elsewhere in postgres_fdw to
eliminate most or even all of its reliance on PG_TRY. That should be
faster as well as much less bug-prone. But I'm putting it up at this
stage for comments, in case anyone thinks this is not the direction to
head in.
0002 attached deals with the other thing. If you apply these
on top of my valgrind-cleanup patches at [1], you'll see that
contrib/postgres_fdw's tests go through leak-free.
regards, tom lane
[1] https://www.postgresql.org/message-id/2884224.1748035274%40sss.pgh.pa.us
From 9b88de523dbede9888b695e660992f803c0110ee Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 23 May 2025 20:36:08 -0400
Subject: [PATCH v1 1/2] Fix memory leakage in postgres_fdw's DirectModify code
path.
postgres_fdw tries to use PG_TRY blocks to ensure that it will
eventually free the PGresult created by the remote modify command.
However, it's fundamentally impossible for this scheme to work
reliably when there's RETURNING data, because the query could fail
in between invocations of postgres_fdw's DirectModify methods.
There is at least one instance of exactly this situation in the
regression tests, and the ensuing session-lifespan leak is visible
under Valgrind.
We can improve matters by using a memory context reset callback
attached to the ExecutorState context. That ensures that the
PGresult will be freed when the ExecutorState context is torn
down, even if control never reaches postgresEndDirectModify.
Also, by switching to this approach, we can eliminate some
PG_TRY overhead since it's no longer necessary to be so
cautious about errors.
I added a MemoryContextUnregisterResetCallback function to mcxt.c,
which we hadn't needed before. This fix could have been made without
that, but the logic is simpler this way. Also I think we might be
able to use the same idea to make some other postgres_fdw code less
fragile, and having an unregister capability will help there.
---
contrib/postgres_fdw/postgres_fdw.c | 53 ++++++++++++++---------------
src/backend/utils/mmgr/mcxt.c | 39 +++++++++++++++++++--
src/include/utils/palloc.h | 2 ++
3 files changed, 64 insertions(+), 30 deletions(-)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 331f3fc088d..bd3f3e669fb 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -240,6 +240,7 @@ typedef struct PgFdwDirectModifyState
PGresult *result; /* result for query */
int num_tuples; /* # of result tuples */
int next_tuple; /* index of next one to return */
+ MemoryContextCallback result_cb; /* ensures result will get freed */
Relation resultRel; /* relcache entry for the target relation */
AttrNumber *attnoMap; /* array of attnums of input user columns */
AttrNumber ctidAttno; /* attnum of input ctid column */
@@ -2817,7 +2818,13 @@ postgresEndDirectModify(ForeignScanState *node)
return;
/* Release PGresult */
- PQclear(dmstate->result);
+ if (dmstate->result)
+ {
+ MemoryContextUnregisterResetCallback(GetMemoryChunkContext(dmstate),
+ &dmstate->result_cb);
+ PQclear(dmstate->result);
+ dmstate->result = NULL;
+ }
/* Release remote connection */
ReleaseConnection(dmstate->conn);
@@ -4591,13 +4598,19 @@ execute_dml_stmt(ForeignScanState *node)
/*
* Get the result, and check for success.
*
- * We don't use a PG_TRY block here, so be careful not to throw error
- * without releasing the PGresult.
+ * We use a memory context callback to ensure that the PGresult will be
+ * released, even if the query fails somewhere that's outside our control.
*/
+ Assert(dmstate->result == NULL);
dmstate->result = pgfdw_get_result(dmstate->conn);
+ dmstate->result_cb.func = (MemoryContextCallbackFunction) PQclear;
+ dmstate->result_cb.arg = dmstate->result;
+ MemoryContextRegisterResetCallback(GetMemoryChunkContext(dmstate),
+ &dmstate->result_cb);
+
if (PQresultStatus(dmstate->result) !=
(dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
- pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true,
+ pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, false,
dmstate->query);
/* Get the number of rows affected. */
@@ -4641,30 +4654,16 @@ get_returning_data(ForeignScanState *node)
}
else
{
- /*
- * On error, be sure to release the PGresult on the way out. Callers
- * do not have PG_TRY blocks to ensure this happens.
- */
- PG_TRY();
- {
- HeapTuple newtup;
-
- newtup = make_tuple_from_result_row(dmstate->result,
- dmstate->next_tuple,
- dmstate->rel,
- dmstate->attinmeta,
- dmstate->retrieved_attrs,
- node,
- dmstate->temp_cxt);
- ExecStoreHeapTuple(newtup, slot, false);
- }
- PG_CATCH();
- {
- PQclear(dmstate->result);
- PG_RE_THROW();
- }
- PG_END_TRY();
+ HeapTuple newtup;
+ newtup = make_tuple_from_result_row(dmstate->result,
+ dmstate->next_tuple,
+ dmstate->rel,
+ dmstate->attinmeta,
+ dmstate->retrieved_attrs,
+ node,
+ dmstate->temp_cxt);
+ ExecStoreHeapTuple(newtup, slot, false);
/* Get the updated/deleted tuple. */
if (dmstate->rel)
resultSlot = slot;
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 15fa4d0a55e..ce01dce9861 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -560,9 +560,7 @@ MemoryContextDeleteChildren(MemoryContext context)
* the specified context, since that means it will automatically be freed
* when no longer needed.
*
- * There is no API for deregistering a callback once registered. If you
- * want it to not do anything anymore, adjust the state pointed to by its
- * "arg" to indicate that.
+ * Note that callers can assume this cannot fail.
*/
void
MemoryContextRegisterResetCallback(MemoryContext context,
@@ -577,6 +575,41 @@ MemoryContextRegisterResetCallback(MemoryContext context,
context->isReset = false;
}
+/*
+ * MemoryContextUnregisterResetCallback
+ * Undo the effects of MemoryContextRegisterResetCallback.
+ *
+ * This can be used if a callback's effects are no longer required
+ * at some point before the context has been reset/deleted. It is the
+ * caller's responsibility to pfree the callback struct (if needed).
+ *
+ * An assertion failure occurs if the callback was not registered.
+ * We could alternatively define that case as a no-op, but that seems too
+ * likely to mask programming errors such as passing the wrong context.
+ */
+void
+MemoryContextUnregisterResetCallback(MemoryContext context,
+ MemoryContextCallback *cb)
+{
+ MemoryContextCallback *prev,
+ *cur;
+
+ Assert(MemoryContextIsValid(context));
+
+ for (prev = NULL, cur = context->reset_cbs; cur != NULL;
+ prev = cur, cur = cur->next)
+ {
+ if (cur != cb)
+ continue;
+ if (prev)
+ prev->next = cur->next;
+ else
+ context->reset_cbs = cur->next;
+ return;
+ }
+ Assert(false);
+}
+
/*
* MemoryContextCallResetCallbacks
* Internal function to call all registered callbacks for context.
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index e1b42267b22..039b9cba61a 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -133,6 +133,8 @@ MemoryContextSwitchTo(MemoryContext context)
/* Registration of memory context reset/delete callbacks */
extern void MemoryContextRegisterResetCallback(MemoryContext context,
MemoryContextCallback *cb);
+extern void MemoryContextUnregisterResetCallback(MemoryContext context,
+ MemoryContextCallback *cb);
/*
* These are like standard strdup() except the copied string is
--
2.43.5
From 61f267f2ebaaadfa3ba1214f2aa2c64d3467c11a Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 23 May 2025 20:51:03 -0400
Subject: [PATCH v1 2/2] Silence leakage complaint about postgres_fdw's
InitPgFdwOptions.
Valgrind complains that the PQconninfoOption array returned by libpq
is leaked. We apparently believed that we could suppress that warning
by storing that array's address in a static variable. However, modern
C compilers are bright enough to optimize the static variable away.
We could escalate that arms race by making the variable global.
But on the whole it seems better to revise the code so that it
can free libpq's result properly. The only thing that costs
us is copying the parameter-name keywords; which seems like a
pretty negligible cost in a function that runs at most once per
process.
---
contrib/postgres_fdw/option.c | 33 ++++++++++++---------------------
1 file changed, 12 insertions(+), 21 deletions(-)
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index c2f936640bc..d6fa89bad93 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -21,6 +21,7 @@
#include "libpq/libpq-be.h"
#include "postgres_fdw.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/varlena.h"
/*
@@ -39,12 +40,6 @@ typedef struct PgFdwOption
*/
static PgFdwOption *postgres_fdw_options;
-/*
- * Valid options for libpq.
- * Allocated and filled in InitPgFdwOptions.
- */
-static PQconninfoOption *libpq_options;
-
/*
* GUC parameters
*/
@@ -239,6 +234,7 @@ static void
InitPgFdwOptions(void)
{
int num_libpq_opts;
+ PQconninfoOption *libpq_options;
PQconninfoOption *lopt;
PgFdwOption *popt;
@@ -307,8 +303,8 @@ InitPgFdwOptions(void)
* Get list of valid libpq options.
*
* To avoid unnecessary work, we get the list once and use it throughout
- * the lifetime of this backend process. We don't need to care about
- * memory context issues, because PQconndefaults allocates with malloc.
+ * the lifetime of this backend process. Hence, we'll allocate it in
+ * TopMemoryContext.
*/
libpq_options = PQconndefaults();
if (!libpq_options) /* assume reason for failure is OOM */
@@ -325,19 +321,11 @@ InitPgFdwOptions(void)
/*
* Construct an array which consists of all valid options for
* postgres_fdw, by appending FDW-specific options to libpq options.
- *
- * We use plain malloc here to allocate postgres_fdw_options because it
- * lives as long as the backend process does. Besides, keeping
- * libpq_options in memory allows us to avoid copying every keyword
- * string.
*/
postgres_fdw_options = (PgFdwOption *)
- malloc(sizeof(PgFdwOption) * num_libpq_opts +
- sizeof(non_libpq_options));
- if (postgres_fdw_options == NULL)
- ereport(ERROR,
- (errcode(ERRCODE_FDW_OUT_OF_MEMORY),
- errmsg("out of memory")));
+ MemoryContextAlloc(TopMemoryContext,
+ sizeof(PgFdwOption) * num_libpq_opts +
+ sizeof(non_libpq_options));
popt = postgres_fdw_options;
for (lopt = libpq_options; lopt->keyword; lopt++)
@@ -355,8 +343,8 @@ InitPgFdwOptions(void)
if (strncmp(lopt->keyword, "oauth_", strlen("oauth_")) == 0)
continue;
- /* We don't have to copy keyword string, as described above. */
- popt->keyword = lopt->keyword;
+ popt->keyword = MemoryContextStrdup(TopMemoryContext,
+ lopt->keyword);
/*
* "user" and any secret options are allowed only on user mappings.
@@ -371,6 +359,9 @@ InitPgFdwOptions(void)
popt++;
}
+ /* Done with libpq's output structure. */
+ PQconninfoFree(libpq_options);
+
/* Append FDW-specific options and dummy terminator. */
memcpy(popt, non_libpq_options, sizeof(non_libpq_options));
}
--
2.43.5