Thread: Support prepared statement invalidation when result types change
The cached plan for a prepared statement can get invalidated when DDL changes the tables used in the query, or when search_path changes. When this happens the prepared statement can still be executed, but it will be replanned in the new context. This means that the prepared statement may do something different, e.g. after a search_path change it may select data from a completely different table. This doesn't throw an error, because it is considered the responsibility of the operator and query writers that the query will still do the intended thing.

However, we would throw an error if the result of the query is of a different type than it was before:

    ERROR: cached plan must not change result type

This requirement was not documented anywhere, so it can be a surprising error to hit. But it doesn't actually need to be an error: as long as we send the correct RowDescription, a change in result types or column counts does not have to be a problem for clients.

This patch allows a prepared statement to continue to work even when the result type changes.

Without this change, all clients that automatically prepare queries as a performance optimization need to handle or avoid the error somehow, often by deallocating and re-preparing queries when it's usually not necessary. With this change, connection poolers can also safely prepare the same query only once on a connection and share that one prepared statement across clients that prepared the exact same query.
Some relevant previous discussions:
[1]: https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
[2]: https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
[3]: https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
[4]: https://github.com/pgjdbc/pgjdbc/pull/451
[5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
[6]: https://github.com/jackc/pgx/issues/927
[7]: https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
[8]: https://github.com/rails/rails/issues/12330
On Sat, Aug 26, 2023 at 1:58 AM Jelte Fennema <me@jeltef.nl> wrote:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> [...]
> With this change connection poolers can also
> safely prepare the same query only once on a connection and share this
> one prepared query across clients that prepared that exact same query.

For a prepared statement with no parameters, I tested many cases (add column, change column data type, rename column, set default, set not null), and it worked as expected.

With parameters it also works; there is only a tiny issue with error reporting:

    prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}

    ERROR:  column "q1" does not exist at character 61
    HINT:  Perhaps you meant to reference the column "pcachetest.x1".
    STATEMENT:  execute prepstmt2(1);

I think "character 61" refers to "PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;", so the STATEMENT line is slightly misleading.
On 25.08.2023 8:57 PM, Jelte Fennema wrote:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> [...]

The following assignment of formats is not correct:

         /* Do nothing if portal won't return tuples */
         if (portal->tupDesc == NULL)
    +    {
    +        /*
    +         * For SELECT like queries we delay filling in the tupDesc until after
    +         * PortalRunUtility, because we don't know what rows an EXECUTE
    +         * command will return. Because we delay setting tupDesc, we also need
    +         * to delay setting formats. We do this in a pretty hacky way, by
    +         * temporarily setting the portal formats to the passed in formats.
    +         * Then once we fill in tupDesc, we call PortalSetResultFormat again
    +         * with portal->formats to fill in the final formats value.
    +         */
    +        if (portal->strategy == PORTAL_UTIL_SELECT)
    +            portal->formats = formats;
             return;

because `formats` is created in another memory context. In postgres.c:

        /* Done storing stuff in portal's context */
        MemoryContextSwitchTo(oldContext);
        ...
        /* Get the result format codes */
        numRFormats = pq_getmsgint(input_message, 2);
        if (numRFormats > 0)
        {
            rformats = palloc_array(int16, numRFormats);
            for (int i = 0; i < numRFormats; i++)
                rformats[i] = pq_getmsgint(input_message, 2);
        }

It has to be copied, as below:

        portal->formats = (int16 *)
            MemoryContextAlloc(portal->portalContext,
                               natts * sizeof(int16));
        memcpy(portal->formats, formats, natts * sizeof(int16));

or alternatively the MemoryContextSwitchTo(oldContext) should be moved after the initialization of rformats.
On Mon, 28 Aug 2023 at 15:05, Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> The following assignment of formats is not correct:
>
> It has to be copied as below:
>
>     portal->formats = (int16 *)
>         MemoryContextAlloc(portal->portalContext,
>                            natts * sizeof(int16));
>     memcpy(portal->formats, formats, natts * sizeof(int16));

I attached a new version of the patch where I now did this. But I also moved the code around quite a bit, since all this tupDesc/formats delaying is only needed for exec_simple_query. The original changes actually broke some prepared statements that were using protocol-level Bind messages.
On Mon, 28 Aug 2023 at 11:27, jian he <jian.universality@gmail.com> wrote:
> With parameters, it also works, only a tiny issue with error reporting.
>
> prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
> WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}
> ERROR: column "q1" does not exist at character 61
> HINT: Perhaps you meant to reference the column "pcachetest.x1".
> STATEMENT: execute prepstmt2(1);
>
> I think "character 61" refer to "PREPARE prepstmt2(bigint) AS SELECT *
> FROM pcachetest WHERE q1 = $1;"
> so maybe the STATEMENT is slightly misleading.

Could you share the full set of commands that cause the reporting issue? I don't think my changes should impact this reporting, so I'm curious whether this is a new issue or an already existing one.
On Tue, Aug 29, 2023 at 12:51 AM Jelte Fennema <me@jeltef.nl> wrote:
>
> Could you share the full set of commands that cause the reporting
> issue? I don't think my changes should impact this reporting, so I'm
> curious if this is a new issue, or an already existing one.

I didn't apply your v2 patch. Full set of commands:

    regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
    SELECT 5
    regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;
    PREPARE
    regression=# alter table pcachetest rename q1 to x;
    ALTER TABLE
    regression=# EXECUTE prepstmt2(123);
    2023-08-29 17:23:59.148 CST [1382074] ERROR:  column "q1" does not exist at character 61
    2023-08-29 17:23:59.148 CST [1382074] HINT:  Perhaps you meant to reference the column "pcachetest.q2".
    2023-08-29 17:23:59.148 CST [1382074] STATEMENT:  EXECUTE prepstmt2(123);
    ERROR:  column "q1" does not exist
    HINT:  Perhaps you meant to reference the column "pcachetest.q2".
On Tue, 29 Aug 2023 at 11:29, jian he <jian.universality@gmail.com> wrote:
> regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
> [...]
> ERROR: column "q1" does not exist
> HINT: Perhaps you meant to reference the column "pcachetest.q2".

Thank you for the full set of commands. In that case the issue you're describing is completely separate from this patch. The STATEMENT: part of the log always shows the query that was received by the server. This behaviour was already present even without my patch (I double-checked with PG 15.3).
Another similar issue was reported on the PgBouncer prepared statement PR[1]. It's related to an issue that was already reported on the mailing list[2]. It turns out that invalidation of the argument types is also important in some cases. The newly added third patch in this series addresses that issue.

[1]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1309454695
[2]: https://www.postgresql.org/message-id/flat/CA%2Bmi_8YAGf9qibDFTRNKgaTwaBa1OUcteKqLAxfMmKFbo3GHZg%40mail.gmail.com
When running the Postgres JDBC tests with this patchset I found a dumb mistake in my last patch where I didn't initialize the contents of orig_params correctly. This new patchset fixes that.
On Tue, Sep 12, 2023, at 10:17 AM, Jelte Fennema wrote:
> When running the Postgres JDBC tests with this patchset I found a dumb
> mistake in my last patch where I didn't initialize the contents of
> orig_params correctly. This new patchset fixes that.
0001:
Don't you want to execute this code path only for the EXECUTE command?
PORTAL_UTIL_SELECT includes other commands such as CALL, FETCH, SHOW, and
EXPLAIN. If so, check whether commandTag is CMDTAG_EXECUTE.
Regarding tupDesc, you don't need to call FreeTupleDesc; instead you can
modify PortalStart as:

        case PORTAL_UTIL_SELECT:

            /*
             * We don't set snapshot here, because PortalRunUtility will
             * take care of it if needed.
             */
            {
                PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);

                Assert(pstmt->commandType == CMD_UTILITY);

                /*
                 * tupDesc will be filled by FillPortalStore later because
                 * it might change due to replanning when ExecuteQuery calls
                 * GetCachedPlan.
                 */
                if (portal->commandTag != CMDTAG_EXECUTE)
                    portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
            }
Regarding the commit message, ...if the the result... should be fixed. The
sentence "it's actually not needed..." could be "It doesn't need to be an error
as long as it sends the RowDescription...". The sentence "This patch starts to
allow a prepared ..." could be "This patch allows a prepared ...".
0002:
You should remove this comment because it refers to the option you are
removing.
- plan->cursor_options,
- false); /* not fixed result */
+ plan->cursor_options); /* not fixed result */
You should also remove the sentence that refers to fixed_result in
CompleteCachedPlan.
* cursor_options: options bitmask to pass to planner
* fixed_result: true to disallow future changes in query's result tupdesc
*/
void
CompleteCachedPlan(CachedPlanSource *plansource,
                   List *querytree_list,
                   MemoryContext querytree_context,
0003:
You should initialize the new parameters (orig_param_types and orig_num_params)
in CreateCachedPlan. One suggestion is to move the following code to
CompleteCachedPlan because plansource->param_types are assigned there.
@@ -108,6 +108,10 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
 		argtypes[i++] = toid;
 	}
+
+	plansource->orig_num_params = nargs;
+	plansource->orig_param_types =
+		MemoryContextAlloc(plansource->context, nargs * sizeof(Oid));
+	memcpy(plansource->orig_param_types, argtypes, nargs * sizeof(Oid));
 }
This comment is confusing. Since the new function
(GetCachedPlanFromRevalidated) contains almost all code from GetCachedPlan, its
content is the same as the *previous* GetCachedPlan function. You could expand
this comment a bit to make it clear that it contains the logic to decide
between generic x custom plan. I don't like the function name but have only a
not-so-good suggestion: GetCachedPlanAfterRevalidate. I also don't like the
revalidationResult as a variable name. Why don't you keep qlist? Or use a name
near query-tree list (query_list? qtlist?). s/te caller/the caller/
+ * GetCachedPlanFromRevalidated: is the same as get GetCachedPlan, but requires
+ * te caller to first revalidate the query. This is needed for callers that
+ * need to use the revalidated plan to generate boundParams.
+ */
+CachedPlan *
+GetCachedPlanFromRevalidated(CachedPlanSource *plansource,
+ ParamListInfo boundParams,
+ ResourceOwner owner,
+ QueryEnvironment *queryEnv,
+ List *revalidationResult)
Are these names accurate? The original are the current ones; new ones are
"runtime" data. It would be good to explain why a new array is required.
Oid *param_types; /* array of parameter type OIDs, or NULL */
int num_params; /* length of param_types array */
+ Oid *orig_param_types; /* array of original parameter type OIDs,
+ * or NULL */
+ int orig_num_params; /* length of orig_param_types array */
You should expand the commit message a bit: explain this feature and
document the behavior change.
Hi,
> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.
What if a client has *cached* an old version of the RowDescription,
and the server changed it to something new and sent result data
with the new RowDescription? Will the client still be able to work
as expected?

I hope my concern is unfounded, since I didn't go through any of
the drivers in detail, but I hope the concern itself is expressed clearly.
Best Regards
Andy Fan
@Euler thanks for the review. I addressed the feedback.

On Fri, 15 Sept 2023 at 01:41, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> What if a client has *cached* an old version of RowDescription
> and the server changed it to something new and sent resultdata
> with the new RowDescription. Will the client still be able to work
> expectly?

It depends a bit on the exact change. For instance, a column being added to the end of the result data shouldn't be a problem. And that is actually quite a common case for this issue:

1. PREPARE p AS (SELECT * FROM t);
2. ALTER TABLE t ADD COLUMN ...
3. EXECUTE p

But type changes of existing columns might cause issues when the RowDescription is cached. Such changes also cause issues now, though: currently the prepared statement becomes unusable when this happens (returning errors every time). With this patch it's at least possible to have prepared statements continue working in many cases. Furthermore, caching RowDescription is not super useful: most clients request it every time, because doing so does not require an extra round trip, so there's almost no overhead in requesting it.

Clients caching ParameterDescription seems more useful, because fetching the parameter types does require an extra round trip. So caching it could cause errors with 0003. But right now, if the argument types need to change, executing the prepared statement gives an error every time. So I believe 0003 is still an improvement over the status quo, because there are many cases where the client knows that the types might have changed and that it thus needs to re-fetch the ParameterDescription; the most common case is changing the search_path. And there are also cases where even a cached ParameterDescription will work fine, e.g. when the type is changed but the encoding stays the same (drop + create an enum, or text/varchar, or the text encoding of int and bigint).
On Mon, 18 Sept 2023 at 18:01, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> @Euler thanks for the review. I addressed the feedback.
> [...]

One of the tests has aborted in CFBot at [1] with:

    [05:26:16.214] Core was generated by `postgres: postgres regression_pg_stat_statements [local] EXECUTE '.
    [05:26:16.214] Program terminated with signal SIGABRT, Aborted.
    [05:26:16.214] #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    [05:26:16.214] Download failed: Invalid argument.  Continuing without source file ./signal/../sysdeps/unix/sysv/linux/raise.c.
    [05:26:16.392]
    [05:26:16.392] Thread 1 (Thread 0x7fbe1d997a40 (LWP 28738)):
    [05:26:16.392] #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    ....
    [05:26:36.911] #5  0x000055c5aa523e71 in RevalidateCachedQuery (plansource=0x55c5ac811cf0, queryEnv=queryEnv@entry=0x0) at ../src/backend/utils/cache/plancache.c:730
    [05:26:36.911]         num_params = 0
    [05:26:36.911]         param_types = 0x55c5ac860438
    [05:26:36.911]         snapshot_set = false
    [05:26:36.911]         rawtree = 0x55c5ac859f08
    [05:26:36.911]         tlist = <optimized out>
    [05:26:36.911]         qlist = <optimized out>
    [05:26:36.911]         resultDesc = <optimized out>
    [05:26:36.911]         querytree_context = <optimized out>
    [05:26:36.911]         oldcxt = <optimized out>
    [05:26:36.911] #6  0x000055c5a9de0262 in ExplainExecuteQuery (execstmt=0x55c5ac6d9d38, into=0x0, es=0x55c5ac859648, queryString=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;", params=0x0, queryEnv=0x0) at ../src/backend/commands/prepare.c:594
    [05:26:36.911]         entry = 0x55c5ac839ba8
    [05:26:36.911]         query_string = <optimized out>
    [05:26:36.911]         cplan = <optimized out>
    [05:26:36.911]         plan_list = <optimized out>
    [05:26:36.911]         p = <optimized out>
    [05:26:36.911]         paramLI = 0x0
    [05:26:36.911]         estate = 0x0
    [05:26:36.911]         planstart = {ticks = <optimized out>}
    [05:26:36.911]         planduration = {ticks = 1103806595203}
    [05:26:36.911]         bufusage_start = {shared_blks_hit = 1, shared_blks_read = 16443, shared_blks_dirtied = 16443, shared_blks_written = 8, local_blks_hit = 94307489783240, local_blks_read = 94307451894117, local_blks_dirtied = 94307489783240, local_blks_written = 140727004487184, temp_blks_read = 0, temp_blks_written = 94307489783416, shared_blk_read_time = {ticks = 0}, shared_blk_write_time = {ticks = 94307489780192}, local_blk_read_time = {ticks = 0}, local_blk_write_time = {ticks = 94307491025040}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 0}}
    [05:26:36.911]         bufusage = {shared_blks_hit = 140727004486192, shared_blks_read = 140061866319832, shared_blks_dirtied = 8, shared_blks_written = 94307447196988, local_blks_hit = 34359738376, local_blks_read = 94307489783240, local_blks_dirtied = 70622147264512, local_blks_written = 94307491357144, temp_blks_read = 140061866319832, temp_blks_written = 94307489783240, shared_blk_read_time = {ticks = 140727004486192}, shared_blk_write_time = {ticks = 140061866319832}, local_blk_read_time = {ticks = 8}, local_blk_write_time = {ticks = 94307489783240}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 94307447197163}}
    [05:26:36.911]         revalidationResult = <optimized out>
    [05:26:36.911] #7  0x000055c5a9daa387 in ExplainOneUtility (utilityStmt=0x55c5ac6d9d38, into=into@entry=0x0, es=es@entry=0x55c5ac859648, queryString=queryString@entry=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;", params=params@entry=0x0, queryEnv=queryEnv@entry=0x0) at ../src/backend/commands/explain.c:495
    [05:26:36.911]         __func__ = "ExplainOneUtility"

[1] - https://cirrus-ci.com/task/5770112389611520?logs=cores#L71

Regards,
Vignesh
On Mon, Sep 18, 2023 at 1:31 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.
Just to point out, FWIW, that the .NET Npgsql driver does indeed cache RowDescriptions... The whole point of preparation is to optimize things as much as possible for repeated execution of the query; I get that the value there is much lower than e.g. doing another network roundtrip, but that's still extra work that's better off being cut if it can be.
On Sun, 7 Jan 2024 at 07:55, vignesh C <vignesh21@gmail.com> wrote:
> One of the test has aborted in CFBot at [1] with:

Rebased the patchset and removed patch 0003, since it was causing the CI issue reported by Vignesh, and it seems much less useful and more controversial to me anyway (due to the extra required round trip).

On Sun, 7 Jan 2024 at 09:17, Shay Rojansky <roji@roji.org> wrote:
> Just to point out, FWIW, that the .NET Npgsql driver does indeed cache
> RowDescriptions... The whole point of preparation is to optimize things
> as much as possible for repeated execution of the query; I get that the
> value there is much lower than e.g. doing another network roundtrip, but
> that's still extra work that's better off being cut if it can be.

Hmm, interesting. I totally agree that it's always good to do less when possible. The problem is that in the face of server-side prepared statement invalidations, due to DDL changes to the table or search_path changes, the row types might change. Or the server needs to constantly throw an error, like it does now, but that seems worse imho.

I'm wondering though if we can create a middle ground, where a client can still cache the RowDescription client-side when no DDL or search_path changes are happening, but is still told about a new RowDescription when one of those does happen. The only way of doing that I can think of is changing the Postgres protocol in a way similar to this: allow Execute to return a RowDescription too, but have the server only do so when the previously received RowDescription for this prepared statement is now invalid.

This would definitely require some additional tracking in PgBouncer to make it work, i.e. per client and per server it should now keep track of the last RowDescription for each prepared statement. But that's definitely something we could do. It would make this patch much more involved though, as it would suddenly involve an actual protocol change, and that basically depends on this patch moving forward[1].

[1]: https://www.postgresql.org/message-id/flat/CAGECzQTg2hcmb5GaU53uuWcdC7gCNJFLL6mnW0WNhWHgq9UTgw@mail.gmail.com
Jelte Fennema <me@jeltef.nl> writes:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> ...
> However, we would throw an error if the the result of the query is of a
> different type than it was before:
> ERROR: cached plan must not change result type

Yes, this is intentional.

> This patch starts to allow a prepared statement to continue to work even
> when the result type changes.

What this is is a wire protocol break. What if the client has previously done a Describe Statement on the prepared statement? We have no mechanism for notifying it that that information is now falsified. The error is thrown to prevent us from getting into a situation where we'd need to do that.

			regards, tom lane
On Wed, 24 Jul 2024 at 17:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > This patch starts to allow a prepared statement to continue to work even
> > when the result type changes.
>
> What this is is a wire protocol break.

Yeah, that's what I realised as well in my latest email. I have now withdrawn this patch from the commitfest to reflect that, until we get the logic for protocol bumps in: https://www.postgresql.org/message-id/flat/CAGECzQQPQO9K1YeBKe%2BE8yfpeG15cZGd3bAHexJ%2B6dpLP-6jWw%40mail.gmail.com#2386179bc970ebaf1786501f687a7bb2

> What if the client has
> previously done a Describe Statement on the prepared statement?
> We have no mechanism for notifying it that that information is
> now falsified. The error is thrown to prevent us from getting
> into a situation where we'd need to do that.

However, this makes me think of an intermediary solution. In some sense it's only really a protocol break if the result type changes between the last Describe and the current Execute. So would it be okay if a Describe triggered the proposed invalidation?