Thread: Support prepared statement invalidation when result types change
The cached plan for a prepared statement can get invalidated when DDL changes the tables used in the query, or when search_path changes. When this happens the prepared statement can still be executed, but it will be replanned in the new context. This means that the prepared statement may do something different, e.g. after a search_path change it may select data from a completely different table. This doesn't throw an error, because it is considered the responsibility of the operator and query writers that the query will still do the intended thing.

However, we would throw an error if the result of the query is of a different type than it was before:

    ERROR: cached plan must not change result type

This requirement was not documented anywhere, so it can be a surprising error to hit. But it doesn't actually need to be an error: as long as we send the correct RowDescription, a change in result types or column counts does not have to be a problem for clients.

This patch allows a prepared statement to continue to work even when the result type changes.

Without this change, all clients that automatically prepare queries as a performance optimization need to handle or avoid the error somehow, often by deallocating and re-preparing queries when it's usually not necessary. With this change, connection poolers can also safely prepare the same query only once on a connection and share that one prepared statement across clients that prepared the exact same query.
Some relevant previous discussions:
[1]: https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
[2]: https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
[3]: https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
[4]: https://github.com/pgjdbc/pgjdbc/pull/451
[5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
[6]: https://github.com/jackc/pgx/issues/927
[7]: https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
[8]: https://github.com/rails/rails/issues/12330
On Sat, Aug 26, 2023 at 1:58 AM Jelte Fennema <me@jeltef.nl> wrote:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> [...]
> With this change connection poolers can also
> safely prepare the same query only once on a connection and share this
> one prepared query across clients that prepared that exact same query.

For a prepared statement with no parameters, I tested many cases (add column, change column data type, rename column, set default, set not null), and it worked as expected.

With parameters it also works; there is only a tiny issue with error reporting:

    prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}

    ERROR:  column "q1" does not exist at character 61
    HINT:  Perhaps you meant to reference the column "pcachetest.x1".
    STATEMENT:  execute prepstmt2(1);

I think "character 61" refers to "PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;", so the STATEMENT line is slightly misleading.
On 25.08.2023 8:57 PM, Jelte Fennema wrote:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> [...]

The following assignment of formats is not correct:

         /* Do nothing if portal won't return tuples */
         if (portal->tupDesc == NULL)
    +    {
    +        /*
    +         * For SELECT like queries we delay filling in the tupDesc until after
    +         * PortalRunUtility, because we don't know what rows an EXECUTE
    +         * command will return. Because we delay setting tupDesc, we also need
    +         * to delay setting formats. We do this in a pretty hacky way, by
    +         * temporarily setting the portal formats to the passed in formats.
    +         * Then once we fill in tupDesc, we call PortalSetResultFormat again
    +         * with portal->formats to fill in the final formats value.
    +         */
    +        if (portal->strategy == PORTAL_UTIL_SELECT)
    +            portal->formats = formats;
             return;

because `formats` is created in another memory context. In postgres.c:

        /* Done storing stuff in portal's context */
        MemoryContextSwitchTo(oldContext);
        ...
        /* Get the result format codes */
        numRFormats = pq_getmsgint(input_message, 2);
        if (numRFormats > 0)
        {
            rformats = palloc_array(int16, numRFormats);
            for (int i = 0; i < numRFormats; i++)
                rformats[i] = pq_getmsgint(input_message, 2);
        }

It has to be copied, as below:

        portal->formats = (int16 *)
            MemoryContextAlloc(portal->portalContext,
                               natts * sizeof(int16));
        memcpy(portal->formats, formats, natts * sizeof(int16));

or alternatively the MemoryContextSwitchTo(oldContext) should be moved after the initialization of rformats.
On Mon, 28 Aug 2023 at 15:05, Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> The following assignment of formats is not correct:
>
> It has to be copied as below:
>
>     portal->formats = (int16 *)
>         MemoryContextAlloc(portal->portalContext,
>                            natts * sizeof(int16));
>     memcpy(portal->formats, formats, natts * sizeof(int16));

I attached a new version of the patch where I now did this. But I also moved the code around quite a bit, since all this tupDesc/formats delaying is only needed for exec_simple_query. The original changes actually broke some prepared statements that were using protocol-level Bind messages.
On Mon, 28 Aug 2023 at 11:27, jian he <jian.universality@gmail.com> wrote:
> With parameters, it also works, only a tiny issue with error reporting.
>
> prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
> WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}
> ERROR: column "q1" does not exist at character 61
> HINT: Perhaps you meant to reference the column "pcachetest.x1".
> STATEMENT: execute prepstmt2(1);
>
> I think "character 61" refer to "PREPARE prepstmt2(bigint) AS SELECT *
> FROM pcachetest WHERE q1 = $1;"
> so maybe the STATEMENT is slightly misleading.

Could you share the full set of commands that cause the reporting issue? I don't think my changes should impact this reporting, so I'm curious whether this is a new issue or an already existing one.
On Tue, Aug 29, 2023 at 12:51 AM Jelte Fennema <me@jeltef.nl> wrote:
>
> Could you share the full set of commands that cause the reporting
> issue? I don't think my changes should impact this reporting, so I'm
> curious if this is a new issue, or an already existing one.

I didn't apply your v2 patch. Full set of commands:

    regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
    SELECT 5
    regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;
    PREPARE
    regression=# alter table pcachetest rename q1 to x;
    ALTER TABLE
    regression=# EXECUTE prepstmt2(123);
    2023-08-29 17:23:59.148 CST [1382074] ERROR:  column "q1" does not exist at character 61
    2023-08-29 17:23:59.148 CST [1382074] HINT:  Perhaps you meant to reference the column "pcachetest.q2".
    2023-08-29 17:23:59.148 CST [1382074] STATEMENT:  EXECUTE prepstmt2(123);
    ERROR:  column "q1" does not exist
    HINT:  Perhaps you meant to reference the column "pcachetest.q2".
On Tue, 29 Aug 2023 at 11:29, jian he <jian.universality@gmail.com> wrote:
> regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
> [...]
> ERROR: column "q1" does not exist
> HINT: Perhaps you meant to reference the column "pcachetest.q2".

Thank you for the full set of commands. In that case the issue you're describing is completely separate from this patch. The STATEMENT: part of the log always shows the query that was received by the server. This behaviour was already present even without my patch (I double-checked with PG 15.3).
Another similar issue was reported on the PgBouncer prepared statement PR[1]. It's related to an issue that was already reported on the mailing list[2]. It turns out that invalidation of the argument types is also important in some cases. The newly added third patch in this series addresses that issue.

[1]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1309454695
[2]: https://www.postgresql.org/message-id/flat/CA%2Bmi_8YAGf9qibDFTRNKgaTwaBa1OUcteKqLAxfMmKFbo3GHZg%40mail.gmail.com
When running the Postgres JDBC tests with this patchset I found a dumb mistake in my last patch where I didn't initialize the contents of orig_params correctly. This new patchset fixes that.
On Tue, Sep 12, 2023, at 10:17 AM, Jelte Fennema wrote:
> When running the Postgres JDBC tests with this patchset I found a dumb
> mistake in my last patch where I didn't initialize the contents of
> orig_params correctly. This new patchset fixes that.
0001:
Don't you want to execute this code path only for the EXECUTE command?
PORTAL_UTIL_SELECT includes other commands such as CALL, FETCH, SHOW, and
EXPLAIN. If so, check whether commandTag is CMDTAG_EXECUTE.
Regarding tupDesc, you don't need to call FreeTupleDesc; instead you can
modify PortalStart as:

        case PORTAL_UTIL_SELECT:

            /*
             * We don't set snapshot here, because PortalRunUtility will
             * take care of it if needed.
             */
            {
                PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);

                Assert(pstmt->commandType == CMD_UTILITY);

                /*
                 * tupDesc will be filled by FillPortalStore later because
                 * it might change due to replanning when ExecuteQuery calls
                 * GetCachedPlan.
                 */
                if (portal->commandTag != CMDTAG_EXECUTE)
                    portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
            }
Regarding the commit message, ...if the the result... should be fixed. The
sentence "it's actually not needed..." could be "It doesn't need to be an error
as long as it sends the RowDescription...". The sentence "This patch starts to
allow a prepared ..." could be "This patch allows a prepared ...".
0002:
You should remove this comment because it refers to the option you are
removing.
- plan->cursor_options,
- false); /* not fixed result */
+ plan->cursor_options); /* not fixed result */
You should also remove the sentence that refers to fixed_result in
CompleteCachedPlan.
* cursor_options: options bitmask to pass to planner
* fixed_result: true to disallow future changes in query's result tupdesc
*/
void
CompleteCachedPlan(CachedPlanSource *plansource,
                   List *querytree_list,
                   MemoryContext querytree_context,
0003:
You should initialize the new parameters (orig_param_types and orig_num_params)
in CreateCachedPlan. One suggestion is to move the following code to
CompleteCachedPlan because plansource->param_types are assigned there.
@@ -108,6 +108,10 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
 		argtypes[i++] = toid;
 	}
+
+	plansource->orig_num_params = nargs;
+	plansource->orig_param_types =
+		MemoryContextAlloc(plansource->context, nargs * sizeof(Oid));
+	memcpy(plansource->orig_param_types, argtypes, nargs * sizeof(Oid));
 }
This comment is confusing. Since the new function
(GetCachedPlanFromRevalidated) contains almost all code from GetCachedPlan, its
content is the same as the *previous* GetCachedPlan function. You could expand
this comment a bit to make it clear that it contains the logic to decide
between generic x custom plan. I don't like the function name but have only a
not-so-good suggestion: GetCachedPlanAfterRevalidate. I also don't like the
revalidationResult as a variable name. Why don't you keep qlist? Or use a name
near query-tree list (query_list? qtlist?). s/te caller/the caller/
+ * GetCachedPlanFromRevalidated: is the same as get GetCachedPlan, but requires
+ * te caller to first revalidate the query. This is needed for callers that
+ * need to use the revalidated plan to generate boundParams.
+ */
+CachedPlan *
+GetCachedPlanFromRevalidated(CachedPlanSource *plansource,
+ ParamListInfo boundParams,
+ ResourceOwner owner,
+ QueryEnvironment *queryEnv,
+ List *revalidationResult)
Are these names accurate? The original are the current ones; new ones are
"runtime" data. It would be good to explain why a new array is required.
Oid *param_types; /* array of parameter type OIDs, or NULL */
int num_params; /* length of param_types array */
+ Oid *orig_param_types; /* array of original parameter type OIDs,
+ * or NULL */
+ int orig_num_params; /* length of orig_param_types array */
You should expand the commit message a bit: explain this feature and
document the behavior change.
Hi,
> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.
What if a client has *cached* an old version of the RowDescription,
and the server changed it to something new and sent result data
with the new RowDescription? Will the client still be able to work
as expected?

I hope my concern is unfounded, since I didn't go through any of
the drivers in detail, but I hope the concern itself is expressed clearly.
Best Regards
Andy Fan
@Euler thanks for the review. I addressed the feedback.

On Fri, 15 Sept 2023 at 01:41, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> What if a client has *cached* an old version of RowDescription
> and the server changed it to something new and sent resultdata
> with the new RowDescription. Will the client still be able to work
> expectly?

It depends a bit on the exact change. For instance, a column being added to the end of the result data shouldn't be a problem. And that is actually quite a common case for this issue:

1. PREPARE p AS (SELECT * FROM t);
2. ALTER TABLE t ADD COLUMN ...
3. EXECUTE p

But type changes of existing columns might cause issues when the RowDescription is cached. Such changes also cause issues now, though: currently the prepared statement becomes unusable when this happens (returning errors every time). With this patch it's at least possible to have prepared statements continue working in many cases. Furthermore, caching RowDescription is not super useful: most clients request it every time, because doing so does not require an extra round trip, so there's almost no overhead in requesting it.

Clients caching ParameterDescription seems more useful, because fetching the parameter types does require an extra round trip. So caching it could cause errors with 0003. But right now, if the argument types need to change, executing the prepared statement gives an error every time. So I believe 0003 is still an improvement over the status quo, because there are many cases where the client knows that the types might have changed and that it thus needs to re-fetch the ParameterDescription; the most common case is changing the search_path. And there are also cases where even a cached ParameterDescription will work fine, e.g. when the type is changed but the encoding stays the same (drop + create an enum, or text/varchar, or the text encoding of int and bigint).
On Mon, 18 Sept 2023 at 18:01, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> @Euler thanks for the review. I addressed the feedback.
> [...]

One of the tests has aborted in CFBot at [1] with:

    [05:26:16.214] Core was generated by `postgres: postgres regression_pg_stat_statements [local] EXECUTE '.
    [05:26:16.214] Program terminated with signal SIGABRT, Aborted.
    [05:26:16.214] #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    [05:26:16.214] Download failed: Invalid argument.  Continuing without source file ./signal/../sysdeps/unix/sysv/linux/raise.c.
    [05:26:16.392]
    [05:26:16.392] Thread 1 (Thread 0x7fbe1d997a40 (LWP 28738)):
    [05:26:16.392] #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    ....
    [05:26:36.911] #5  0x000055c5aa523e71 in RevalidateCachedQuery (plansource=0x55c5ac811cf0, queryEnv=queryEnv@entry=0x0) at ../src/backend/utils/cache/plancache.c:730
    [05:26:36.911]         num_params = 0
    [05:26:36.911]         param_types = 0x55c5ac860438
    [05:26:36.911]         snapshot_set = false
    [05:26:36.911]         rawtree = 0x55c5ac859f08
    [05:26:36.911]         tlist = <optimized out>
    [05:26:36.911]         qlist = <optimized out>
    [05:26:36.911]         resultDesc = <optimized out>
    [05:26:36.911]         querytree_context = <optimized out>
    [05:26:36.911]         oldcxt = <optimized out>
    [05:26:36.911] #6  0x000055c5a9de0262 in ExplainExecuteQuery (execstmt=0x55c5ac6d9d38, into=0x0, es=0x55c5ac859648, queryString=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;", params=0x0, queryEnv=0x0) at ../src/backend/commands/prepare.c:594
    [05:26:36.911]         entry = 0x55c5ac839ba8
    [05:26:36.911]         query_string = <optimized out>
    [05:26:36.911]         cplan = <optimized out>
    [05:26:36.911]         plan_list = <optimized out>
    [05:26:36.911]         p = <optimized out>
    [05:26:36.911]         paramLI = 0x0
    [05:26:36.911]         estate = 0x0
    [05:26:36.911]         planstart = {ticks = <optimized out>}
    [05:26:36.911]         planduration = {ticks = 1103806595203}
    [05:26:36.911]         bufusage_start = {shared_blks_hit = 1, shared_blks_read = 16443, shared_blks_dirtied = 16443, shared_blks_written = 8, local_blks_hit = 94307489783240, local_blks_read = 94307451894117, local_blks_dirtied = 94307489783240, local_blks_written = 140727004487184, temp_blks_read = 0, temp_blks_written = 94307489783416, shared_blk_read_time = {ticks = 0}, shared_blk_write_time = {ticks = 94307489780192}, local_blk_read_time = {ticks = 0}, local_blk_write_time = {ticks = 94307491025040}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 0}}
    [05:26:36.911]         bufusage = {shared_blks_hit = 140727004486192, shared_blks_read = 140061866319832, shared_blks_dirtied = 8, shared_blks_written = 94307447196988, local_blks_hit = 34359738376, local_blks_read = 94307489783240, local_blks_dirtied = 70622147264512, local_blks_written = 94307491357144, temp_blks_read = 140061866319832, temp_blks_written = 94307489783240, shared_blk_read_time = {ticks = 140727004486192}, shared_blk_write_time = {ticks = 140061866319832}, local_blk_read_time = {ticks = 8}, local_blk_write_time = {ticks = 94307489783240}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 94307447197163}}
    [05:26:36.911]         revalidationResult = <optimized out>
    [05:26:36.911] #7  0x000055c5a9daa387 in ExplainOneUtility (utilityStmt=0x55c5ac6d9d38, into=into@entry=0x0, es=es@entry=0x55c5ac859648, queryString=queryString@entry=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;", params=params@entry=0x0, queryEnv=queryEnv@entry=0x0) at ../src/backend/commands/explain.c:495
    [05:26:36.911]         __func__ = "ExplainOneUtility"

[1] - https://cirrus-ci.com/task/5770112389611520?logs=cores#L71

Regards,
Vignesh
On Mon, Sep 18, 2023 at 1:31 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.
Just to point out, FWIW, that the .NET Npgsql driver does indeed cache RowDescriptions... The whole point of preparation is to optimize things as much as possible for repeated execution of the query; I get that the value there is much lower than e.g. doing another network roundtrip, but that's still extra work that's better off being cut if it can be.
On Sun, 7 Jan 2024 at 07:55, vignesh C <vignesh21@gmail.com> wrote:
> One of the test has aborted in CFBot at [1] with:

Rebased the patchset and removed patch 0003, since it was causing the CI issue reported by Vignesh, and it seems much less useful and more controversial to me anyway (due to the extra required round trip).

On Sun, 7 Jan 2024 at 09:17, Shay Rojansky <roji@roji.org> wrote:
> Just to point out, FWIW, that the .NET Npgsql driver does indeed cache
> RowDescriptions... The whole point of preparation is to optimize things
> as much as possible for repeated execution of the query; I get that the
> value there is much lower than e.g. doing another network roundtrip, but
> that's still extra work that's better off being cut if it can be.

Hmm, interesting. I totally agree that it's always good to do less when possible. The problem is that in the face of server-side prepared statement invalidations, due to DDL changes to the table or search_path changes, the row types might change. Or the server needs to constantly throw an error, like it does now, but that seems worse imho.

I'm wondering though if we can create a middle ground, where a client can still cache the RowDescription client-side when no DDL or search_path changes are happening, but is still told about a new RowDescription when one of those does happen. The only way of doing that I can think of is changing the Postgres protocol in a way similar to this: allow Execute to return a RowDescription too, but have the server only do so when the previously received RowDescription for this prepared statement is now invalid.

This would definitely require some additional tracking in PgBouncer to make it work, i.e. per client and per server it should now keep track of the last RowDescription for each prepared statement. But that's definitely something we could do. It would make this patch much more involved though, as it would suddenly involve an actual protocol change, and that basically depends on this patch moving forward[1].

[1]: https://www.postgresql.org/message-id/flat/CAGECzQTg2hcmb5GaU53uuWcdC7gCNJFLL6mnW0WNhWHgq9UTgw@mail.gmail.com
Jelte Fennema <me@jeltef.nl> writes:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes.
> ...
> However, we would throw an error if the the result of the query is of a
> different type than it was before:
> ERROR: cached plan must not change result type

Yes, this is intentional.

> This patch starts to allow a prepared statement to continue to work even
> when the result type changes.

What this is is a wire protocol break. What if the client has previously done a Describe Statement on the prepared statement? We have no mechanism for notifying it that that information is now falsified. The error is thrown to prevent us from getting into a situation where we'd need to do that.

			regards, tom lane
On Wed, 24 Jul 2024 at 17:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > This patch starts to allow a prepared statement to continue to work even
> > when the result type changes.
>
> What this is is a wire protocol break.

Yeah, that's what I realised as well in my latest email. I have now withdrawn this patch from the commitfest to reflect that, until we get the logic for protocol bumps in: https://www.postgresql.org/message-id/flat/CAGECzQQPQO9K1YeBKe%2BE8yfpeG15cZGd3bAHexJ%2B6dpLP-6jWw%40mail.gmail.com#2386179bc970ebaf1786501f687a7bb2

> What if the client has
> previously done a Describe Statement on the prepared statement?
> We have no mechanism for notifying it that that information is
> now falsified. The error is thrown to prevent us from getting
> into a situation where we'd need to do that.

However, this makes me think of an intermediary solution. In some sense it's only really a protocol break if the result type changes between the last Describe and the current Execute. So would it be okay if a Describe triggered the proposed invalidation?