Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data |
Date | |
Msg-id | 2869953.1698873621@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re:BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
|
List | pgsql-bugs |
I wrote: > However, I then started to wonder whether it's really a good idea > that the code treats PREPARE and EXECUTE alike for this purpose. > In the case of EXECUTE, the idea is that we'll still be at top > level when we reach the executor hooks, and they will do the right > thing and then increment exec_nested_level before any lower-level > statement can be reached. But how does that concept apply to > PREPARE, which won't reach the executor? ISTM the net result > is that if any subsidiary statements are reached during PREPARE > (perhaps via const-simplification of some SQL or PL function), > we will erroneously treat them as top-level. Poking at that, I found that (a) PREPARE doesn't run the planner, so probably nested statements are impossible, and (b) if we remove the check so that it bumps the nesting level, then some existing regression test outputs change. PREPARE does run parse analysis of the contained statement, and it seems that that's sensitive to the is-top-level state. So we do need to keep the exclusion for PREPARE. However, I'm still fairly down on PGSS_HANDLED_UTILITY, because I think the rationale for treating PREPARE and EXECUTE specially is different in each place where that's being used, so tying them together is more likely to cause future bugs than prevent bugs. So v3-0001 attached revises the patch per those ideas. It's only cosmetically different from before, but I think the explanatory comments are better. Meanwhile, I realized that we have a second set of bugs. As I said above, it's possible to reach nested statements during planning, as a consequence of const-simplification of an immutable or stable function. pg_stat_statements generally does the wrong thing here, incorrectly treating such statements as top-level if we weren't already nested. This needs to be fixed by including plan_nest_level as a reason to consider execution to be nested too. What's more, testing this showed that pgss_planner has the same bug as pgss_ProcessUtility: it needs to bump the nesting level even if it chooses not to track planning time. v3-0002 fixes this in a minimally invasive way by replacing each check of exec_nested_level by plan_nested_level + exec_nested_level. That's just for review/testing though. What I think we actually ought to do is fold plan_nested_level and exec_nested_level into a single variable nested_level (or nesting_level would be better English). There's no reason to keep them separate; it just adds cycles, complexity, and more risk of mistakes. Comments? regards, tom lane diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index d924c87b41..e7b29e9c99 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements t | 1 | SET pg_stat_statements.track = 'all' (7 rows) +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + t | 1 | DELETE FROM stats_track_tab + t | 1 | SELECT pg_stat_statements_reset() +(2 rows) + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + f | 1 | DELETE FROM stats_track_tab + t | 1 | DELETE FROM stats_track_tab + f | 1 | SELECT $1::TEXT + t | 1 | SELECT pg_stat_statements_reset() +(4 rows) + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE; diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 767b6ceb10..da5ef7d0b7 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -99,13 +99,6 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100; #define USAGE_DEALLOC_PERCENT 5 /* free this % of entries at once */ #define IS_STICKY(c) ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0) -/* - * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze - * ignores. - */ -#define PGSS_HANDLED_UTILITY(n) (!IsA(n, ExecuteStmt) && \ - !IsA(n, PrepareStmt)) - /* * Extension version number, for supporting older extension versions' objects */ @@ -840,12 +833,14 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate) return; /* - * Clear queryId for prepared statements related utility, as those will - * inherit from the underlying statement's one. + * If it's EXECUTE, clear the queryId so that stats will accumulate for + * the underlying PREPARE. But don't do this if we're not tracking + * utility statements, to avoid messing up another extension that might be + * tracking them. */ if (query->utilityStmt) { - if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt)) + if (pgss_track_utility && IsA(query->utilityStmt, ExecuteStmt)) { query->queryId = UINT64CONST(0); return; @@ -1097,6 +1092,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, uint64 saved_queryId = pstmt->queryId; int saved_stmt_location = pstmt->stmt_location; int saved_stmt_len = pstmt->stmt_len; + bool enabled = pgss_track_utility && pgss_enabled(exec_nested_level); /* * Force utility statements to get queryId zero. We do this even in cases @@ -1112,7 +1108,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, * that user configured another extension to handle utility statements * only. */ - if (pgss_enabled(exec_nested_level) && pgss_track_utility) + if (enabled) pstmt->queryId = UINT64CONST(0); /* @@ -1124,11 +1120,13 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, * hash table entry for the PREPARE (with hash calculated from the query * string), and then a different one with the same query string (but hash * calculated from the query tree) would be used to accumulate costs of - * ensuing EXECUTEs. This would be confusing, and inconsistent with other - * cases where planning time is not included at all. + * ensuing EXECUTEs. This would be confusing. Since PREPARE doesn't + * actually run the planner (only parse+rewrite), its costs are generally + * pretty negligible and it seems okay to just ignore it. */ - if (pgss_track_utility && pgss_enabled(exec_nested_level) && - PGSS_HANDLED_UTILITY(parsetree)) + if (enabled && + !IsA(parsetree, ExecuteStmt) && + !IsA(parsetree, PrepareStmt)) { instr_time start; instr_time duration; @@ -1206,14 +1204,41 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, } else { - if (prev_ProcessUtility) - prev_ProcessUtility(pstmt, queryString, readOnlyTree, - context, params, queryEnv, - dest, qc); - else - standard_ProcessUtility(pstmt, queryString, readOnlyTree, + /* + * Even though we're not tracking execution time for this statement, + * we must still increment the nesting level, to ensure that functions + * evaluated within it are not seen as top-level calls. But don't do + * so for EXECUTE; that way, when control reaches pgss_planner or + * pgss_ExecutorStart, we will treat the costs as top-level if + * appropriate. Likewise, don't bump for PREPARE, so that parse + * analysis will treat the statement as top-level if appropriate. + * + * To be absolutely certain we don't mess up the nesting level, + * evaluate the bump_level condition just once. + */ + bool bump_level = + !IsA(parsetree, ExecuteStmt) && + !IsA(parsetree, PrepareStmt); + + if (bump_level) + exec_nested_level++; + PG_TRY(); + { + if (prev_ProcessUtility) + prev_ProcessUtility(pstmt, queryString, readOnlyTree, context, params, queryEnv, dest, qc); + else + standard_ProcessUtility(pstmt, queryString, readOnlyTree, + context, params, queryEnv, + dest, qc); + } + PG_FINALLY(); + { + if (bump_level) + exec_nested_level--; + } + PG_END_TRY(); } } diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql index 0c20b8ce69..eab881a43b 100644 --- a/contrib/pg_stat_statements/sql/level_tracking.sql +++ b/contrib/pg_stat_statements/sql/level_tracking.sql @@ -33,6 +33,39 @@ END; $$; SELECT toplevel, calls, query FROM pg_stat_statements ORDER BY query COLLATE "C", toplevel; +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE; diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index e7b29e9c99..0b94c71c9c 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -173,6 +173,31 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; 1 | 1 | SELECT pg_stat_statements_reset() (3 rows) +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; +SELECT PLUS_THREE(8); + plus_three +------------ + 11 +(1 row) + +SELECT PLUS_THREE(10); + plus_three +------------ + 13 +(1 row) + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + toplevel | calls | rows | query +----------+-------+------+------------------------------------------------------------------------------ + t | 2 | 2 | SELECT PLUS_ONE($1) + t | 2 | 2 | SELECT PLUS_THREE($1) + t | 2 | 2 | SELECT PLUS_TWO($1) + t | 1 | 3 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + t | 1 | 1 | SELECT pg_stat_statements_reset() +(5 rows) + -- PL/pgSQL function - all-level tracking. SET pg_stat_statements.track = 'all'; SELECT pg_stat_statements_reset(); @@ -184,6 +209,7 @@ SELECT pg_stat_statements_reset(); -- we drop and recreate the functions to avoid any caching funnies DROP FUNCTION PLUS_ONE(INTEGER); DROP FUNCTION PLUS_TWO(INTEGER); +DROP FUNCTION PLUS_THREE(INTEGER); -- PL/pgSQL function CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$ DECLARE @@ -229,7 +255,34 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; 1 | 1 | SELECT pg_stat_statements_reset() (5 rows) -DROP FUNCTION PLUS_ONE(INTEGER); +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; +SELECT PLUS_THREE(8); + plus_three +------------ + 11 +(1 row) + +SELECT PLUS_THREE(10); + plus_three +------------ + 13 +(1 row) + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + toplevel | calls | rows | query +----------+-------+------+------------------------------------------------------------------------------ + f | 2 | 2 | SELECT (i + $2 + $3)::INTEGER + f | 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 + t | 2 | 2 | SELECT PLUS_ONE($1) + t | 2 | 2 | SELECT PLUS_THREE($1) + t | 2 | 2 | SELECT PLUS_TWO($1) + t | 1 | 5 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + f | 2 | 2 | SELECT i + $2 LIMIT $3 + t | 1 | 1 | SELECT pg_stat_statements_reset() +(8 rows) + -- -- pg_stat_statements.track = none -- diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index da5ef7d0b7..f1cb02e463 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -829,7 +829,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate) prev_post_parse_analyze_hook(pstate, query, jstate); /* Safety check... */ - if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level)) + if (!pgss || !pgss_hash || !pgss_enabled(plan_nested_level + exec_nested_level)) return; /* @@ -954,12 +954,26 @@ pgss_planner(Query *parse, } else { - if (prev_planner_hook) - result = prev_planner_hook(parse, query_string, cursorOptions, - boundParams); - else - result = standard_planner(parse, query_string, cursorOptions, - boundParams); + /* + * Even though we're not tracking plan time for this statement, we + * must still increment the nesting level, to ensure that functions + * evaluated during planning are not seen as top-level calls. + */ + plan_nested_level++; + PG_TRY(); + { + if (prev_planner_hook) + result = prev_planner_hook(parse, query_string, cursorOptions, + boundParams); + else + result = standard_planner(parse, query_string, cursorOptions, + boundParams); + } + PG_FINALLY(); + { + plan_nested_level--; + } + PG_END_TRY(); } return result; @@ -981,7 +995,7 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags) * counting of optimizable statements that are directly contained in * utility statements. */ - if (pgss_enabled(exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0)) + if (pgss_enabled(plan_nested_level + exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0)) { /* * Set up to track total elapsed time in ExecutorRun. Make sure the @@ -1051,7 +1065,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc) uint64 queryId = queryDesc->plannedstmt->queryId; if (queryId != UINT64CONST(0) && queryDesc->totaltime && - pgss_enabled(exec_nested_level)) + pgss_enabled(plan_nested_level + exec_nested_level)) { /* * Make sure stats accumulation is done. (Note: it's okay if several @@ -1092,7 +1106,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, uint64 saved_queryId = pstmt->queryId; int saved_stmt_location = pstmt->stmt_location; int saved_stmt_len = pstmt->stmt_len; - bool enabled = pgss_track_utility && pgss_enabled(exec_nested_level); + bool enabled = pgss_track_utility && pgss_enabled(plan_nested_level + exec_nested_level); /* * Force utility statements to get queryId zero. We do this even in cases @@ -1296,7 +1310,7 @@ pgss_store(const char *query, uint64 queryId, key.userid = GetUserId(); key.dbid = MyDatabaseId; key.queryid = queryId; - key.toplevel = (exec_nested_level == 0); + key.toplevel = (plan_nested_level + exec_nested_level == 0); /* Lookup the hash table entry with shared lock. */ LWLockAcquire(pgss->lock, LW_SHARED); diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql index eab881a43b..dcd0b04358 100644 --- a/contrib/pg_stat_statements/sql/level_tracking.sql +++ b/contrib/pg_stat_statements/sql/level_tracking.sql @@ -90,6 +90,15 @@ SELECT PLUS_ONE(10); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; + +SELECT PLUS_THREE(8); +SELECT PLUS_THREE(10); + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + -- PL/pgSQL function - all-level tracking. SET pg_stat_statements.track = 'all'; SELECT pg_stat_statements_reset(); @@ -97,6 +106,7 @@ SELECT pg_stat_statements_reset(); -- we drop and recreate the functions to avoid any caching funnies DROP FUNCTION PLUS_ONE(INTEGER); DROP FUNCTION PLUS_TWO(INTEGER); +DROP FUNCTION PLUS_THREE(INTEGER); -- PL/pgSQL function CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$ @@ -118,7 +128,15 @@ SELECT PLUS_ONE(3); SELECT PLUS_ONE(1); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -DROP FUNCTION PLUS_ONE(INTEGER); + +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; + +SELECT PLUS_THREE(8); +SELECT PLUS_THREE(10); + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- -- pg_stat_statements.track = none
pgsql-bugs by date: