Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations - Mailing list pgsql-hackers

From Benoit T
Subject Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations
Date
Msg-id 4C0ECE11-C922-479F-9C5E-5A42F7BAF07F@gmail.com
Whole thread Raw
In response to Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers

> On 14 Aug 2025, at 10:18, Michael Paquier <michael@paquier.xyz> wrote:
>
> That seems kind of limited to me in scope.  The executor is only one
> part of the system.  I would have considered using an xact callback
> when a transaction is aborted if I were to do a patch like the one you
> are proposing, to know how many times a transaction is failing at a
> specific phase, because you should know the latest query_id in this
> case to be able to put a counter update in the correct slot (right?).
>
> +-- aborted calls tracking
> +SELECT pg_sleep(0.5);
> + pg_sleep
> +----------
> +
> +(1 row)
>
> Using hardcoded sleep times for deterministic tests is never a good
> idea.  On fast machines, they eat time for nothing.  And if not
> written correctly, they may not achieve their goal on slow machines
> because the sleep threshold may be reached before the custom action is
> taken.  If you want to force a failure, you should just use a SQL that
> you know would fail at execution time (based on your implementation
> expects).

Hello,

Thanks for the review.

On scope and callbacks
- I prototyped a version with RegisterXactCallback and RegisterSubXactCallback but I’m not very happy with the result.
Samelimitation as v4: we only have a stable identity (queryId) after parse/analyze. The xact/subxact callbacks fire
withno statement context, so you still need to know “which statement was active” at abort time. 
- To deal with that, I keep a small backend-local tracker of the current statement’s key (userid, dbid, queryId,
toplevel)when we enter the executor, mark it completed on success, and bump calls_aborted in the callbacks only if
active&& !completed. 
- This reliably captures executor-time failures and timeouts. It cannot attribute errors that happen before post-parse
(syntax/nameresolution), because queryId isn’t set yet. That’s the hard boundary. Planner failures aren’t attributed in
thisversion because the tracker is set in ExecutorRun..still. 

On the value of callbacks
- It seems callbacks alone are insufficient (no per-statement context).
- If we want to preserve pgss semantics (only report statements that reached post-parse and have a queryId), the v4
approachis simpler and safer; I could look into planner error handling in a next patch version. 
- If we also want to count very-early errors, that implies raw-text keyed entries (no queryId), which changes
pg_stat_statements’semantics (normalization, PII risk, cardinality). That probably should be separate extension as
JulienRouhaud suggested: https://www.postgresql.org/message-id/aJ2ov4KYM2vX4uAs%40jrouhaud. Not sure I would be able to
makethis, also I need to have something that will be supported by managed PostgreSQL with extension restrictions. 

Happy to adjust if you see a better way to thread “latest queryId” into the abort path without the local tracker.

On pre-creating entries earlier
- I considered creating entries from raw text to count very-early errors, but I’ve avoided it: pgss is keyed on the
normalizedparse tree (queryId). Pre-hashing raw text weakens normalization, risks storing sensitive literals before
jumbling,and adds lock churn in hot paths. 

Tests
- Agreed on sleep flakiness. I switched to deterministic executor-time failures: duplicate key (primary key violation)
anda check constraint violation. Both trip calls_aborted without timing assumptions. 

Also, are we okay tracking statements that never reach post-parse (i.e., no queryId)? That would effectively key on raw
text,which changes pg_stat_statements’ semantics. 

Benoit


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Improve hash join's handling of tuples with null join keys
Next
From: Tom Lane
Date:
Subject: Re: Making jsonb_agg() faster