Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations - Mailing list pgsql-hackers
From | Benoit T |
---|---|
Subject | Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations |
Date | |
Msg-id | 4C0ECE11-C922-479F-9C5E-5A42F7BAF07F@gmail.com Whole thread Raw |
In response to | Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations (Michael Paquier <michael@paquier.xyz>) |
List | pgsql-hackers |
> On 14 Aug 2025, at 10:18, Michael Paquier <michael@paquier.xyz> wrote: > > That seems kind of limited to me in scope. The executor is only one > part of the system. I would have considered using an xact callback > when a transaction is aborted if I were to do a patch like the one you > are proposing, to know how many times a transaction is failing at a > specific phase, because you should know the latest query_id in this > case to be able to put a counter update in the correct slot (right?). > > +-- aborted calls tracking > +SELECT pg_sleep(0.5); > + pg_sleep > +---------- > + > +(1 row) > > Using hardcoded sleep times for deterministic tests is never a good > idea. On fast machines, they eat time for nothing. And if not > written correctly, they may not achieve their goal on slow machines > because the sleep threshold may be reached before the custom action is > taken. If you want to force a failure, you should just use a SQL that > you know would fail at execution time (based on your implementation > expects). Hello, Thanks for the review. On scope and callbacks - I prototyped a version with RegisterXactCallback and RegisterSubXactCallback but I’m not very happy with the result. Samelimitation as v4: we only have a stable identity (queryId) after parse/analyze. The xact/subxact callbacks fire withno statement context, so you still need to know “which statement was active” at abort time. - To deal with that, I keep a small backend-local tracker of the current statement’s key (userid, dbid, queryId, toplevel)when we enter the executor, mark it completed on success, and bump calls_aborted in the callbacks only if active&& !completed. - This reliably captures executor-time failures and timeouts. It cannot attribute errors that happen before post-parse (syntax/nameresolution), because queryId isn’t set yet. That’s the hard boundary. Planner failures aren’t attributed in thisversion because the tracker is set in ExecutorRun..still. On the value of callbacks - It seems callbacks alone are insufficient (no per-statement context). - If we want to preserve pgss semantics (only report statements that reached post-parse and have a queryId), the v4 approachis simpler and safer; I could look into planner error handling in a next patch version. - If we also want to count very-early errors, that implies raw-text keyed entries (no queryId), which changes pg_stat_statements’semantics (normalization, PII risk, cardinality). That probably should be separate extension as JulienRouhaud suggested: https://www.postgresql.org/message-id/aJ2ov4KYM2vX4uAs%40jrouhaud. Not sure I would be able to makethis, also I need to have something that will be supported by managed PostgreSQL with extension restrictions. Happy to adjust if you see a better way to thread “latest queryId” into the abort path without the local tracker. On pre-creating entries earlier - I considered creating entries from raw text to count very-early errors, but I’ve avoided it: pgss is keyed on the normalizedparse tree (queryId). Pre-hashing raw text weakens normalization, risks storing sensitive literals before jumbling,and adds lock churn in hot paths. Tests - Agreed on sleep flakiness. I switched to deterministic executor-time failures: duplicate key (primary key violation) anda check constraint violation. Both trip calls_aborted without timing assumptions. Also, are we okay tracking statements that never reach post-parse (i.e., no queryId)? That would effectively key on raw text,which changes pg_stat_statements’ semantics. Benoit
Attachment
pgsql-hackers by date: