Re: RFC: Logging plan of the running query - Mailing list pgsql-hackers
From: torikoshia
Subject: Re: RFC: Logging plan of the running query
Msg-id: 6143f00af4bfdba95a85cf866f4acb41@oss.nttdata.com
In response to: Re: RFC: Logging plan of the running query (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 2025-09-10 02:59, Robert Haas wrote:
> I haven't had time to review this in a while -- sorry about that --
> and I have only limited time now, but let me try to give you some
> comments in the time that I do have.

Thank you very much for taking the time to review this!

> I bet some users would really like a feature that allows them to get
> the plans for their own running queries. They will be sad with the
> limitation of this feature, that it only writes to the logs. We should
> probably just live with that limitation, because doing anything else
> seems a lot more complicated. Still, it's not great.

I agree.
Regarding output in a form other than the logs, I think the idea proposed in [1], making it possible to inspect memory contexts of other backends via a view, would be a useful reference.

[1] https://www.postgresql.org/message-id/flat/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com

> The first of these messages ends with a period, which is contrary to
> style guidelines.

Removed the period.

> All of them do errhidestmt() and errhidecontext(),
> which is an unusual choice. I don't see why it's appropriate here.
> Wanting to see both the query and the query plan seems pretty
> reasonable to me.

Removed errhidestmt() and errhidecontext().

> The first two messages seem fairly unhelpful to me:
> the user isn't going to understand the distinction between those two
> states and it's unclear why we should give them that information. I'm
> not sure if we should log a generic message in these kinds of cases or
> log nothing at all, but I feel like this is too much technical
> information.

I'm not sure whether this is the best approach, but I changed them to log nothing in these cases.
In addition to the reason you mentioned, I found that when pg_log_query_plan() is executed repeatedly at very short intervals (e.g., every 0.01 seconds), ereport() could lead to ProcessLogQueryPlanInterrupt() being invoked again inside errfinish() via CHECK_FOR_INTERRUPTS(), which resulted in a stack overflow. While this could be guarded against with check_stack_depth(), it doesn't seem worthwhile to use it just to emit a message of limited usefulness.
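Just for illustration, the check_stack_depth() option mentioned and set aside above could look roughly like the sketch below. The handler name and the message text are hypothetical, and this is not what the attached patch does (the patch simply skips logging in these states):

#include "postgres.h"
#include "miscadmin.h"

void
hypothetical_log_query_plan_handler(void)
{
    /*
     * errfinish() runs CHECK_FOR_INTERRUPTS(), which can re-enter this
     * handler; check_stack_depth() raises an ERROR before the stack
     * overflows instead of crashing.
     */
    check_stack_depth();

    /* hypothetical message of limited usefulness, per the discussion above */
    ereport(LOG_SERVER_ONLY,
            (errmsg("could not log the plan of the running query")));
}

Raising an ERROR from inside the handler hardly seems worth it just for such a message, which is why the patch logs nothing instead.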
Additionally, there are other cases where the plan cannot be logged (e.g., when the query is in a state in which ExecProcNode will not be invoked any further). So instead, I added the following note to the documentation:

  Note that depending on the execution state of the query, it may not be
  possible to log the plan.

> Also, I think that we don't normally put the PID of the
> process that is performing an action in the primary error message,
> because the user can include %p in log_line_prefix if they so desire.

That makes sense, but I put the PID in the message because I had used the following message from pg_log_backend_memory_contexts() as a reference:

  errmsg("logging memory contexts of PID %d", MyProcPid)));

Since both pg_log_query_plan() and pg_log_backend_memory_contexts() output information about another backend, it feels slightly odd that their log messages would be inconsistent. Perhaps we should consider changing pg_log_backend_memory_contexts() for consistency as well?

> But there's another, subtler point here, which is that I feel like "is
> not running a query or a subtransaction is aborted" is pointing to a
> design defect in the patch. When we enter an inner query, we switch
> activeQueryDesc to point to the inner QueryDesc. When we abort a
> subtransaction, instead of resetting it to the outer query's
> QueryDesc, we reset it to NULL. I don't really think that's
> acceptable. Consider this:
>
> SELECT some_slow_function_that_uses_a_subtransaction_which_aborts(g)
> FROM generate_series(1,1000000) g;
>
> What's going to happen here is that for 99.9999% of the execution time
> of this function, you can't print the query plan. And that won't be
> because of any fundamental thing is preventing you from so doing -- it
> will just be because the patch doesn't have the right bookkeeping. If
> you added a QueryDesc * pointer to TransactionStateData, then
> ExecutorRun could (1) save the current value of that pointer from the
> topmost element of the transaction state stack, (2) update the pointer
> value to the new QueryDesc, and (3) put the old pointer back at the
> end.

Thank you for pointing out the issue and suggesting a fix.
The attached patch adds a QueryDesc pointer to TransactionStateData and updates it accordingly. With this change, pg_log_query_plan() can now output the plan of SELECT gen_subtxn(g) FROM generate_series(1,1000000) g while it is running, as shown below:

  (session1)
  =# CREATE OR REPLACE FUNCTION gen_subtxn(i int) RETURNS int LANGUAGE plpgsql AS $$
     DECLARE
       result int;
     BEGIN
       BEGIN
         PERFORM 1 / 0;
       EXCEPTION WHEN division_by_zero THEN
         result := -i;
       END;
       RETURN result;
     END;
     $$;
  =# SELECT gen_subtxn(g) from generate_series(1,1000000) g;

  (session2)
  =# SELECT pg_log_query_plan(pid) FROM pg_stat_activity
       WHERE backend_type = 'client backend';
  =# \watch 0.1

  (in the log)
  LOG:  00000: query and its plan running on the backend are:
        Query Text: SELECT gen_subtxn(g) from generate_series(1,1000000) g;
        Function Scan on pg_catalog.generate_series g  (cost=0.00..260000.00 rows=1000000 width=4)
          Output: gen_subtxn(g)
          Function Call: generate_series(1, 1000000)

> Then, you wouldn't need any cleanup in AbortSubTransaction, and
> this value would always be right, even after an outer query continues
> after an abort, because the subtransaction abort would pop the
> transaction state stack, leaving the right thing on top.

It seems that simply making the above change is not sufficient. In particular, when an error is raised inside a subtransaction, I sometimes observed segfaults:

  BEGIN;
  savepoint x;
  CREATE TABLE koju (a INT UNIQUE);
  INSERT INTO koju VALUES (1);
  INSERT INTO koju VALUES (1); -- key duplicate error

Inspecting the crash showed that the argument ps to ExecSetExecProcNodeRecurse() was 0x7f7f7f...:

  (lldb) f 0
  frame #0: 0x000000010102ec40 postgres`ExecSetExecProcNodeRecurse(ps=0x7f7f7f7f7f7f7f7f) at execProcnode.c:601:30
     600  {
  -> 601      ExecSetExecProcNode(ps, ps->ExecProcNodeReal);

This happens when the plan-logging function is called after an ERROR is raised but before the transaction state stack is popped. So in the attached patch, as in the previous version, I reset the QueryDesc to NULL during that interval. I have also confirmed that even with this change, the earlier query (SELECT gen_subtxn(g) from generate_series(1,1000000) g;) can still have its plan logged successfully.
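For reference, here is a minimal, self-contained sketch of the save/update/restore bookkeeping described above. Every type and name in it is a stand-in chosen for illustration; it is not the actual PostgreSQL code nor the code in the attached patch.

/* stand-in for QueryDesc */
typedef struct QueryDescStub
{
    const char *sourceText;
} QueryDescStub;

/* stand-in for one entry of the transaction state stack */
typedef struct XactStateStub
{
    struct XactStateStub *parent;    /* next outer (sub)transaction */
    QueryDescStub *activeQueryDesc;  /* query whose plan could be logged */
} XactStateStub;

static XactStateStub *CurrentXactState;  /* topmost entry of the stack */

static void
run_query(QueryDescStub *queryDesc)
{
    /* (1) save the pointer held by the topmost transaction state entry */
    QueryDescStub *save = CurrentXactState->activeQueryDesc;

    /* (2) point it at the query that is about to run */
    CurrentXactState->activeQueryDesc = queryDesc;

    /*
     * ... the executor runs here; the plan-logging interrupt handler would
     * follow CurrentXactState->activeQueryDesc to find the plan to log ...
     */

    /* (3) put the old pointer back at normal completion */
    CurrentXactState->activeQueryDesc = save;
}

/*
 * A subtransaction abort simply pops the stack, so the outer query's pointer
 * is what remains on top.  As discussed above, the plan still must not be
 * logged in the window between the ERROR and this pop.
 */
static void
abort_subtransaction(void)
{
    CurrentXactState = CurrentXactState->parent;
}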
> The changes to miscadmin.h delete a blank line. They probably
> shouldn't.

Fixed.

> The only change to pquery.h is to delete a blank line. That hunk needs
> reverting.

Fixed.

> +-- Note that we're using a slightly more complex query here because
> +-- a simple 'SELECT pg_log_query_plan(pg_backend_pid())' would finish
> +-- before it reaches the code path that actually outputs the plan.
> +SELECT result FROM (SELECT pg_log_query_plan(pg_backend_pid())) result;
> + result
> +--------
> + (t)
> +(1 row)
>
> I seriously doubt that this will be stable across the entire
> buildfarm. You're going to need a different approach here.

Changed it to the following query:

  WITH t AS MATERIALIZED (SELECT pg_log_query_plan(pg_backend_pid()))
  SELECT * FROM t;

Also, since pg_log_query_plan() had little chance to log the plan when the target query uses a CTE, the attached patch adds T_CteScan handling.

> Is there any real reason to rename regress_log_memory to
> regress_log_function, vs. just introducing a separate regress_log_plan
> role?

I don't have any particular reason to insist on unifying them. In the attached patch, I created a regress_log_plan role and used it to run the same test.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.