Re: RFC: Logging plan of the running query - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: RFC: Logging plan of the running query |
Date | |
Msg-id | CA+TgmoYBLNSNGSumcDZB4ymVfy-aHyFa_i4n9MHRpPOLp5m+tQ@mail.gmail.com Whole thread Raw |
In response to | Re: RFC: Logging plan of the running query (torikoshia <torikoshia@oss.nttdata.com>) |
List | pgsql-hackers |
On Mon, Sep 1, 2025 at 9:35 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > Rebased just because of doc refactoring. I haven't had time to review this in a while -- sorry about that -- and I have only limited time now, but let me try to give you some comments in the time that I do have. I bet some users would really like a feature that allows them to get the plans for their own running queries. They will be sad with the limitation of this feature, that it only writes to the logs. We should probably just live with that limitation, because doing anything else seems a lot more complicated. Still, it's not great. + ereport(LOG_SERVER_ONLY, + errmsg("backend with PID %d is finishing query execution and cannot log the plan.", + MyProcPid), + errhidestmt(true), + errhidecontext(true)); + ereport(LOG_SERVER_ONLY, + errmsg("backend with PID %d is not running a query or a subtransaction is aborted", + MyProcPid), + errhidestmt(true), + errhidecontext(true)); + ereport(LOG_SERVER_ONLY, + errmsg("query plan running on backend with PID %d is:\n%s", + MyProcPid, es->str->data), + errhidestmt(true), + errhidecontext(true)); The first of these messages ends with a period, which is contrary to style guidelines. All of them do errhidestmt() and errhidecontext(), which is an unusual choice. I don't see why it's appropriate here. Wanting to see both the query and the query plan seems pretty reasonable to me. The first two messages seem fairly unhelpful to me: the user isn't going to understand the distinction between those two states and it's unclear why we should give them that information. I'm not sure if we should log a generic message in these kinds of cases or log nothing at all, but I feel like this is too much technical information. Also, I think that we don't normally put the PID of the process that is performing an action in the primary error message, because the user can include %p in log_line_prefix if they so desire. But there's another, subtler point here, which is that I feel like "is not running a query or a subtransaction is aborted" is pointing to a design defect in the patch. When we enter an inner query, we switch activeQueryDesc to point to the inner QueryDesc. When we abort a subtransaction, instead of resetting it to the outer query's QueryDesc, we reset it to NULL. I don't really think that's acceptable. Consider this: SELECT some_slow_function_that_uses_a_subtransaction_which_aborts(g) FROM generate_series(1,1000000) g; What's going to happen here is that for 99.9999% of the execution time of this function, you can't print the query plan. And that won't be because of any fundamental thing is preventing you from so doing -- it will just be because the patch doesn't have the right bookkeeping. If you added a QueryDesc * pointer to TransactionStateData, then ExecutorRun could (1) save the current value of that pointer from the topmost element of the transaction state stack, (2) update the pointer value to the new QueryDesc, and (3) put the old pointer back at the end. Then, you wouldn't need any cleanup in AbortSubTransaction, and this value would always be right, even after an outer query continues after an abort, because the subtransaction abort would pop the transaction state stack, leaving the right thing on top. The changes to miscadmin.h delete a blank line. They probably shouldn't. The only change to pquery.h is to delete a blank line. That hunk needs reverting. +-- Note that we're using a slightly more complex query here because +-- a simple 'SELECT pg_log_query_plan(pg_backend_pid())' would finish +-- before it reaches the code path that actually outputs the plan. +SELECT result FROM (SELECT pg_log_query_plan(pg_backend_pid())) result; + result +-------- + (t) +(1 row) I seriously doubt that this will be stable across the entire buildfarm. You're going to need a different approach here. Is there any real reason to rename regress_log_memory to regress_log_function, vs. just introducing a separate regress_log_plan role? -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: