RE: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers
From | legrand legrand |
---|---|
Subject | RE: Planning counters in pg_stat_statements (using pgss_store) |
Date | |
Msg-id | 1584180240397-0.post@n3.nabble.com Whole thread Raw |
In response to | RE: Planning counters in pg_stat_statements (using pgss_store) ("imai.yoshikazu@fujitsu.com" <imai.yoshikazu@fujitsu.com>) |
Responses |
RE: Planning counters in pg_stat_statements (using pgss_store)
Re: Planning counters in pg_stat_statements (using pgss_store) |
List | pgsql-hackers |
imai.yoshikazu@fujitsu.com wrote > On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote: >> On Thu, Mar 12, 2020 at 1:11 PM Marco Slot < > marco.slot@ > > wrote: >> > On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud < > rjuju123@ > > >> wrote: >> > > There's at least the current version of IVM patchset that lacks the >> > > querytext. Looking at various extensions, I see that pg_background >> > > and pglogical call pg_plan_query internally but shouldn't have any >> > > issue providing the query text. But there's also citus extension, >> > > which don't keep around the query string at least when distributing >> > > plans, which makes sense since it's of no use and they're heavily >> > > modifying the original Query. I think that citus folks opinion on >> > > the subject would be useful, so I'm Cc-ing Marco. >> > >> > Most of the time we keep our Query * data structures in a form that >> > can be deparsed back into a query string by a modified copy of >> > ruleutils.c, so we could generate a correct query string if absolutely >> > necessary, though there are performance-sensitive cases where we'd >> > rather not have the deparsing overhead. >> >> Yes, deparsing is probably too expensive for that use case. >> >> > In case of INSERT..SELECT into a distributed table, we might call >> > pg_plan_query on the SELECT part (Query *) and send the output into a >> > DestReceiver that sends tuples into shards of the distributed table >> > via COPY. The fact that SELECT does not show up in pg_stat_statements >> > separately is generally fine because it's an implementation detail, >> > and it would probably be a little confusing because the user never ran >> > the SELECT query. Moreover, the call to pg_plan_query would already be >> > reflected in the planning or execution time of the top-level query, so >> > it would be double counted if it had its own entry. >> >> Well, surprising statements can already appears in pg_stat_statements >> when >> you use some psql features, or if you have triggers as those will run >> additional >> queries under the hood. >> >> The difference here is that since citus is a CustomNode, underlying calls >> to >> planner will be accounted for that node, and that's indeed annoying. I >> can see >> that citus is doing some calls to spi_exec or >> Executor* (in ExecuteLocalTaskPlan), which could also trigger >> pg_stat_statements, but I don't know if a queryid is present there. >> >> > Another case is when some of the shards turn out to be local to the >> > server that handles the distributed query. In that case we plan the >> > queries on those shards via pg_plan_query instead of deparsing and >> > sending the query string to a remote server. It would be less >> > confusing for these queries to show in pg_stat_statements, because the >> > queries on the shards on remote servers will show up as well. However, >> > this is a performance-sensitive code path where we'd rather avoid >> > deparsing. >> >> Agreed. >> >> > In general, I'd prefer if there was no requirement to pass a correct >> > query string. I'm ok with passing "SELECT 'citus_internal'" or just "" >> > if that does not lead to issues. Passing NULL to signal that the >> > planner call should not be tracked separately does seem a bit cleaner. >> >> That's very interesting feedback, thanks! I'm not a fan of giving a way >> for >> queries to say that they want to be ignored by pg_stat_statements, but >> double >> counting the planning counters seem even worse, so I'm +0.5 to accept >> NULL >> query string in the planner, incidentally making pgss ignore those. > > It is preferable that we can count various queries statistics as much as > possible > but if it causes overhead even when without using pg_stat_statements, we > would > not have to force them to set appropriate query_text. > About settings a fixed string in query_text, I think it doesn't make much > sense > because users can't take any actions by seeing those queries' stats. > Moreover, if > we set a fixed string in query_text to avoid pg_stat_statement's errors, > codes > would be inexplicable for other developers who don't know there's such > requirements. > After all, I agree accepting NULL query string in the planner. > > I don't know it is useful but there are also codes that avoid an error > when > sourceText is NULL. > > executor_errposition(EState *estate, int location) > { > ... > /* Can't do anything if source text is not available */ > if (estate == NULL || estate->es_sourceText == NULL) > } > > -- > Yoshikazu Imai Hello Imai, My understanding of V5 patch, that checks for Non-Zero queryid, manage properly case where sourceText is NULL. A NULL sourceText means that there was no Parsing for the associated query, if there was no Parsing, there is no queryid (queryId=0), and no planning counters update. It doesn't change pg_plan_query behaviour (no regression for Citus, IVM, ...), and was tested with success for IVM. If my understanding is wrong, then setting track_planning = false would still be the work arround for the very rare (no core) extension(s) that may hit the NULL query text assertion failure. What do you think about this ? Would this make V5 patch ready for committers ? Thanks in advance. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
pgsql-hackers by date: