Thread: So why is EXPLAIN printing only *plan* time?
... and not, in particular, parse analysis or rewrite time? I'd been a bit suspicious of the recent patch to add $SUBJECT without the other pre-execution components, but it just now occurred to me that there's at least one reason why this might be a significant omission: any delay caused by waiting to acquire locks on the query's tables will be spent in the parser. It's probably okay to omit raw parsing time (that is, flex+bison), even though profiling sometimes suggests that that's a nontrivial cost. It's completely predictable and more or less linear in the query string length, so there are not likely to be any surprises for users in that time. But it's not at all clear to me that the same can be said for parse analysis or rewrite times. Rewrite timing could easily be captured by EXPLAIN since that call is done within ExplainQuery(). Parse analysis isn't, but we could imagine having transformExplainStmt() time the operation and stick the result into a new field in struct ExplainStmt. I'm not sure if it'd be appropriate to add all of these measurements as separate printout lines; arguably we should just fold them into "planning time". Thoughts? regards, tom lane
On 04/27/2014 07:07 PM, Tom Lane wrote: > Rewrite timing could easily be captured by EXPLAIN since that call > is done within ExplainQuery(). Parse analysis isn't, but we could > imagine having transformExplainStmt() time the operation and stick > the result into a new field in struct ExplainStmt. > > I'm not sure if it'd be appropriate to add all of these measurements > as separate printout lines; arguably we should just fold them into > "planning time". > > Thoughts? I think folding them all (except flex+bison) into Planning time makes sense since while the sum total sum is interesting to users you would not want every EXPLAIN plan to be filled with timings. If it is interesting to print them separately I suggest that is done as an option to EXPLAIN, and not by default. Andreas
Andreas Karlsson <andreas@proxel.se> writes: > On 04/27/2014 07:07 PM, Tom Lane wrote: >> Rewrite timing could easily be captured by EXPLAIN since that call >> is done within ExplainQuery(). Parse analysis isn't, but we could >> imagine having transformExplainStmt() time the operation and stick >> the result into a new field in struct ExplainStmt. >> >> I'm not sure if it'd be appropriate to add all of these measurements >> as separate printout lines; arguably we should just fold them into >> "planning time". >> >> Thoughts? > I think folding them all (except flex+bison) into Planning time makes > sense since while the sum total sum is interesting to users you would > not want every EXPLAIN plan to be filled with timings. > If it is interesting to print them separately I suggest that is done as > an option to EXPLAIN, and not by default. I did a bit of experimentation with a quick-n-dirty prototype and confirmed the expectation that parse analysis + rewrite is usually significantly less than planning time. It's not always negligible by comparison, but it's usually so. So if it weren't for the lock acquisition issue, it's not clear that it'd be worth accounting for. A presentation problem that I'm not sure what to do with is that if the original Query expands to multiple queries during rewrite (as a result of DO ALSO rules), there will be multiple plan trees and multiple planning times to display, but we have only one parse analysis measurement and only one rewrite measurement to offer. Adding those times into *each* planning time report would clearly be misleading. We could print them separately (but I agree with Andreas that this'd mostly be clutter). Or we could add them into just the first planning-time printout, though that might also be misleading. So at this point I'm not entirely sure if this is worth worrying about, or if we should just leave the existing code alone. But if we ship 9.4 like this, we probably don't get to change our minds later. Comments? regards, tom lane
Tom Lane-2 wrote > Or we could add them into > just the first planning-time printout, though that might also be > misleading. If you are going to show a number for these steps, which seems like a good idea, then this seems like a reasonable option in the face of this situation. Basically two options: 1. Query overhead timing detailed separately 2. Query overhead included with first plans' planning time An explain option: (OVERHEAD '[off?|detail|first]') would then give the user the ability to choose between the two (not sure if off is worthwhile but worth at least rejecting explicitly). David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/So-why-is-EXPLAIN-printing-only-plan-time-tp5801663p5801674.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > I'd been a bit suspicious of the recent patch to add $SUBJECT > without the other pre-execution components, but it just now > occurred to me that there's at least one reason why this might > be a significant omission: any delay caused by waiting to acquire > locks on the query's tables will be spent in the parser. [...] > I'm not sure if it'd be appropriate to add all of these measurements > as separate printout lines; arguably we should just fold them into > "planning time". > > Thoughts? Having a distinction between "time spent waiting on locks" (even just "waited on locks" as a boolean) would be very nice, imv. Having the time spent would be best, provided it doesn't add too much. As for individual print-out lines, perhaps we should consider putting them on the same line? Maybe: Planning time X.Y (waiting for locks: X.Y, other stuff: X.Y). Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> I'd been a bit suspicious of the recent patch to add $SUBJECT >> without the other pre-execution components, but it just now >> occurred to me that there's at least one reason why this might >> be a significant omission: any delay caused by waiting to acquire >> locks on the query's tables will be spent in the parser. > Having a distinction between "time spent waiting on locks" (even > just "waited on locks" as a boolean) would be very nice, imv. Having > the time spent would be best, provided it doesn't add too much. We've already got log_lock_waits. I'm not that eager to try to make EXPLAIN print the same info, and even if I was, it would be a large and invasive patch. The concern I had here was just that if an EXPLAIN does get delayed by parse-time lock waits, it'd be nice if the printed times added up to something approaching the observed runtime. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > >> I'd been a bit suspicious of the recent patch to add $SUBJECT > >> without the other pre-execution components, but it just now > >> occurred to me that there's at least one reason why this might > >> be a significant omission: any delay caused by waiting to acquire > >> locks on the query's tables will be spent in the parser. > > > Having a distinction between "time spent waiting on locks" (even > > just "waited on locks" as a boolean) would be very nice, imv. Having > > the time spent would be best, provided it doesn't add too much. > > We've already got log_lock_waits. I'm not that eager to try to make > EXPLAIN print the same info, and even if I was, it would be a large and > invasive patch. As log_lock_waits only writes to the log file (a fact that I complained about when we discussed its original implementation), I could see having that info in EXPLAIN as useful. Of course, locks are also transient, so this isn't as useful since it would only show under EXPLAIN. I'd still love an option to have an async message sent to the client after lock timeout which could let an end-user know that the query is waiting on a lock. > The concern I had here was just that if an EXPLAIN does > get delayed by parse-time lock waits, it'd be nice if the printed times > added up to something approaching the observed runtime. I agree with this. Thanks, Stephen
Tom, > I'd been a bit suspicious of the recent patch to add $SUBJECT > without the other pre-execution components, but it just now > occurred to me that there's at least one reason why this might > be a significant omission: any delay caused by waiting to acquire > locks on the query's tables will be spent in the parser. Well, if I could have a pony, I would like the ability to break out lockwait vs. other planning time. The two can be significantly different, especially in cases where users are trying to pin down why a particular query has a high degree of variance in response times. > It's probably okay to omit raw parsing time (that is, flex+bison), > even though profiling sometimes suggests that that's a nontrivial > cost. It's completely predictable and more or less linear in the > query string length, so there are not likely to be any surprises > for users in that time. Just because it's predictable doesn't mean that users expect it. Frankly, I'd like to have raw parse time available so that I could show users exactly how long it takes to process a query with an IN() list 1000 items long, as a demonstration of why they shouldn't do that. > We've already got log_lock_waits. That's not useful for interactive query debugging, especially since log_lock_waits only returns lock waits which are longer than deadlock_timeout. > I'm not that eager to try to make > EXPLAIN print the same info, and even if I was, it would be a large and > invasive patch. I understand that we're not getting that for 9.4, but I'd like to keep it on the table in case someone wants to work on it in the future. It would be a useful feature. > The concern I had here was just that if an EXPLAIN does > get delayed by parse-time lock waits, it'd be nice if the printed times > added up to something approaching the observed runtime. Yes, definitely. Maybe we should call it something other than "planning time", so that when we have the ability to display each stage separately in the future, we don't confuse people? Maybe "query setup time"? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 27 April 2014 22:38, Tom Lane Wrote: > ... and not, in particular, parse analysis or rewrite time? > > I'd been a bit suspicious of the recent patch to add $SUBJECT without > the other pre-execution components, but it just now occurred to me that > there's at least one reason why this might be a significant omission: > any delay caused by waiting to acquire locks on the query's tables will > be spent in the parser. > > It's probably okay to omit raw parsing time (that is, flex+bison), even > though profiling sometimes suggests that that's a nontrivial cost. > It's completely predictable and more or less linear in the query string > length, so there are not likely to be any surprises for users in that > time. But it's not at all clear to me that the same can be said for > parse analysis or rewrite times. > > Rewrite timing could easily be captured by EXPLAIN since that call is > done within ExplainQuery(). Parse analysis isn't, but we could imagine > having transformExplainStmt() time the operation and stick the result > into a new field in struct ExplainStmt. > > I'm not sure if it'd be appropriate to add all of these measurements as > separate printout lines; arguably we should just fold them into > "planning time". > > Thoughts? I had understanding that the "Planning time" printed along with EXPLAIN is only for the purpose to see how much time is spent in creating plan for execution. If we start including all previous (i.e. parsing + analyzing + rewrite) time also in "Planning time" then, it might lose whole of the meaning of printing this. If we add time spent in locking, then "Planning time" for the same query with same statistics will become variable depending on the amount of time it had to wait to acquire lock, which will be bit confusing for users. May be if we really have to print other time (i.e. parsing + analyzing + rewrite), then IMHO we can print with some different name instead of including in "Planning time" only. Thanks and Regards, Kumar Rajeev Rastogi
On Sun, Apr 27, 2014 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > ... and not, in particular, parse analysis or rewrite time? I think breaking those out would be a good idea. Especially rewrite time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Apr 27, 2014 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> ... and not, in particular, parse analysis or rewrite time? > I think breaking those out would be a good idea. Especially rewrite time. Rewrite time seems generally negligible in comparison to the other two components, at least in the simple testing I did yesterday. It would only be significant if you were expanding some complicated views, in which case planning time would almost surely dominate anyway. Anyway, I'm starting to come to the conclusion that the idea of silently adding parse/rewrite time into the "planning time" line isn't such a good one. So there may or may not be sufficient interest in the other numbers to justify adding them as separate lines later --- but the key word there is "later". I now think we should leave "planning time" as it's currently defined, which means we don't need to address this issue for 9.4. regards, tom lane
On Mon, Apr 28, 2014 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Apr 27, 2014 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> ... and not, in particular, parse analysis or rewrite time? > >> I think breaking those out would be a good idea. Especially rewrite time. > > Rewrite time seems generally negligible in comparison to the other two > components, at least in the simple testing I did yesterday. It would > only be significant if you were expanding some complicated views, in > which case planning time would almost surely dominate anyway. > > Anyway, I'm starting to come to the conclusion that the idea of silently > adding parse/rewrite time into the "planning time" line isn't such a good > one. So there may or may not be sufficient interest in the other numbers > to justify adding them as separate lines later --- but the key word there > is "later". I now think we should leave "planning time" as it's currently > defined, which means we don't need to address this issue for 9.4. Works for me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company