Thread: Plan targetlists in EXPLAIN output
For debugging the planner work I'm about to do, I'm expecting it will be useful to be able to get EXPLAIN to print the targetlist of each plan node, not just the quals (conditions) as it's historically done. My first instinct is just to stick in the code under a debugging #ifdef, but I wonder if anyone wants to argue for making it more easily available? I think it'd be a mistake to turn it on by default, because it'd add a line for every plan node, which'd be an awful lot of bloat in output that's hard enough to read already. And experience has shown that 99.99% of the time people don't need the info. Still, there's that other 0.01%. I'm tempted to propose redefining the currently-nearly-useless EXPLAIN VERBOSE option as doing this. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > For debugging the planner work I'm about to do, I'm expecting it will be > useful to be able to get EXPLAIN to print the targetlist of each plan > node, not just the quals (conditions) as it's historically done. My > first instinct is just to stick in the code under a debugging #ifdef, > but I wonder if anyone wants to argue for making it more easily > available? Yes please. > I think it'd be a mistake to turn it on by default, because it'd add a > line for every plan node, which'd be an awful lot of bloat in output > that's hard enough to read already. And experience has shown that > 99.99% of the time people don't need the info. Still, there's that > other 0.01%. > > I'm tempted to propose redefining the currently-nearly-useless > EXPLAIN VERBOSE option as doing this. EXPLAIN VERBOSE is indeed ridiculous. The only downside is that people following modern instructions on old installs will be sad. But I'm fine with that. IMHO You could also move "width" to VERBOSE while you're at it. In fact you'll probably want width in precisely the same cases where you want the target list. I think down the road we'll have a few different independent data sets you can get out of explain or at least explain analyze. I want to get i/o stats in there which I think you'll want to turn on and off as a group, for example. But perhaps by the time we do that someone will have done XML explain and it'll be irrelevant. I can't think of any nice syntax to do that offhand anyways. So +1 for just redefining VERBOSE. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Gregory Stark <stark@enterprisedb.com> writes: > EXPLAIN VERBOSE is indeed ridiculous. There are other ways to get that printout, too, if you really do need it. > IMHO You could also move "width" to VERBOSE while you're at it. In fact you'll > probably want width in precisely the same cases where you want the target > list. I'm not convinced. The width is often useful to understand why the planner did something (eg, chose a hash plan or not). The exact contents of the targetlist are usually not nearly as interesting. > So +1 for just redefining VERBOSE. Barring other objections I'll go do that. BTW, while testing the code I already found a bug: regression=# set enable_hashagg to 0; SET regression=# explain select thousand from tenk1 group by 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------Group (cost=1122.39..1172.39 rows=998 width=4) Output: thousand -> Sort (cost=1122.39..1147.39 rows=10000 width=4) Output:unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1,stringu2, string4 Sort Key: thousand -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous,odd, even, stringu1, stringu2, string4 (7 rows) Only the "thousand" column is needed, so why is it emitting all columns? It's evidently allowing the "use physical tlist" optimization to fire, which saves cycles inside the SeqScan node --- but in this context that's penny-wise and pound-foolish, because we're pumping useless data through the Sort. There is code in the planner that's supposed to notice the needs of the next level up, but it's not getting this case right for some reason... regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > I'm not convinced. The width is often useful to understand why the > planner did something (eg, chose a hash plan or not). The exact > contents of the targetlist are usually not nearly as interesting. I've never seen a single post on any of the lists where anyone went through that exercise though. > -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) > Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd,even, stringu1, stringu2, string4 I wonder if I even understand what width means. Or does the planner think most of these columns are mostly null? Or is it estimating the width based on the belief that only the thousand column is actually going to be emitted? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes: > Or is it estimating the width based on the belief that only the thousand > column is actually going to be emitted? Right. The width is used to estimate how much space would be needed for, eg, sorting or hashing the plan node's output. In any case where something like that is actually happening, we *should* be emitting only the required columns, so I didn't see any particular need to make use_physical_tlist change the reported width. OTOH this bug shows that maybe that was hiding useful information ... regards, tom lane
On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote: > I'm tempted to propose redefining the currently-nearly-useless > EXPLAIN VERBOSE option as doing this. Yes please. Sounds like a good home for other useful things also. I'd like to have an EXPLAIN mode that displayed the plan without *any* changeable info (i.e. no costs, row counts etc). This would then allow more easy determination of whether plans had changed over time. (But EXPLAIN TERSE sounds silly). -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Thu, 17 Apr 2008, Tom Lane wrote: > For debugging the planner work I'm about to do, I'm expecting it will be > useful to be able to get EXPLAIN to print the targetlist of each plan > node, not just the quals (conditions) as it's historically done. I've heard that some of the academic users of PostgreSQL were hoping to add features in this area in order to allow better using planner internals for educational purposes. It would be nice if that were available for such purposes without having to recompile. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, 17 Apr 2008 20:42:49 +0200, Simon Riggs <simon@2ndquadrant.com> wrote: > On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote: > >> I'm tempted to propose redefining the currently-nearly-useless >> EXPLAIN VERBOSE option as doing this. > > Yes please. > > Sounds like a good home for other useful things also. > > I'd like to have an EXPLAIN mode that displayed the plan without *any* > changeable info (i.e. no costs, row counts etc). This would then allow > more easy determination of whether plans had changed over time. (But > EXPLAIN TERSE sounds silly). > Plan = TreeTree = XML EXPLAIN ANALYZE SELECT * FROM test NATURAL JOIN test2 WHERE id =ANY('{3,666,975,521'}); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- NestedLoop (cost=17.04..65.13 rows=1 width=8) (actual time=51.835..51.835 rows=0 loops=1) Join Filter: (test.value = test2.value) -> Bitmap Heap Scan on test (cost=17.04..31.96rows=4 width=8) (actual time=16.622..16.631 rows=4 loops=1) Recheck Cond: (id = ANY ('{3,666,975,521}'::integer[])) -> BitmapIndex Scan on test_pkey (cost=0.00..17.04 rows=4 width=0) (actual time=16.613..16.613 rows=4 loops=1) Index Cond: (id = ANY ('{3,666,975,521}'::integer[])) -> Index Scan using test2_pkey on test2 (cost=0.00..8.28 rows=1 width=8) (actual time=8.794..8.795 rows=1 loops=4) Index Cond: (test2.id = test.id) EXPLAIN XML ... <NestedLoop> <Join Filter="(test.value = test2.value)"> <BitmapHeapScan Target="test" RecheckCond="(id) = ANY ($1)"/> <BitmapIndexScan Index="test_pkey" Cond="id = ANY ('$1'::integer[]))" /></Join><IndexScan Index="test2_pkey" Target="test2"Cond="test2.id = test.id" /> </NestedLoop> Nicely parsable and displayable in all its glory in pgadmin ;)
PFC wrote: > Plan = Tree > Tree = XML If you want to propose a DTD I'm sure there would be many people interested. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
I have been working on a project (for GSOC) to retrieve planner/optimizer details. As part of the project, I need machine parsable output. So, I thought I would dust off a patch I found from last year that Germán Caamaño submitted. I didn't see any further activity there so I integrated it into 8.4 and added a DTD. The output below is generated by using the added flag 'XML' to the EXPLAIN command. The DTD probably wouldn't be needed for every output instance and may need its own flag. I am coming up to speed on the planner internals, but it seems like this first EXPLAIN XML concept may have some use. Are there any strong opinions about the XML hierarchy? Is it enough to simply wrap the text output from EXPLAIN with XML tags? -Tom Raney QUERY PLAN -------------------------------------------------------------------<?xml version="1.0"?> <!DOCTYPE explain[<!ELEMENT explain (plan+) ><!ELEMENT plan (table?, cost, qualifier?) ><!ELEMENT table EMPTY ><!ELEMENTcost EMPTY ><!ELEMENT qualifier EMPTY ><!ATTLIST explain version CDATA #REQUIRED ><!ATTLIST plan name CDATA #REQUIRED level CDATA #REQUIRED ><!ATTLIST cost startup CDATA #REQUIRED total CDATA #REQUIRED rowsCDATA #REQUIRED width CDATA #REQUIRED ><!ATTLIST table name CDATA #REQUIRED ><!ATTLIST qualifier typeCDATA #REQUIRED value CDATA #REQUIRED >]> <explain version="8.4devel"><plan name="Seq Scan" level="0"> <table name="tenk1"/> <cost startup="0.00" total="445.00"rows="10000" width="244" /></plan></explain> (32 rows) Greg Smith wrote: > On Thu, 17 Apr 2008, Tom Lane wrote: > >> For debugging the planner work I'm about to do, I'm expecting it will be >> useful to be able to get EXPLAIN to print the targetlist of each plan >> node, not just the quals (conditions) as it's historically done. > > I've heard that some of the academic users of PostgreSQL were hoping > to add features in this area in order to allow better using planner > internals for educational purposes. It would be nice if that were > available for such purposes without having to recompile. > > -- > * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD >