Re: machine-readable explain output v4 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: machine-readable explain output v4 |
Date | |
Msg-id | 603c8f070908091519g3f54248bk93e82773ad900432@mail.gmail.com |
In response to | Re: machine-readable explain output v4 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: machine-readable explain output v4 |
List | pgsql-hackers |
On Sun, Aug 9, 2009 at 3:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Revised patch attached. I'm not convinced this is as good as it can
>> be, but I've been looking at this patch for so long that I'm starting
>> to get cross-eyed, and I'd like Tom to at least have a look at this
>> and assess it before we run out of CommitFest.
>
> I'm starting to look at this now. I feel unqualified to opine on the
> quality of the XML/JSON schema design, and given the utter lack of
> documentation of what that design is, I'm not sure that anyone who
> could comment on it has done so. Could we have a spec please?

*scratches head*

You're not the first person to make that request, and I'd like to respond to it well, but I don't really know what to write. Most of the discussion about the XML/JSON output format thus far has been around things like whether we should downcase everything, and even the people offering these comments have mostly labelled them with words to the effect of "I know this is trivial but...".

I think the reason for this is that explain output is fundamentally a tree, and XML and JSON both have ways of representing a tree with properties hanging off the nodes, and this patch uses those ways. I can't figure out what else there is, so I don't know which alternative I would be explaining that I didn't choose.

The one significant representational choice that I'm aware of having made is to use nested tags rather than attributes in the XML format. This seems to me to offer several advantages. First, it's clearly impossible to standardize on attributes, because attributes can only be text, and it seems to me that if we're going to try to output structured data, we want to take that as far as we can, and we have properties (like sort keys) that are lists rather than scalars. Using tags means that they can have substructure when needed.

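To make the list-versus-scalar point concrete, here is a minimal Python sketch, not taken from the patch. The element and attribute names (Plan, Node-Type, Sort-Key, Item) and the sample sort keys are assumptions chosen for illustration only. It stores the same two sort keys once as nested elements and once as a single attribute, then tries to read them back.

```python
import xml.etree.ElementTree as ET


def sort_node_nested(keys):
    """Represent each sort key as its own child element (hypothetical names)."""
    node = ET.Element("Plan")
    ET.SubElement(node, "Node-Type").text = "Sort"
    sort_key = ET.SubElement(node, "Sort-Key")
    for key in keys:
        ET.SubElement(sort_key, "Item").text = key
    return node


def sort_node_attribute(keys):
    """Squeeze the same list into one attribute, which can only hold text."""
    return ET.Element("Plan", {"Node-Type": "Sort", "Sort-Key": ", ".join(keys)})


# Illustrative sort keys; the second one contains commas of its own.
keys = ["lower(name)", "substr(code, 1, 3)"]

nested = sort_node_nested(keys)
flat = sort_node_attribute(keys)

# Nested elements give the list back unchanged.
nested_keys = [item.text for item in nested.find("Sort-Key")]

# The attribute has to be re-split by the consumer, and a naive split
# breaks on the commas inside the second expression.
flat_keys = flat.get("Sort-Key").split(", ")
```

With nested elements the consumer never has to guess where one key ends and the next begins; with an attribute, every consumer would need its own parser for the flattened string.
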
Second, it seems likely to me that people will want to extend explain further in the future: indeed, that was the whole point of the explain-options patch which was already committed. That's pretty simple in the current design - just add a few more calls to ExplainPropertyText or ExplainPropertyList in the appropriate place, and you're done. I'm pretty sure that splitting things up between attributes and nested tags would complicate such modifications.

Peter Eisentraut, in an earlier review of this patch, complained about the format as well, saying something along the lines of "this is trying to be all things to all people". I don't want to dismiss that criticism, but neither can I put my finger on the problem. In an ideal world, we'd like to be all things to all people, but it's usually not possible to achieve that in practice. Still, it's not clear to me what need this wouldn't serve. It's possible to generate the text format from the XML or JSON format, so it should be well-suited to graphical presentation of explain output. It's also possible to grope through the output and, say, find the average cost of all your seqscan nodes, or verify the absence of merge joins, or anything of that sort that someone might think that they want to do.

In a nutshell, the design is "take all the fields we have now and put XML/JSON markup around them so they're easier to get to". Maybe that's not enough of a design, but I don't have any other ideas.

> Also, the JSON code seems a bit messy/poorly factorized. Is there
> a reason for that, or just it's not as mature as the XML code?

I wrote them together, so it's not a question of code maturity, but I wouldn't rule out me being dumb. I'm open to suggestions... AFAICS, the need to comma-separate list and hash elements is most of the problem. I had thought about replacing es->needs_separator with a list so that we could push/pop elements, but I wasn't totally sure whether that was a good idea.

...Robert
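
For readers trying to picture the push/pop idea for es->needs_separator, the following is a rough Python sketch of one way it could work. It is not the patch's C code: the JsonEmitter class and its method names are hypothetical, and only the idea of keeping one "needs separator" flag per open list or hash comes from the message above.

```python
import json


class JsonEmitter:
    """Toy JSON writer (hypothetical) with one separator flag per open container."""

    def __init__(self):
        self.parts = []
        # Stack of flags; the last entry belongs to the innermost open container.
        self.needs_separator = []

    def _before_item(self):
        # Emit a comma only if the current container already holds an item.
        if self.needs_separator:
            if self.needs_separator[-1]:
                self.parts.append(",")
            self.needs_separator[-1] = True

    def open(self, bracket, label=None):
        """Open a list ("[") or hash ("{"), optionally as a labelled property."""
        self._before_item()
        if label is not None:
            self.parts.append(json.dumps(label) + ":")
        self.parts.append(bracket)
        self.needs_separator.append(False)  # push: the new container is empty

    def close(self, bracket):
        self.needs_separator.pop()  # pop: back to the parent's flag
        self.parts.append(bracket)

    def prop(self, label, value):
        """Emit a labelled scalar inside a hash."""
        self._before_item()
        self.parts.append(json.dumps(label) + ":" + json.dumps(value))

    def item(self, value):
        """Emit an unlabelled scalar inside a list."""
        self._before_item()
        self.parts.append(json.dumps(value))

    def result(self):
        return "".join(self.parts)


# Usage: a Sort node with a nested list of keys (sample values are made up).
out = JsonEmitter()
out.open("{")
out.prop("Node Type", "Sort")
out.open("[", "Sort Key")
out.item("a")
out.item("b")
out.close("]")
out.prop("Total Cost", 12.5)
out.close("}")
text = out.result()
```

Because closing a container pops its flag, the parent's flag is still set when the next sibling property is written, so the comma logic lives in one helper instead of being repeated at every call site.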