Thread: a proposal for an extendable deparser
While writing a shared-library extension for Postgres, I needed to output SQL expressions from trees. The only facility for doing that seems to be the deparse code in ruleutils.c, which is really intended for outputting rules and constraints, not for producing general SQL, so it didn't do quite what I wanted. I ended up copying the entire ruleutils.c file and making a few minor changes deep in the file. Of course, copy-and-paste is not a very maintainable form of code reuse, so I'm not too happy with this solution. What I would like is a generic pretty-printing mechanism that can be tailored for specific needs. I'm willing to do this work if I think it's likely to be accepted into the main code line.

Here is the proposal: get_rule_expr() consists of a switch statement that looks like this:

    switch (nodeTag(node))
    {
        case T_Var:
            (void) get_variable((Var *) node, 0, true, context);
            break;

        case T_Const:
            get_const_expr((Const *) node, context, 0);
            break;

        case T_Param:
            appendStringInfo(buf, "$%d", ((Param *) node)->paramid);
            break;

        case ...

I would replace this with a table-driven deparser:

    deparse_table[nodeTag(node)](node, context);

where deparse_table[] is an array of function pointers containing functions similar to get_variable(), get_const_expr(), etc. The functions would have to be adapted to a consistent signature using a more generic context object. To create a modified deparser, you just copy deparse_table[] and replace some of its members with your own specialized replacements.

The above description is a bit over-simplified. I would probably end up making deparse_table[] into a struct with various flags and user-defined data in addition to the table of function pointers. Also, it might have more than one table of function pointers. I think a table of RTE types would be useful, for example, and maybe a table of operators. It would support pretty-printing entire queries, not just rules, constraints, and fragments.

I'd like to get some feedback on this from the Postgres developers. Is there any interest in this kind of facility? Would it be likely to be accepted?
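The table-driven dispatch proposed above could be sketched roughly as follows. This is a minimal, self-contained illustration and not PostgreSQL code: the three-member NodeTag enum, the Node and DeparseContext structs, and every function name here (emit, deparse_var, make_custom_table, and so on) are hypothetical stand-ins. Only the idea of indexing a table of function pointers by node tag comes from the proposal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's node types; not the real definitions. */
typedef enum { T_Var, T_Const, T_Param, NODE_TAG_COUNT } NodeTag;

typedef struct Node {
    NodeTag tag;
    const char *name;   /* used by T_Var */
    int value;          /* constant value for T_Const, parameter id for T_Param */
} Node;

/* A generic context object: here just a fixed-size output buffer. */
typedef struct DeparseContext {
    char buf[128];
    size_t len;
} DeparseContext;

/* The consistent signature every table entry has to follow. */
typedef void (*deparse_fn)(const Node *node, DeparseContext *context);

/* Append text to the context buffer, keeping it NUL-terminated. */
void emit(DeparseContext *context, const char *text)
{
    size_t n = strlen(text);
    assert(context->len + n < sizeof(context->buf));
    memcpy(context->buf + context->len, text, n + 1);
    context->len += n;
}

void deparse_var(const Node *node, DeparseContext *context)
{
    emit(context, node->name);
}

void deparse_const(const Node *node, DeparseContext *context)
{
    char tmp[32];
    snprintf(tmp, sizeof(tmp), "%d", node->value);
    emit(context, tmp);
}

void deparse_param(const Node *node, DeparseContext *context)
{
    char tmp[32];
    snprintf(tmp, sizeof(tmp), "$%d", node->value);
    emit(context, tmp);
}

/* Default table, indexed by node tag. */
const deparse_fn default_deparse_table[NODE_TAG_COUNT] = {
    [T_Var] = deparse_var,
    [T_Const] = deparse_const,
    [T_Param] = deparse_param,
};

/* A specialized replacement: print parameters as ":pN" instead of "$N". */
void deparse_param_named(const Node *node, DeparseContext *context)
{
    char tmp[32];
    snprintf(tmp, sizeof(tmp), ":p%d", node->value);
    emit(context, tmp);
}

/* Build a modified deparser: copy the default table, replace one member. */
void make_custom_table(deparse_fn table[NODE_TAG_COUNT])
{
    memcpy(table, default_deparse_table, sizeof(default_deparse_table));
    table[T_Param] = deparse_param_named;
}

/* Dispatch through the table; this takes the place of the switch. */
void deparse_node(const deparse_fn *table, const Node *node,
                  DeparseContext *context)
{
    assert(node->tag < NODE_TAG_COUNT);
    table[node->tag](node, context);
}
```

With the default table, a T_Param node with id 3 comes out as "$3"; with the table filled in by make_custom_table() the same node comes out as ":p3", while T_Var and T_Const nodes still go through the default functions.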
Dave Gudeman <dave.gudeman@gmail.com> writes:
> I would replace this with a table-driven deparser:
> deparse_table[nodeTag(node)](node, context);

I don't actually see what this is going to buy for you. You didn't say exactly why ruleutils doesn't work for you, but reading between the lines suggests that you want to add new node types. There are a *ton* of places that need to change for that, typically, and this isn't going to fix it.

I've occasionally speculated about the possible value of switching over to method-table-based node types (or maybe just biting the bullet and going to C++) but it never really looked like it would be worth the trouble.

regards, tom lane
I don't need to add new node types or add any syntax; it is the output that I'm concerned with. What I want is a way to print a tree according to some pretty strict rules. For example, I want a special syntax for function RTEs and I don't want the v::type notation to be output (the flag to turn it off doesn't do what I want).

There are lots of uses for specialized pretty-printing. Sometimes you have a simplified syntax reader that can't handle the fully general syntax. For example, you might write an extension in Perl that needs to understand the parse trees. One way to make this work is to print out a simplified syntax from the parse tree and then reparse it in Perl. Another use is general pretty-printing. For example, I modified ruleutils.c to let me print a nice representation of the SQL statement after all of the source-to-source transformations but before the planning. This was a big help in debugging the source-to-source transformations I was working on.

As a general rule, you want the list of node types to appear in as few places as possible in order to increase the maintainability of the code. One way to do that is with generic walkers like the ones in PostgreSQL (a nice solution, by the way). This works well in the case where only a small number of node types need special consideration. The case of printing is different, though. In printing, there is a default thing to do with each node type, but something different for each type. You want to do the default thing for most of the nodes, but something special for a few types. The best way I know to abstract that sort of process is with a table-driven walker.

As for future plans, if you ever get serious about making a big change in the parsing and tree-manipulating code, then you might want to look into some of the open-source attribute-grammar and tree-transformation software rather than C++. Those tools are specialized for that kind of work, while C++ has some weaknesses in that area. I think some of these tools have BSD-style licenses. The downside is that they require the maintainers to know yet another language. The upside is that they let you work at a higher level of abstraction. And most of them come with built-in pretty printers :-).

On Thu, Feb 26, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dave Gudeman <dave.gudeman@gmail.com> writes:
>> I would replace this with a table-driven deparser:
>> deparse_table[nodeTag(node)](node, context);
>
> I don't actually see what this is going to buy for you. You didn't
> say exactly why ruleutils doesn't work for you, but reading between
> the lines suggests that you want to add new node types. There are
> a *ton* of places that need to change for that, typically, and this
> isn't going to fix it.
>
> I've occasionally speculated about the possible value of switching
> over to method-table-based node types (or maybe just biting the bullet
> and going to C++) but it never really looked like it would be worth
> the trouble.
>
> regards, tom lane
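The "default thing for most nodes, something special for a few" pattern, combined with the earlier idea of bundling the table with user-defined data, might look something like the sketch below. It uses the v::type example from this message: the default cast printer emits operand::type, and a replacement emits only the operand. Everything here is an assumption made for illustration (the NodeKind enum, the Expr and Deparser structs, and all function names); none of it is taken from ruleutils.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node kinds for illustration; not PostgreSQL's node tags. */
typedef enum { N_COLUMN, N_CAST, N_KIND_COUNT } NodeKind;

typedef struct Expr {
    NodeKind kind;
    const char *text;        /* column name, or target type name for a cast */
    const struct Expr *arg;  /* operand of a cast, NULL otherwise */
} Expr;

struct Deparser;
typedef void (*print_fn)(const struct Deparser *dp, const Expr *expr,
                         char *out, size_t size);

/* The table of function pointers bundled with user-defined data. */
typedef struct Deparser {
    print_fn table[N_KIND_COUNT];
    void *user_data;
} Deparser;

/* Append text to a NUL-terminated buffer of the given size. */
void append(char *out, size_t size, const char *text)
{
    size_t used = strlen(out);
    assert(used + strlen(text) < size);
    strcpy(out + used, text);
}

/* Table-driven walker: children are printed through the same table. */
void print_expr(const Deparser *dp, const Expr *expr, char *out, size_t size)
{
    assert(expr->kind < N_KIND_COUNT);
    dp->table[expr->kind](dp, expr, out, size);
}

void print_column(const Deparser *dp, const Expr *expr, char *out, size_t size)
{
    (void) dp;
    append(out, size, expr->text);
}

/* Default behaviour for a cast: operand::type. */
void print_cast(const Deparser *dp, const Expr *expr, char *out, size_t size)
{
    print_expr(dp, expr->arg, out, size);
    append(out, size, "::");
    append(out, size, expr->text);
}

/* Special behaviour: print only the operand, and count the suppressed
 * cast in the int that user_data points to (if any). */
void print_cast_bare(const Deparser *dp, const Expr *expr, char *out, size_t size)
{
    if (dp->user_data != NULL)
        ++*(int *) dp->user_data;
    print_expr(dp, expr->arg, out, size);
}

const Deparser default_deparser = {
    { [N_COLUMN] = print_column, [N_CAST] = print_cast },
    NULL
};

/* Copy the default deparser and override only the cast entry. */
Deparser make_castless_deparser(int *suppressed_counter)
{
    Deparser dp = default_deparser;
    dp.table[N_CAST] = print_cast_bare;
    dp.user_data = suppressed_counter;
    return dp;
}
```

Because the override recurses through print_expr() with the same Deparser, every other node kind keeps its default printer without the list of node types being repeated anywhere.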
Dave Gudeman wrote:
> I don't need to add new node types or add any syntax; it is the output that
> I'm concerned with. What I want is a way to print a tree according to some
> pretty strict rules. For example, I want a special syntax for function RTEs
> and I don't want the v::type notation to be output (the flag to turn it off
> doesn't do what I want).

This will become useful for SQL/MED connectors to other databases. Other DBMSs have slightly different syntax, and with something like this you could still use ruleutils.c for the deparsing, but tweak it slightly for the target database.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Dave Gudeman wrote:
>> I don't need to add new node types or add any syntax; it is the output that
>> I'm concerned with. What I want is a way to print a tree according to some
>> pretty strict rules. For example, I want a special syntax for function RTEs
>> and I don't want the v::type notation to be output (the flag to turn it off
>> doesn't do what I want).

> This will become useful for SQL/MED connectors to other databases. Other
> DBMSs have slightly different syntax, and with something like this you
> could still use ruleutils.c for the deparsing, but tweak it slightly for
> the target database.

That all sounds like pie in the sky to me. It's unlikely that you could produce any specified syntax with just minor changes to the dumping of a node type or two --- the node structure is specific to Postgres' view of the world and won't necessarily be amenable to producing someone else's syntax.

On the whole, "copy and paste ruleutils" seems like a sufficient answer to me. Maybe when we have a couple of examples of people having to do that, we can figure out an abstraction that solves the problem better; but I have no confidence that the mechanism Dave proposes will help or will be worth the trouble to implement.

An even more likely answer is "patch ruleutils so it has an extra flag that does what you want". We might or might not be willing to take such a patch back into core, but it sure seems like a lot less work.

regards, tom lane
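For comparison, the "extra flag" alternative might be sketched as below: the printing code stays in one place and checks a bit in its context. The flag name DEPARSE_OMIT_CASTS, the FlagContext struct, and print_cast_expr() are all invented for this illustration; they are not existing ruleutils.c flags or functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bit, invented for this sketch. */
#define DEPARSE_OMIT_CASTS 0x01

typedef struct FlagContext {
    int flags;
} FlagContext;

/* Write "value::type" into out, or just "value" when the flag is set.
 * Returns the result of snprintf (the length that was needed). */
int print_cast_expr(const FlagContext *context, const char *value,
                    const char *type_name, char *out, size_t size)
{
    assert(out != NULL && size > 0);
    if (context->flags & DEPARSE_OMIT_CASTS)
        return snprintf(out, size, "%s", value);
    return snprintf(out, size, "%s::%s", value, type_name);
}
```

Each new output variant in this style adds another flag test inside the shared printing code, which is the trade-off against the table-driven proposal discussed earlier in the thread.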