Thread: a proposal for an extendable deparser

a proposal for an extendable deparser

From

Dave Gudeman

Date:

26 February 2009, 15:13:55

While writing a shared-library extension for Postgres, I needed to output SQL expressions from trees. The only facility
for doing that seems to be the deparse code in ruleutils.c, which is really intended for outputting rules and
constraints, not for producing general SQL, so it didn't do quite what I wanted. I ended up having to copy the entire
ruleutils.c file and make a few minor changes deep in the file. Of course, copy-and-paste is not a very maintainable
form of code reuse, so I'm not too happy with this solution. What I would like is a generic pretty-printing mechanism
that can be tailored for specific needs. I'm willing to do this work if I think it's likely to be accepted into the main
codeline.

Here is the proposal: get_rule_expr() consists of a switch statement that looks like this:

    switch (nodeTag(node))
    {
        case T_Var:
            (void) get_variable((Var *) node, 0, true, context);
            break;

        case T_Const:
            get_const_expr((Const *) node, context, 0);
            break;

        case T_Param:
            appendStringInfo(buf, "$%d", ((Param *) node)->paramid);
            break;

        case ...

I would replace this with a table-driven deparser:

        deparse_table[nodeTag(node)](node, context);

where deparse_table[] is an array of function pointers containing functions similar to get_variable(),
get_const_expr(), etc. The functions would have to be adapted to a consistent signature using a more generic context
object. To create a modified deparser, you just copy deparse_table[] and replace some of its members with your own
specialized replacements.

The above description is a bit over-simplified. I would probably end up making deparse_table[] into a struct with
various flags and user-defined data in addition to the table of function pointers. Also, it might have more than one
table of function pointers. I think a table of RTE types would be useful, for example, and maybe a table of operators.
It would support pretty-printing entire queries, not just rules, constraints, and fragments.

I'd like to get some feedback on this from the Postgres developers. Is there any interest in this kind of facility?
Would it be likely to be accepted?
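To make the proposal concrete, here is a minimal sketch of the table-driven idea in standalone C. It is not PostgreSQL code: the Node and DeparseContext types, the three-entry tag enum, and every function name below are hypothetical stand-ins chosen for illustration, and the real get_variable() and get_const_expr() do far more than these toy handlers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's node tags; not the real definitions. */
typedef enum { T_Var, T_Const, T_Param, NUM_TAGS } NodeTag;

typedef struct Node {
    NodeTag tag;
    int     value;          /* column number, constant value, or parameter id */
} Node;

/* Hypothetical generic context: only an output buffer in this sketch. */
typedef struct DeparseContext {
    char   buf[128];
    size_t len;
} DeparseContext;

/* The one consistent signature every handler is adapted to. */
typedef void (*DeparseFn)(const Node *node, DeparseContext *ctx);

/* Append a string to the context buffer, truncating if it would overflow. */
static void append(DeparseContext *ctx, const char *s)
{
    size_t room = sizeof(ctx->buf) - ctx->len - 1;
    size_t n = strlen(s);

    if (n > room)
        n = room;
    memcpy(ctx->buf + ctx->len, s, n);
    ctx->len += n;
    ctx->buf[ctx->len] = '\0';
}

static void deparse_var(const Node *node, DeparseContext *ctx)
{
    char tmp[32];

    snprintf(tmp, sizeof tmp, "col%d", node->value);
    append(ctx, tmp);
}

static void deparse_const(const Node *node, DeparseContext *ctx)
{
    char tmp[32];

    snprintf(tmp, sizeof tmp, "%d", node->value);
    append(ctx, tmp);
}

static void deparse_param(const Node *node, DeparseContext *ctx)
{
    char tmp[32];

    snprintf(tmp, sizeof tmp, "$%d", node->value);
    append(ctx, tmp);
}

/* Default table: one handler per node tag, indexed by the tag itself. */
const DeparseFn default_table[NUM_TAGS] = {
    [T_Var]   = deparse_var,
    [T_Const] = deparse_const,
    [T_Param] = deparse_param,
};

/* The dispatch that replaces the switch statement. */
void deparse(const DeparseFn *table, const Node *node, DeparseContext *ctx)
{
    table[node->tag](node, ctx);
}

/* A specialized handler: print parameters as '?' instead of '$n'. */
static void deparse_param_qmark(const Node *node, DeparseContext *ctx)
{
    (void) node;
    append(ctx, "?");
}

/* Build a modified deparser: copy the default table, replace one member. */
void make_custom_table(DeparseFn *out)
{
    memcpy(out, default_table, sizeof default_table);
    out[T_Param] = deparse_param_qmark;
}
```

A caller would zero-initialize a DeparseContext, pass either default_table or a table filled in by make_custom_table() to deparse(), and read the result from ctx.buf. Only the entry for T_Param differs between the two tables; every other node type keeps its default handler.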

Re: a proposal for an extendable deparser

From

Tom Lane

Date:

26 February 2009, 15:54:41

Dave Gudeman <dave.gudeman@gmail.com> writes:
> I would replace this with a table-driven deparser:
>         deparse_table[nodeTag(node)](node, context);

I don't actually see what this is going to buy for you.  You didn't
say exactly why ruleutils doesn't work for you, but reading between
the lines suggests that you want to add new node types.  There are
a *ton* of places that need to change for that, typically, and this
isn't going to fix it.

I've occasionally speculated about the possible value of switching
over to method-table-based node types (or maybe just biting the bullet
and going to C++) but it never really looked like it would be worth
the trouble.
        regards, tom lane

Re: a proposal for an extendable deparser

From

Dave Gudeman

Date:

27 February 2009, 03:53:36

I don't need to add new node types or add any syntax; it is the output that I'm concerned with. What I want is a way to
print a tree according to some pretty strict rules. For example, I want a special syntax for function RTEs and I don't
want the v::type notation to be output (the flag to turn it off doesn't do what I want).

There are lots of uses for specialized pretty-printing. Sometimes you have a simplified syntax reader that can't handle
the fully general syntax. For example, you might write an extension in Perl that needs to understand the parse trees.
One way to make this work is to print out a simplified syntax from the parse tree and then reparse it in Perl. Another
use is for general pretty-printing. For example, I modified ruleutils.c to let me print a nice representation of the SQL
statement after all of the source-to-source transformations but before the planning. This was a big help in debugging
the source-to-source transformations I was working on.

As a general rule, you want the list of node types to appear in as few places as possible in order to increase the
maintainability of the code. One way to do that is with generic walkers like the ones in PostgreSQL (a nice solution, by
the way). This works well in the case where only a small number of node types need special consideration. The case of
printing is different, though. In printing, there is a default thing to do with each node type, but it is something
different for each type. You want to do the default thing for most of the nodes, but something special for a few types.
The best way I know to abstract that sort of process is with a table-driven walker.

As for future plans, if you ever get serious about making a big change in the parsing and tree-manipulating code then
you might want to look into some of the open-source attribute-grammar and tree-transformation software rather than C++.
Those tools are specialized for that kind of work while C++ has some weaknesses in that area. I think some of these
tools have BSD-style licenses. The downside is that they require the maintainers to know yet another language. The
upside is that they let you work at a higher level of abstraction. And most of them come with built-in pretty
printers :-).

On Thu, Feb 26, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dave Gudeman <dave.gudeman@gmail.com> writes:
> > I would replace this with a table-driven deparser:
> >         deparse_table[nodeTag(node)](node, context);
>
> I don't actually see what this is going to buy for you.  You didn't
> say exactly why ruleutils doesn't work for you, but reading between
> the lines suggests that you want to add new node types.  There are
> a *ton* of places that need to change for that, typically, and this
> isn't going to fix it.
>
> I've occasionally speculated about the possible value of switching
> over to method-table-based node types (or maybe just biting the bullet
> and going to C++) but it never really looked like it would be worth
> the trouble.
>
>                         regards, tom lane
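The "default for most node types, special for a few" pattern can be sketched as a recursive table-driven walker. The code below is a standalone illustration under assumed names, not ruleutils.c: the Node layout, the Deparser struct, and all handler names are hypothetical. It overrides only the cast handler so that the ::type suffix is dropped, while constants and operators keep their default output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical node tags for a tiny expression tree. */
typedef enum { T_Const, T_OpExpr, T_Cast, NUM_TAGS } NodeTag;

typedef struct Node {
    NodeTag            tag;
    const char        *text;    /* literal text, operator symbol, or type name */
    const struct Node *left;    /* first operand, or the cast's argument */
    const struct Node *right;   /* second operand of an operator, else NULL */
} Node;

typedef struct Deparser Deparser;
typedef void (*DeparseFn)(const Deparser *dp, const Node *node,
                          char *out, size_t outsz);

/* Hypothetical deparser object: one handler per node tag. */
struct Deparser {
    DeparseFn handlers[NUM_TAGS];
};

/* Append to a NUL-terminated buffer, truncating if it would overflow. */
static void append(char *out, size_t outsz, const char *s)
{
    strncat(out, s, outsz - strlen(out) - 1);
}

/* Handlers recurse through the table, so overrides apply at any depth. */
static void walk(const Deparser *dp, const Node *node, char *out, size_t outsz)
{
    dp->handlers[node->tag](dp, node, out, outsz);
}

static void default_const(const Deparser *dp, const Node *n,
                          char *out, size_t outsz)
{
    (void) dp;
    append(out, outsz, n->text);
}

static void default_op(const Deparser *dp, const Node *n,
                       char *out, size_t outsz)
{
    append(out, outsz, "(");
    walk(dp, n->left, out, outsz);
    append(out, outsz, " ");
    append(out, outsz, n->text);
    append(out, outsz, " ");
    walk(dp, n->right, out, outsz);
    append(out, outsz, ")");
}

static void default_cast(const Deparser *dp, const Node *n,
                         char *out, size_t outsz)
{
    walk(dp, n->left, out, outsz);
    append(out, outsz, "::");
    append(out, outsz, n->text);
}

/* Specialized handler: print the argument and drop the ::type suffix. */
static void plain_cast(const Deparser *dp, const Node *n,
                       char *out, size_t outsz)
{
    walk(dp, n->left, out, outsz);
}

void init_default_deparser(Deparser *dp)
{
    dp->handlers[T_Const]  = default_const;
    dp->handlers[T_OpExpr] = default_op;
    dp->handlers[T_Cast]   = default_cast;
}

/* Start from the defaults and replace the one handler that differs. */
void init_no_cast_deparser(Deparser *dp)
{
    init_default_deparser(dp);
    dp->handlers[T_Cast] = plain_cast;
}

void deparse_expr(const Deparser *dp, const Node *node, char *out, size_t outsz)
{
    out[0] = '\0';
    walk(dp, node, out, outsz);
}
```

For a tree representing a cast of '42' to integer added to 1, the default deparser produces ('42'::integer + 1) and the modified one produces ('42' + 1). Because each handler recurses through the deparser it was given, the cast override takes effect even when the cast is nested inside an operator that still uses its default handler.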

Re: a proposal for an extendable deparser

From

Heikki Linnakangas

Date:

02 March 2009, 04:37:21

Dave Gudeman wrote:
> I don't need to add new node types or add any syntax; it is the output that
> I'm concerned with. What I want is a way to print a tree according to some
> pretty strict rules. For example, I want a special syntax for function RTEs
> and I don't want the v::type notation to be output (the flag to turn it off
> doesn't do what I want).

This will become useful for SQL/MED connectors to other databases. Other 
DBMSs have slightly different syntax, and with something like this you 
could still use ruleutils.c for the deparsing, but tweak it slightly for 
the target database.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: a proposal for an extendable deparser

From

Tom Lane

Date:

02 March 2009, 11:54:36

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Dave Gudeman wrote:
>> I don't need to add new node types or add any syntax; it is the output that
>> I'm concerned with. What I want is a way to print a tree according to some
>> pretty strict rules. For example, I want a special syntax for function RTEs
>> and I don't want the v::type notation to be output (the flag to turn it off
>> doesn't do what I want).

> This will become useful for SQL/MED connectors to other databases. Other 
> DBMSs have slightly different syntax, and with something like this you 
> could still use ruleutils.c for the deparsing, but tweak it slightly for 
> the target database.

That all sounds like pie in the sky to me.  It's unlikely that you could
produce any specified syntax with just minor changes to the dumping of a
node type or two --- the node structure is specific to Postgres' view of
the world and won't necessarily be amenable to producing someone else's
syntax.

On the whole, "copy and paste ruleutils" seems like a sufficient answer
to me.  Maybe when we have a couple of examples of people having to do
that, we can figure out an abstraction that solves the problem better;
but I have no confidence that the mechanism Dave proposes will help
or will be worth the trouble to implement.

An even more likely answer is "patch ruleutils so it has an extra flag
that does what you want".  We might or might not be willing to take such
a patch back into core, but it sure seems like a lot less work.
        regards, tom lane
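For comparison, the "extra flag" alternative might look like the following standalone sketch. It keeps the switch statement and adds one boolean to the context. The omit_casts field, the get_expr() function, and the Node layout are all hypothetical and are not part of ruleutils.c.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node tags for a tiny expression tree. */
typedef enum { T_Const, T_Cast } NodeTag;

typedef struct Node {
    NodeTag            tag;
    const char        *text;   /* literal text or type name */
    const struct Node *arg;    /* operand of a cast, else NULL */
} Node;

typedef struct DeparseContext {
    char buf[128];
    bool omit_casts;           /* hypothetical extra flag */
} DeparseContext;

/* Append to the context buffer, truncating if it would overflow. */
static void append(DeparseContext *ctx, const char *s)
{
    strncat(ctx->buf, s, sizeof(ctx->buf) - strlen(ctx->buf) - 1);
}

/* Switch-based deparser: the flag is tested inside the affected case. */
void get_expr(const Node *node, DeparseContext *ctx)
{
    switch (node->tag)
    {
        case T_Const:
            append(ctx, node->text);
            break;

        case T_Cast:
            get_expr(node->arg, ctx);
            if (!ctx->omit_casts)
            {
                append(ctx, "::");
                append(ctx, node->text);
            }
            break;
    }
}
```

The trade-off is visible even at this size: the flag costs a single branch in one case, but every new variation in output needs another flag and another edit to the shared function, whereas a table-driven deparser leaves the shared code untouched.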