Thread: dollar quoting and pg_dump
I had a brief look at this today. Basically, I thought of adding a new routine to dumputils.c thus:

    void appendStringLiteralDQ(PQExpBuffer buf, const char *str, const char *dqprefix)

and using it in dumping function bodies and comments on all objects, with a prefix argument of "function" and "comment" respectively. There might be other places where we want to use dollar quoting, but this would be a good start, ISTM.

Basically, this routine would start with $ (+ dqprefix if not null) and then keep adding characters (in turn "_1234567890") until that string was not found in str, then appending "$" and using that as the delimiter.

Thoughts?

cheers

andrew
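The delimiter search described in this message can be sketched in plain C. This is only an illustration under stated assumptions: it writes into a caller-supplied char array rather than a PQExpBuffer, and the names choose_dollar_delimiter and quote_with_dollars are hypothetical, not part of dumputils.c. Following the description, the candidate is checked without its closing "$".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a dollar-quote delimiter for str in out.
 * Starts with "$" plus dqprefix (if not NULL), then appends characters
 * from "_1234567890" in turn until the candidate no longer occurs in str,
 * and finally appends the closing "$".
 * Returns 0 on success, -1 if out is too small.
 */
static int
choose_dollar_delimiter(const char *str, const char *dqprefix,
                        char *out, size_t outsize)
{
    static const char suffixes[] = "_1234567890";
    const size_t nsuffixes = sizeof(suffixes) - 1;
    size_t prefixlen = dqprefix ? strlen(dqprefix) : 0;
    size_t next = 0;
    size_t len = 0;

    assert(str != NULL && out != NULL);

    /* need room for '$', the prefix, the closing '$' and the NUL */
    if (outsize < prefixlen + 3)
        return -1;

    out[len++] = '$';
    if (dqprefix)
    {
        memcpy(out + len, dqprefix, prefixlen);
        len += prefixlen;
    }
    out[len] = '\0';

    /* extend the candidate while it still appears inside the string */
    while (strstr(str, out) != NULL)
    {
        if (len + 3 > outsize)  /* new char + closing '$' + NUL */
            return -1;
        out[len++] = suffixes[next];
        next = (next + 1) % nsuffixes;
        out[len] = '\0';
    }

    out[len++] = '$';
    out[len] = '\0';
    return 0;
}

/* Hypothetical wrapper: emit delimiter, string, delimiter into out. */
static int
quote_with_dollars(const char *str, const char *dqprefix,
                   char *out, size_t outsize)
{
    char delim[64];
    int  n;

    if (choose_dollar_delimiter(str, dqprefix, delim, sizeof(delim)) != 0)
        return -1;
    n = snprintf(out, outsize, "%s%s%s", delim, str, delim);
    return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

The loop always terminates, because the candidate grows by one character per pass and cannot occur in str once it is longer than str.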
Andrew Dunstan <andrew@dunslane.net> writes:
> ... using it in dumping function bodies and comments on all objects,
> with a prefix argument of "function" and "comment" respectively. There
> might be other places where we want to use dollar quoting, but this
> would be a good start, ISTM.

Do we really need to be that verbose? Why not start with the minimal $$ and extend only if needed? On the KISS principle, trying "$$", "$X$", "$XX$", "$XXX$", etc seems sufficient.

For that matter, I'm not convinced we should use $$ for comments. They don't have nearly the problem that functions do with embedded quotes.

A thought: maybe just put this logic into the regular appendStringLiteral routine, and trigger it when the string contains any quotes or backslashes; if it has none, you can just use quotes ...

BTW, I've been holding off making this change myself, realizing that it will completely break backwards compatibility of pg_dump output to 7.4 and earlier. Not sure if anyone is trying to use CVS tip pg_dump with older releases, but it seems possible given that the dump ordering issue is finally solved. Might be a good idea to make it disablable with a fallback to regular quoting.

regards, tom lane
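The two suggestions in this message, the minimal "$$", "$X$", "$XX$" progression and the quotes-or-backslashes trigger, could look roughly like the sketch below. The function names are hypothetical and nothing here is taken from pg_dump itself. One assumption goes beyond what the message says: the candidate is compared without its closing "$", so that a string ending in "$" is never wrapped in "$$".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical trigger: returns 1 if str contains a single quote or a
 * backslash (the cases where plain quoting needs escapes), else 0.
 */
static int
needs_dollar_quoting(const char *str)
{
    assert(str != NULL);
    return strpbrk(str, "'\\") != NULL;
}

/*
 * Hypothetical helper: pick the shortest delimiter of the form
 * "$$", "$X$", "$XX$", ... that is safe for str.
 * Returns 0 on success, -1 if out is too small.
 */
static int
choose_minimal_delimiter(const char *str, char *out, size_t outsize)
{
    size_t len = 0;

    assert(str != NULL && out != NULL);

    /* need room for "$$" and the NUL at minimum */
    if (outsize < 3)
        return -1;

    out[len++] = '$';
    out[len] = '\0';

    /* add one 'X' at a time while the candidate still occurs in str */
    while (strstr(str, out) != NULL)
    {
        if (len + 3 > outsize)  /* new 'X' + closing '$' + NUL */
            return -1;
        out[len++] = 'X';
        out[len] = '\0';
    }

    out[len++] = '$';
    out[len] = '\0';
    return 0;
}
```

A caller would use ordinary quotes when needs_dollar_quoting() returns 0 and otherwise wrap the string in the delimiter from choose_minimal_delimiter().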
Tom Lane wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
>> ... using it in dumping function bodies and comments on all objects,
>> with a prefix argument of "function" and "comment" respectively. There
>> might be other places where we want to use dollar quoting, but this
>> would be a good start, ISTM.
>
> Do we really need to be that verbose? Why not start with the minimal $$
> and extend only if needed? On the KISS principle, trying "$$", "$X$",
> "$XX$", "$XXX$", etc seems sufficient.

It's a matter of taste, I guess. I'm certainly not dogmatic about it. The function design in my head is flexible enough for either.

> For that matter, I'm not convinced we should use $$ for comments. They
> don't have nearly the problem that functions do with embedded quotes.

Well, I keep a master schema file and I like to decorate it with fairly verbose comments, so users can see what the object is for and how it works. I've been caught a few times forgetting to double the quotes inside the comments. But again, maybe you are right, and we should at least start with just the obvious case.

> A thought: maybe just put this logic into the regular
> appendStringLiteral routine, and trigger it when the string contains any
> quotes or backslashes; if it has none, you can just use quotes ...

I did think of fallback, but rejected it on the KISS principle :-) I also prefer consistency in style - I want all my functions dollar quoted even if they don't currently contain characters in need of escape.

> BTW, I've been holding off making this change myself, realizing that it
> will completely break backwards compatibility of pg_dump output to 7.4
> and earlier. Not sure if anyone is trying to use CVS tip pg_dump with
> older releases, but it seems possible given that the dump ordering issue
> is finally solved. Might be a good idea to make it disablable with a
> fallback to regular quoting.

Makes sense. "-X disable-dollar-quoting"? Or we could have it turned off by default and require it to be specifically turned on - that might conform to the principle of least surprise, at least for now.

For that matter, we could also have a "verbose-dollar-quoting" feature, and/or a "dollar-quote-objects=functions,comments,......" feature.

But let's walk before we start to run ;-)

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> A thought: maybe just put this logic into the regular
>> appendStringLiteral routine, and trigger it when the string contains any
>> quotes or backslashes; if it has none, you can just use quotes ...

> I did think of fallback, but rejected it on the KISS principle :-) I
> also prefer consistency in style - I want all my functions dollar quoted
> even if they don't currently contain characters in need of escape.

Good point. Never mind that idea then.

>> Might be a good idea to make it disablable with a
>> fallback to regular quoting.

> Makes sense. "-X disable-dollar-quoting"? Or we could have it turned off
> by default and require it to be specifically turned on - that might
> conform to the principle of least surprise, at least for now.

I don't mind if it's on by default; just thinking that some people might appreciate a way to turn it off. "-X disable-dollar-quoting" sounds fine.

regards, tom lane
> I don't mind if it's on by default; just thinking that some people might
> appreciate a way to turn it off. "-X disable-dollar-quoting" sounds
> fine.

Does it _have_ to be dollars? Other languages call this feature 'heretext' IIRC.

Chris
Christopher Kings-Lynne wrote:

>> I don't mind if it's on by default; just thinking that some people might
>> appreciate a way to turn it off. "-X disable-dollar-quoting" sounds
>> fine.
>
> Does it _have_ to be dollars? Other languages call this feature
> 'heretext' IIRC.

This is not heretext, and calling it such would be quite misleading. Nobody has yet come up with a better name for it.

cheers

andrew
On Wed, 24 Mar 2004, Christopher Kings-Lynne wrote:

> > I don't mind if it's on by default; just thinking that some people might
> > appreciate a way to turn it off. "-X disable-dollar-quoting" sounds
> > fine.
>
> Does it _have_ to be dollars? Other languages call this feature
> 'heretext' IIRC.

No, but I think the rough consensus was that the behavior of this feature is different enough from shell and Perl here documents that it deserved a different name. In particular the $ is part of both the opening and closing token, and newlines aren't relevant at either the beginning or end.

Jon
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Does it _have_ to be dollars? Other languages call this feature
> 'heretext' IIRC.

I'm not in love with the name "dollar quoting" either ... but "here text" would be quite misleading. See the archives for the discussions that led us to develop our definition. The features that go by that name usually have dependencies on formatting (ie newline boundaries) that we specifically rejected.

Right now would be a fine time to invent a better name if you can think of one ...

regards, tom lane