Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION - Mailing list pgsql-hackers
From | Daniel Farina |
---|---|
Subject | Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION |
Date | |
Msg-id | 7b97c5a40912050032g5651c257q49b9cdca6da82e2d@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION (Greg Smith <greg@2ndquadrant.com>) |
Responses |
Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO
FUNCTION
|
List | pgsql-hackers |
On Mon, Nov 30, 2009 at 12:14 PM, Greg Smith <greg@2ndquadrant.com> wrote: > Jeff Davis wrote: > > COPY target FROM FUNCTION foo() WITH RECORDS; > > > In what format would the records be? As a not-quite aside, I think I have a better syntax suggestion. I have recently been digging around in the grammar with the changes made in the following commit: commit a6833fbb85cb5212a9d8085849e7281807f732a6 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon Sep 21 20:10:21 2009 +0000 Define a new, more extensible syntax for COPY options. This is intentionally similar to the recently revised syntax for EXPLAIN options, ie, (name value, ...). The old syntaxis still supported for backwards compatibility, but we intend that any options added in future will be providedonly in the new syntax. Robert Haas, Emmanuel Cecchet As it turns out, the following syntax may work pretty well: COPY y TO FUNCTION (setup_function the_setup_function('some arg', 3, 7, 42)) Basically the func_expr reduction fits very neatly into the copy_generic_opt_elem reduction: copy_generic_opt_elem: ColLabel copy_generic_opt_arg { $$ = (Node *) makeDefElem($1, $2); } | ColLabel func_expr { $$ = (Node*) $2; } ; Now we can use more or less any reasonable number of symbol names and function calls we desire. This makes life considerably easier, I think... We can also try to refactor COPY's internals to take advantage of these features (and potentially reduce the number of mechanisms. For example, the legacy "COPY ... TO '/place' WITH CSV" perhaps can be more verbosely/generically expressed as: COPY ... TO FUNCTION (setup_function to_file('/place'), record_converter csv_formatter, stream_function fwrite end_function fclose); We can also add auxiliary symbols for error handling behavior. For example, were the COPY to fail for some reason maybe it would make sense "on_error" to call "unlink" to clean up the partially finished file. I also have what I think is a somewhat interesting hack. Consider some of the functions up there without arguments (presumably they'd be called with a somewhat fixed contract the mechanics of COPY itself): how does one disambiguate them? Ideally, one could sometimes use literal arguments (when the return value of that function is desired to be threaded through the other specified functions) and other times it'd be nice to disambiguate functions via type names. That would look something like the following: COPY ... TO FUNCTION (setup_function to_file('/place'), record_converter csv_formatter(record), stream_function fwrite(bytea), end_function fclose(text)); I think this is possible to implement without much ambiguity, drawing on the observation that the COPY statement does not have -- and probably will never have -- references via Var(iable) node, unlike normal SQL statements such as SELECT, INSERT, et al. That means we might be able disambiguate using the following rules when scanning the funcexpr's arguments during the semantic analysis phase to figure out what to do: * Only literal list items found: it's a function call with the types of those literals. Ex: my_setup_function('foo'::text,3) * Only non-literal list items found: it's type specifiers. Ex: csv_formatter(record). * Both literal and non-literal values found: report an error. This only works because no cases where a non-literal quantity could be confused with a type name come to mind. If one could name a type "3" and being forced to double-quote "3" to get your type disambiguated was just too ugly, then we are at an impasse. But otherwise I think this may work quite well. Common constellations of functions could perhaps be bound together into a DDL to reduce the amount of symbol soup going on here, but that seems like a pretty clean transition strategy at some later time. Most of the functionality could still be captured with this simple approach for now... Also note that factoring out high-performance implementations of things like csv_formatter (and friends: pg_binary_formatter) will probably take some time, but ultimately I think all the existing functionality could be realized as a layer of syntactic sugar over this mechanism. fdr
pgsql-hackers by date: