Thread: plperl and/or insert trigger problem
I'm writing a function that fetches data in an Oracle database and stores it in postgresql database.
The fetching from Oracle and the inserting in PostgreSQL both work correctly. I know this because with an empty target table and without an insert trigger the source data gets 'copied' perfectly to the target table.
Of course the target won't always be empty. So the unique index might cause some inserts to fail. So I wanted to put some error handling in my function. To test what kind of information I would get I added an insert trigger to the target table. This trigger raises to different errors based on the data inserted: if dataareaid is 'lil' an error is raised, if dataareaid = 'bol' another error is raised, other values for dataareaid don't raise an error.
So the plperl function should receive these error messages and handle them appropriatly. For this test the function reports them just as info to the screen.
So the plperl function should receive these error messages and handle them appropriatly. For this test the function reports them just as info to the screen.
When I 'manually' (without using the function and without interference of Oracle) insert a record that should trigger the error raising, the correct error is raised. When I use the function (see below) but change the query that fetches the Oracle data so that only one type of error is triggered (eg "SELECT * FROM AddressFormatHeading WHERE dataareaid = 'lil'" or "SELECT * FROM AddressFormatHeading WHERE dataareaid = 'bol'") the correct error message is passed from the trigger to my function.
However when both error triggering dataareaids exist in the set of data fetched from Oracle only one type of error message seems to get passed from the trigger to my function. Apparently it is the one that is caused first. Obviously that is not what I expect. This is what I get:
executing 14 generated 4 errors
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
And this is what I expect to get:
executing 14 generated 4 errors
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address France
ERROR: bol nog een foutje Italie
ERROR: bol nog een foutje Beglie
This is the target table definition in PostgreSQL
CREATE TABLE "public"."afh_test" (
"addrformat" VARCHAR(10) NOT NULL,
"name" VARCHAR(30) NOT NULL,
"dataareaid" VARCHAR(3) NOT NULL,
"recid" NUMERIC(10,0) NOT NULL
) WITHOUT OIDS;
CREATE UNIQUE INDEX "afh_test_idx" ON "public"."afh_test"
USING btree ("addrformat", "dataareaid");
However when both error triggering dataareaids exist in the set of data fetched from Oracle only one type of error message seems to get passed from the trigger to my function. Apparently it is the one that is caused first. Obviously that is not what I expect. This is what I get:
executing 14 generated 4 errors
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address Belgium
And this is what I expect to get:
executing 14 generated 4 errors
ERROR: lil foutje Address Belgium
ERROR: lil foutje Address France
ERROR: bol nog een foutje Italie
ERROR: bol nog een foutje Beglie
This is the target table definition in PostgreSQL
CREATE TABLE "public"."afh_test" (
"addrformat" VARCHAR(10) NOT NULL,
"name" VARCHAR(30) NOT NULL,
"dataareaid" VARCHAR(3) NOT NULL,
"recid" NUMERIC(10,0) NOT NULL
) WITHOUT OIDS;
CREATE UNIQUE INDEX "afh_test_idx" ON "public"."afh_test"
USING btree ("addrformat", "dataareaid");
The source table definition is the same as the target table definition, though the syntax differs slightly:
CREATE TABLE ADDRESSFORMATHEADING (
ADDRFORMAT VARCHAR2(10 BYTE) DEFAULT '.' NOT NULL,
NAME VARCHAR2(30 BYTE) DEFAULT '.' NOT NULL,
DATAAREAID VARCHAR2(3 BYTE) DEFAULT 'dat' NOT NULL,
RECID NUMBER(10) NOT NULL
)
CREATE TABLE ADDRESSFORMATHEADING (
ADDRFORMAT VARCHAR2(10 BYTE) DEFAULT '.' NOT NULL,
NAME VARCHAR2(30 BYTE) DEFAULT '.' NOT NULL,
DATAAREAID VARCHAR2(3 BYTE) DEFAULT 'dat' NOT NULL,
RECID NUMBER(10) NOT NULL
)
This is the data in Oracle:
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'This country', 'ash', 30);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('be', 'Address Belgium', 'lil', 501);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('fr', 'Address France', 'lil', 496);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('it', 'Italie', 'bol', 3138);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', '012', 687181679);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International countries', 'ash', 29);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('be', 'Beglie', 'bol', 3187);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('sp', 'Address Spain', 'bar', 1302174);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International countries', 'as0', 29);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'This country', 'as0', 30);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', '011', 216774985);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International', '011', 216774984);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', 'hlm', 451094066);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International', 'hlm', 451094067);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'This country', 'ash', 30);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('be', 'Address Belgium', 'lil', 501);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('fr', 'Address France', 'lil', 496);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('it', 'Italie', 'bol', 3138);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', '012', 687181679);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International countries', 'ash', 29);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('be', 'Beglie', 'bol', 3187);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('sp', 'Address Spain', 'bar', 1302174);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International countries', 'as0', 29);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'This country', 'as0', 30);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', '011', 216774985);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International', '011', 216774984);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('national', 'National', 'hlm', 451094066);
Insert into addressformatheading (ADDRFORMAT, NAME, DATAAREAID, RECID) Values ('internatio', 'International', 'hlm', 451094067);
This is the trigger I added to the target table. Like it's written here records with a dataareaid that doesn't trigger an error is not inserted. By using RETURN NEW; in the trigger they do get inserted, but that doesn't change the fact that I only get one type of error message instead of two.
CREATE TRIGGER "afh_test_tr" BEFORE INSERT
ON "public"."afh_test" FOR EACH ROW
EXECUTE PROCEDURE "public"."temp_func1"();
CREATE OR REPLACE FUNCTION "public"."temp_func1" () RETURNS trigger AS
$body$
BEGIN
IF NEW.dataareaid = 'lil' THEN
RAISE EXCEPTION '% foutje %', NEW.dataareaid, NEW.name;
elsIF NEW.dataareaid = 'bol' THEN
RAISE EXCEPTION '% nog een foutje %', NEW.dataareaid, NEW.name;
END IF;
RETURN NULL;
END;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
This is the function that retrieves the Oracle data and inserts it in the target table
CREATE OR REPLACE FUNCTION "public"."dbi_insert3" () RETURNS integer AS
$body$
use DBI;
$query = 'SELECT * FROM AddressFormatHeading';
$target = 'INSERT INTO afh_test (addrformat, name, dataareaid, recid) VALUES (?,?,?,?)';
my $dbh_ora = DBI->connect('dbi:Oracle:database=bmssa;host=firev120-1.indicator.be;sid=mars', 'bmssa', '8QD6ibmD')
or die "Couldn't connect to database: " . DBI->errstr;
my $dbh_pg = DBI->connect('dbi:Pg:dbname=defrevdev;host=10.100.1.21;port=2345', 'defrevsys', 'Y2I6vbEW')
or die "Couldn't connect to database: " . DBI->errstr;
my $sel = $dbh_ora->prepare($query)
or elog(ERROR, "Couldn't prepare statement: " . $dbh_ora->errstr);
$sel->execute;
my $ins = $dbh_pg->prepare($target)
or elog(ERROR, "Couldn't prepare statement: " . $dbh_pg->errstr);
my $fetch_tuple_sub = sub { $sel->fetchrow_arrayref };
my @tuple_status;
my $rc = $ins->execute_for_fetch($fetch_tuple_sub, \@tuple_status);
if (DBI->err) {
elog(INFO, DBI->errstr."\n");
my @errors = grep { ref $_ } @tuple_status;
foreach my $error (@errors) {
elog(INFO, $error->[1]);
}
}
$dbh_ora->disconnect;
$dbh_pg->disconnect;
$body$
LANGUAGE 'plperlu' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
And this ... well you can guess...
select dbi_insert3();
Does anyone have an idea what's going on? Am I doing something wrong or might this be a bug?
CREATE TRIGGER "afh_test_tr" BEFORE INSERT
ON "public"."afh_test" FOR EACH ROW
EXECUTE PROCEDURE "public"."temp_func1"();
CREATE OR REPLACE FUNCTION "public"."temp_func1" () RETURNS trigger AS
$body$
BEGIN
IF NEW.dataareaid = 'lil' THEN
RAISE EXCEPTION '% foutje %', NEW.dataareaid, NEW.name;
elsIF NEW.dataareaid = 'bol' THEN
RAISE EXCEPTION '% nog een foutje %', NEW.dataareaid, NEW.name;
END IF;
RETURN NULL;
END;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
This is the function that retrieves the Oracle data and inserts it in the target table
CREATE OR REPLACE FUNCTION "public"."dbi_insert3" () RETURNS integer AS
$body$
use DBI;
$query = 'SELECT * FROM AddressFormatHeading';
$target = 'INSERT INTO afh_test (addrformat, name, dataareaid, recid) VALUES (?,?,?,?)';
my $dbh_ora = DBI->connect('dbi:Oracle:database=bmssa;host=firev120-1.indicator.be;sid=mars', 'bmssa', '8QD6ibmD')
or die "Couldn't connect to database: " . DBI->errstr;
my $dbh_pg = DBI->connect('dbi:Pg:dbname=defrevdev;host=10.100.1.21;port=2345', 'defrevsys', 'Y2I6vbEW')
or die "Couldn't connect to database: " . DBI->errstr;
my $sel = $dbh_ora->prepare($query)
or elog(ERROR, "Couldn't prepare statement: " . $dbh_ora->errstr);
$sel->execute;
my $ins = $dbh_pg->prepare($target)
or elog(ERROR, "Couldn't prepare statement: " . $dbh_pg->errstr);
my $fetch_tuple_sub = sub { $sel->fetchrow_arrayref };
my @tuple_status;
my $rc = $ins->execute_for_fetch($fetch_tuple_sub, \@tuple_status);
if (DBI->err) {
elog(INFO, DBI->errstr."\n");
my @errors = grep { ref $_ } @tuple_status;
foreach my $error (@errors) {
elog(INFO, $error->[1]);
}
}
$dbh_ora->disconnect;
$dbh_pg->disconnect;
$body$
LANGUAGE 'plperlu' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
And this ... well you can guess...
select dbi_insert3();
Does anyone have an idea what's going on? Am I doing something wrong or might this be a bug?
Thanks,
Bart
Bart
Bart Degryse wrote: > I'm writing a function that fetches data in an Oracle database and stores it in postgresql database. > > The fetching from Oracle and the inserting in PostgreSQL both work correctly. I know this because with an empty targettable and without an insert trigger the source data gets 'copied' perfectly to the target table. > > Of course the target won't always be empty. So the unique index might cause some inserts to fail. So I wanted to put someerror handling in my function. To test what kind of information I would get I added an insert trigger to the target table.This trigger raises to different errors based on the data inserted: if dataareaid is 'lil' an error is raised, if dataareaid= 'bol' another error is raised, other values for dataareaid don't raise an error. > So the plperl function should receive these error messages and handle them appropriatly. For this test the function reportsthem just as info to the screen. > > When I 'manually' (without using the function and without interference of Oracle) insert a record that should trigger theerror raising, the correct error is raised. When I use the function (see below) but change the query that fetches theOracle data so that only one type of error is triggered (eg "SELECT * FROM AddressFormatHeading WHERE dataareaid = 'lil'"or "SELECT * FROM AddressFormatHeading WHERE dataareaid = 'bol'") the correct error message is passed from the triggerto my function. > However when both error triggering dataareaids exist in the set of data fetched from Oracle only one type of error messageseems to get passed from the trigger to my function. Apparently it is the one that is caused first. Obviously thatis not what I expect. This is what I get: > executing 14 generated 4 errors > ERROR: lil foutje Address Belgium > ERROR: lil foutje Address Belgium > ERROR: lil foutje Address Belgium > ERROR: lil foutje Address Belgium > And this is what I expect to get: > executing 14 generated 4 errors > ERROR: lil foutje Address Belgium > ERROR: lil foutje Address France > ERROR: bol nog een foutje Italie > ERROR: bol nog een foutje Beglie I don't think this is the cause, but you're using DBI->err|errstr - do you not want $dbh_pg->err|errstr? Otherwise, you can't identify different errors on the oracle vs pg connections. Second - it's not just that your function does the inserts in the context of a single transaction, is it? That would mean you're just seeing the original error repeated. -- Richard Huxton Archonet Ltd
Using DBI->err was a leftover from earlier testing. $dbh_pg->err is of course better. But it doesn't solve the problem.
I'm not sure what you mean with your second remark.
The call to my function ( SELECT dbi_insert3(); ) is one transaction I suppose.
According to the documentation on execute_for_fetch (http://search.cpan.org/~timb/DBI-1.48/DBI.pm#execute_for_fetch) however
an execute is done for every fetched record and @tuple_status should contain the error message associated with each failed execute.
>>> Richard Huxton <dev@archonet.com> 2007-06-06 12:19 >>>
I don't think this is the cause, but you're using DBI->err|errstr - do
you not want $dbh_pg->err|errstr? Otherwise, you can't identify
different errors on the oracle vs pg connections.
Second - it's not just that your function does the inserts in the
context of a single transaction, is it? That would mean you're just
seeing the original error repeated.
Bart Degryse wrote: > Using DBI->err was a leftover from earlier testing. $dbh_pg->err is of course better. But it doesn't solve the problem. > > I'm not sure what you mean with your second remark. > The call to my function ( SELECT dbi_insert3(); ) is one transaction I suppose. > According to the documentation on execute_for_fetch (http://search.cpan.org/~timb/DBI-1.48/DBI.pm#execute_for_fetch) however > an execute is done for every fetched record and @tuple_status should contain the error message associated with each failedexecute. I was wondering if there was a hidden BEGIN...COMMIT sneaking into the process somewhere - either from execute_for_fetch() or in the context of using DBI from within plperl. Reading back through, you say that the "good" rows get inserted, so that can't be the case. The only other reasons that spring to mind are: 1. A bug in your looping through tuple-status 2. A bug in execute_for_fetch() filling the tuple-status array. What happens if you elog the whole array (just to get the ref numbers) - that should show whether DBI is filling the array incorrectly. -- Richard Huxton Archonet Ltd
I tend to exclude reason 1: I've dumped the whole array using a debugger and it really contains what I return when looping through it.
As far as I can see it's rather reason 2: execute_for_fetch seems to fill the array incorrectly, that is: it's a valid array, but the last value added to it also seems to overwrite previously added values.
I seem to have found a workaround but my perl knowledge is too limited to evaluate if it's a good one.
When I replace
my $fetch_tuple_sub = sub { $sel->fetchrow_arrayref };
by
my $fetch_tuple_sub = sub {
my $ary_ref = $sel->fetchrow_arrayref;
print "my method: ".$dbh_pg->errstr."\n" if $dbh_pg->err;
return $ary_ref;
};
then the expected exception messages get printed.
Is this a acceptable way to do it in your opinion?
>>> Richard Huxton <dev@archonet.com> 2007-06-06 16:04 >>>
Bart Degryse wrote:
> Using DBI->err was a leftover from earlier testing. $dbh_pg->err is of course better. But it doesn't solve the problem.
>
> I'm not sure what you mean with your second remark.
> The call to my function ( SELECT dbi_insert3(); ) is one transaction I suppose.
> According to the documentation on execute_for_fetch (http://search.cpan.org/~timb/DBI-1.48/DBI.pm#execute_for_fetch) however
> an execute is done for every fetched record and @tuple_status should contain the error message associated with each failed execute.
I was wondering if there was a hidden BEGIN...COMMIT sneaking into the
process somewhere - either from execute_for_fetch() or in the context of
using DBI from within plperl. Reading back through, you say that the
"good" rows get inserted, so that can't be the case.
The only other reasons that spring to mind are:
1. A bug in your looping through tuple-status
2. A bug in execute_for_fetch() filling the tuple-status array.
What happens if you elog the whole array (just to get the ref numbers) -
that should show whether DBI is filling the array incorrectly.
--
Richard Huxton
Archonet Ltd
Bart Degryse wrote: > I tend to exclude reason 1: I've dumped the whole array using a debugger and it really contains what I return when loopingthrough it. > As far as I can see it's rather reason 2: execute_for_fetch seems to fill the array incorrectly, that is: it's a validarray, but the last value added to it also seems to overwrite previously added values. Hmm - see below > > I seem to have found a workaround but my perl knowledge is too limited to evaluate if it's a good one. I'm no guru myself... > When I replace > my $fetch_tuple_sub = sub { $sel->fetchrow_arrayref }; > by > my $fetch_tuple_sub = sub { > my $ary_ref = $sel->fetchrow_arrayref; > print "my method: ".$dbh_pg->errstr."\n" if $dbh_pg->err; > return $ary_ref; > }; > then the expected exception messages get printed. > Is this a acceptable way to do it in your opinion? Looks OK to me, except you'll not get any error message on the last row - the insert will be called after the fetch. I've had a quick look at my copy of DBI.pm (Debian Etch - lives in /usr/lib/perl5/DBI.pm) Around line 1930, we have the error-handling for execute_for_fetch() else { $err_count++; my $err = $sth->err; push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr, $sth->state ]; ... Notice how it's taking a copy of the error code ("my $err = ") but not the error-string? What happens if you change the code to take a copy of sth->errstr too: my ($err,$errstr) = ($sth->err, $sth->errstr); push @$tuple_status, [ $err, $errstr_cache{$err} ||= $errstr, $sth->state]; -- Richard Huxton Archonet Ltd
On Wed, Jun 06, 2007 at 04:44:03PM +0100, Richard Huxton wrote: > I've had a quick look at my copy of DBI.pm (Debian Etch - lives in > /usr/lib/perl5/DBI.pm) > > Around line 1930, we have the error-handling for execute_for_fetch() > > else { > $err_count++; > my $err = $sth->err; > push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr, > $sth->state ]; The reference to erstr_cache seems to infer that the code assumes there can be only one error string for any particular. Looking at the code I can't work out why that variable even exists. I'd say replace that line with: push @$tuple_status, [ $err, $sth->errstr, $sth->state]; And be done with it... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Richard Huxton wrote:
> I'm no guru myself...
Don't underestimate yourself, after all you found the code where it goes wrong
> Looks OK to me, except you'll not get any error message on the last row
> - the insert will be called after the fetch.
I do get an error message on the last row (assuming that it caused an error of course)
> - the insert will be called after the fetch.
I do get an error message on the last row (assuming that it caused an error of course)
There are [number of rows to fetch] + 1 fetches done by execute for fetch
As the POD says:
"The execute_for_fetch() method calls $fetch_tuple_sub ... until it returns a false value"
So I get the error caused by the last record when execute_for_fetch fetches again and
sees that there's nothing more to fetch.
>What happens if you change the code to take a copy of sth->errstr too:
Nothing. Same result. Which made me wonder why they take a copy of the
Nothing. Same result. Which made me wonder why they take a copy of the
error code anyway. So I dropped that replacing
my $err = $sth->err;
push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr,
$sth->state ];
by
push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr,
$sth->state ];
by
push @$tuple_status, [ $sth->err, $errstr_cache{$err} ||= $sth->errstr,
That gave me the same (but still unexpected) result.
That gave me the same (but still unexpected) result.
Martijn van Oosterhout wrote:
> The reference to erstr_cache seems to infer that the code assumes there
> can be only one error string for any particular. Looking at the code I
> can't work out why that variable even exists.
> can be only one error string for any particular. Looking at the code I
> can't work out why that variable even exists.
And he was right! There lies the real problem. It seems to me like a (faulty)
way to try to return each different error message only once. But that wouldn't be
the behaviour as described in the POD.
So finally I replaced
my $err = $sth->err;
push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr,
$sth->state ];
by
push @$tuple_status, [ $err, $errstr_cache{$err} ||= $sth->errstr,
$sth->state ];
by
push @$tuple_status, [ $sth->err, $sth->errstr, $sth->state ];
That gives the result I would expect when reading the POD
Thanks to both of you for solving this problem!
Bart
Bart