implicit abort harmful? - Mailing list pgsql-general
From | Wayne Armstrong |
---|---|
Subject | implicit abort harmful? |
Date | |
Msg-id | 200305250004.h4P04XSY019198@mail.bacchus.com.au Whole thread Raw |
Responses |
Re: implicit abort harmful?
Re: implicit abort harmful? |
List | pgsql-general |
Hi, I have been using postgresql for about 3 months now (after about 15 years of using various other dbms). I love it :) but - (and you just knew that but was coming :) there is one thing that really causes me grief. . It is the tendency for postgres to rollback automatically on some errors. What this leads to is the need for an application to commit much more frequently than is really desirable. Example1. During import of 120 thousand records from an isam file system, 3 say records fail integrity checks ( files in non-database systems tend not to have referential integrity implemented on them except at an application level (meaning not at all :). The desired result is to drop the records failing integrity checks. Importing into db2 or oracle say - I have the option to ignore the referential integrity errors (i get the error code on the insert anyway), and continue with the insert of the 120 thousand - 3 records. In postgres, I either have to commit after every record write, or guarantee the "cleanness" of the data before I begin the import - which is sometimes difficult given the data sources I may be importing from (Often I neither own nor can modify them). Worse, if the intention behind the automatic rollback is to guarantee data purity (from the postgresql manual 10.4.1 - " Turn off autocommit and just do one commit at the end. (In plain SQL, this means issuing BEGIN at the start and COMMIT at the end. Some client libraries may do this behind your back, in which case you need to make sure the library does it when you want it done.) If you allow each insertion to be committed separately, PostgreSQL is doing a lot of work for each record added. An additional benefit of doing all insertions in one transaction is that if the insertion of one record were to fail then the insertion of all records inserted up to that point would be rolled back, so you won't be stuck with partially loaded data." ), It actually makes more of a mess than leaving the decision to rollback under application control. If in this example the three "bad" records are scattered throughout the dataset, I could end up (if I ignore the error as i would do for most other dbms), with a random number of records in the table. etc etc. Of course also, to do robust imports of the insert ifError update style avaiilable in most other dbms I have worked with, it is also nescessary to issue a commit before each insert/update attempt. Example2 - ODBC driver rollbacks :- Here is a snippet from a postgres log :- 2003-05-24 23:09:14 [1525] LOG: duration: 0.134503 sec 2003-05-24 23:09:14 [1525] LOG: query: select nspname from pg_namespace n, p _class c where c.relnamespace=n.oid and c.oid='select'::regclass 2003-05-24 23:09:14 [1525] ERROR: Relation "select" does not exist 2003-05-24 23:09:14 [1525] LOG: statement: select nspname from pg_namespace , pg_class c where c.relnamespace=n.oid and c.oid='select'::regclass 2003-05-24 23:09:14 [1525] LOG: query: ROLLBACK 2003-05-24 23:09:14 [1525] LOG: duration: 0.000538 sec The rollback here is generated by the odbc driver to clear an error created by the odbc driver incorrectly parsing the select from a subselect statement as a table name. The application is totally unaware an error has occured and a rollback has been issued. This is most likely to lead to data loss, unless, basically, an application using odbc is in autocommit mode or commits after every sql statement issued. This concerns me to the piont where I would really recommend not using the parse statement option in the odbc driver (even though that reduces the odbc drivers ability to mimic a prepare which is also problematic :) unless you are using autocommit. For any application complex enough to be working with autocommit turned off in the odbc driver, the auto rollback is a real problem. Please consider making the automatic abort/rollback feature an option ? Regards, Wayne Armstorng Bacchus Management Systems http://www.bacchus.com.au
pgsql-general by date: