Re: Using multi-row technique with COPY - Mailing list pgsql-hackers
| From | Bruce Momjian | 
|---|---|
| Subject | Re: Using multi-row technique with COPY | 
| Date | |
| Msg-id | 200511292351.jATNprm20699@candle.pha.pa.us Whole thread Raw | 
| In response to | Re: Using multi-row technique with COPY (Simon Riggs <simon@2ndquadrant.com>) | 
| Responses | Re: Using multi-row technique with COPY | 
| List | pgsql-hackers | 
Please let me back up and ask a more simplistic question. I understand the idea of allowing COPY to insert rows with less locking, but I am wondering about the NOLOGGING idea. On commit, we must guarantee that all the rows are in the table, so what advantage is there to a NOLOGGING option? --------------------------------------------------------------------------- Simon Riggs wrote: > On Tue, 2005-11-29 at 14:17 -0500, Bruce Momjian wrote: > > Simon Riggs wrote: > > > As a further enhancement, I would also return to the NOLOGGING option > > > for COPY. Previously we had said that COPY LOCK was the way to go - > > > taking a full table lock to prevent concurrent inserts to a block from a > > > COPY that didn't write WAL and another backend which wanted to write WAL > > > about that block. With the above suggested all-inserts-at-once > > > optimization, it would no longer be a requirement to lock the table. > > > That means we can continue to take advantage of the ability to run > > > multiple COPY loads into the same table. Avoiding writing WAL will > > > further reduce CPU by about 15% and I/O by about 50%. > > > > > > I would also suggest that pgdump be changed to use the NOLOGGING option > > > by default, with an option to work as previously. > > > > For those who have been around, they know I dislike having options that > > 95% of our users desire not be the default behavior. > > I like that approach and it makes for a great product. > > > I think the COPY > > NOLOGGING idea falls in that category. I would like to explore if there > > is a way to have COPY automatically do no logging where possible by > > default. > > see last, below > > > First, I think NOLOGGING is probably the wrong keyword. I am thinking > > SHARE/EXCLUSIVE is best because they are already keywords, and they > > explain the effect of the flag on other applications, rather than the > > LOGGING capability, which is invisible to applications. > > > > I am thinking we would have COPY WITH [ [ EXCLUSIVE | SHARE ] [ LOCK ]] ... > > EXCLUSIVE lock would be NOLOGGING, SHARE would do logging because other > > applications could insert into the table at the same time (and do > > UPDATES/DELETES of the inserted rows). > > I wasn't trying to suggest new wording for COPY LOCK, or whatever it was > called previously. My suggestion was to alter the underlying mechanism > so that it would not actually need to take a lock at all. So I was > trying to find a name that matched that thought.... so your comments > above don't really apply.... but I'm more than happy to have others > suggest names and will go with the flow. > > > One idea for default behavior would be to use EXCLUSIVE when the table > > is zero size. I think that would do pg_dump and most of the user cases, > > and of course users could override the default by using a keyword. We > > could emit a NOTICE if an an exclusive lock is used without an EXCLUSIVE > > keyword. One problem I see is that there is no way to insure zero size > > without a lock that blocks other writers. Is that reliable? > > I think it would be possible to say: if when a COPY starts the table has > zero blocks, then we do a NOLOGGING (change_the_name) mode operation. > > That gives a problem for recovery. If you lose the server and recover > with PITR, that table would come back empty. I'd want to make very sure > that I knew whether it was going to be there, or not, and I want it to > be my choice. So, I agree with the thought from a automatic performance > perspective, but not from a recovery perspective. > > (There is some irony in being the person to both advocate PITR and to > try so hard not to log...) > > I think of PITR as the default mode of operation now; almost all Oracle, > DB2, SQLServer etc users make consistent use of archivelogmode/archive > recovery. > > Best Regards, Simon Riggs > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
pgsql-hackers by date: