Thread: COPY ... FROM and index usage
Hi all. I'd like to know whether the indexes on a table are updated or not during a "COPY ... FROM" request. That is, should I drop all indexes during a "COPY ... FROM" in order to gain the maximum speed to load data? Thanks. -- Reg me Please
On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote: > Hi all. > > I'd like to know whether the indexes on a table are updated or not during > a "COPY ... FROM" request. > > That is, should I drop all indexes during a "COPY ... FROM" in order to gain > the maximum speed to load data? > > Thanks. Although questions of "which is faster" often depend very heavily on the data involved, the database schema, the hardware, etc., typically people find it best to drop all indexes during a large import and recreate them afterward. - Josh/eggyknap
Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto: > On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote: > > Hi all. > > > > I'd like to know whether the indexes on a table are updated or not during > > a "COPY ... FROM" request. > > > > That is, should I drop all indexes during a "COPY ... FROM" in order to > > gain the maximum speed to load data? > > > > Thanks. > > Although questions of "which is faster" often depend very heavily on > the data involved, the database schema, the hardware, etc., typically > people find it best to drop all indexes during a large import and > recreate them afterward. > > - Josh/eggyknap This sounds very reasonable to me. But the first question remains unanswered: Are the indexes updated during the COPY ... FROM ? Thanks again. -- Reg me Please
On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote: > Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto: >> On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote: >>> Hi all. >>> >>> I'd like to know whether the indexes on a table are updated or >>> not during >>> a "COPY ... FROM" request. >>> >>> That is, should I drop all indexes during a "COPY ... FROM" in >>> order to >>> gain the maximum speed to load data? >>> >>> Thanks. >> >> Although questions of "which is faster" often depend very heavily on >> the data involved, the database schema, the hardware, etc., typically >> people find it best to drop all indexes during a large import and >> recreate them afterward. >> >> - Josh/eggyknap > > This sounds very reasonable to me. > But the first question remains unanswered: > > Are the indexes updated during the COPY ... FROM ? Of course. Why would think that data could be inserted into a table by any means without it updating the table's indexes? That would make the index worthless. Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Il Sunday 04 November 2007 16:21:41 Erik Jones ha scritto: > On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote: > > Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto: > >> On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote: > >>> Hi all. > >>> > >>> I'd like to know whether the indexes on a table are updated or > >>> not during > >>> a "COPY ... FROM" request. > >>> > >>> That is, should I drop all indexes during a "COPY ... FROM" in > >>> order to > >>> gain the maximum speed to load data? > >>> > >>> Thanks. > >> > >> Although questions of "which is faster" often depend very heavily on > >> the data involved, the database schema, the hardware, etc., typically > >> people find it best to drop all indexes during a large import and > >> recreate them afterward. > >> > >> - Josh/eggyknap > > > > This sounds very reasonable to me. > > But the first question remains unanswered: > > > > Are the indexes updated during the COPY ... FROM ? > > Of course. Why would think that data could be inserted into a table > by any means without it updating the table's indexes? That would > make the index worthless. > > Erik Jones > > Software Developer | Emma® > erik@myemma.com > 800.595.4401 or 615.292.5888 > 615.292.0777 (fax) > > Emma helps organizations everywhere communicate & market in style. > Visit us online at http://www.myemma.com I do understand your remarks and would agree. But I was thinking about the COPY...FROM request not as a normal INSERT INTO. If this were the case I could have been running a REINDEX TABLE. -- Reg me Please
"Josh Tolley" <eggyknap@gmail.com> writes: > Although questions of "which is faster" often depend very heavily on > the data involved, the database schema, the hardware, etc., typically > people find it best to drop all indexes during a large import and > recreate them afterward. See also the extensive discussion of this topic at http://www.postgresql.org/docs/8.2/static/populate.html regards, tom lane
Hi, Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit : > That is, should I drop all indexes during a "COPY ... FROM" in order to > gain the maximum speed to load data? When looking for a way to speed up data loading, you may want to consider pgbulkload, a project which optimizes index creation while loading data, and bypass constraints: it's useful when you know you trust input. As I've never used it myself, I can only provides following links: http://pgfoundry.org/projects/pgbulkload http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf Regards, -- dim
Dimitri, thank you for your quoting. I'm a pg_bulkload author. pg_bulkload is optimized especially for appending data to table with indexes. If you use it, you don't need to drop index before loading data. But you have to consider conditions carefully as Dimitri said below. See also pg_bulkload README: http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt Best regards, Dimitri Fontaine wrote: > Hi, > > Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit : >> That is, should I drop all indexes during a "COPY ... FROM" in order to >> gain the maximum speed to load data? > > When looking for a way to speed up data loading, you may want to consider > pgbulkload, a project which optimizes index creation while loading data, and > bypass constraints: it's useful when you know you trust input. > As I've never used it myself, I can only provides following links: > > http://pgfoundry.org/projects/pgbulkload > http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf > > Regards, -- Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp> NTT Open Source Software Center
Hi, Anyone have comparisons/benchmarks to give some idea of the potential performance gains? Say compared to doing the stuff here: http://www.postgresql.org/docs/8.2/static/populate.html Regards, Link. At 09:35 AM 11/5/2007, Toru SHIMOGAKI wrote: >Dimitri, thank you for your quoting. I'm a pg_bulkload author. > >pg_bulkload is optimized especially for appending data to table with indexes. >If you use it, you don't need to drop index >before loading data. But you have to consider >conditions carefully as Dimitri said below. See also pg_bulkload README: > >http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt > >Best regards, > >Dimitri Fontaine wrote: >>Hi, >>Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit : >>>That is, should I drop all indexes during a "COPY ... FROM" in order to >>>gain the maximum speed to load data? >>When looking for a way to speed up data >>loading, you may want to consider pgbulkload, a >>project which optimizes index creation while >>loading data, and bypass constraints: it's >>useful when you know you trust input. >>As I've never used it myself, I can only provides following links: >> http://pgfoundry.org/projects/pgbulkload >> http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf >>Regards, > >-- >Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp> >NTT Open Source Software Center > > >---------------------------(end of broadcast)--------------------------- >TIP 6: explain analyze is your friend