Thread: Batch update of indexes on data loading
This is a proposal for fast data loading using batch update of indexes, for 8.4.
It is part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and
I'd like to integrate it so that it cooperates with other parts of postgres.

The basic concept is to spool incoming data and merge the spool with the
existing indexes into a new index at the end of data loading. It is
5-10 times faster than the per-row index insertion used in 8.3.

One of the problems is locking: index building in bulkload is closer to
REINDEX than to INSERT, so we need an ACCESS EXCLUSIVE lock during it.
Bulkloading is not an upward-compatible method, so I'm thinking about
adding a new "WITH LOCK" option to the COPY command:

    COPY tbl FROM 'datafile' WITH LOCK;

If the LOCK option is specified, the behavior of COPY changes as follows:

1. Lock the target table in ACCESS EXCLUSIVE mode instead of ROW EXCLUSIVE.
2. Prepare a spooler (BTSpool) for each index.
3. For each new row, put index entries into the spools (_bt_spool)
   instead of calling index_insert.
4. At the end of COPY, merge the spools and the existing indexes into new
   index files. The relfilenode of each index is changed, like REINDEX.

However, there might be better interfaces for bulk index creation.
For example, if we want to use it with pgloader, we might need a
"bulkload mode" for indexes. pgloader commits every 10000 rows,
so the index spooler must stay alive until the end of the session,
across transactions (or until the end of the transaction, across
subtransactions).

I'm working toward the simple "COPY WITH LOCK" approach for now,
but if there are better ideas, I want to use them.
Advice and suggestions welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
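The spool-and-merge idea can be sketched as a toy model in Python (nothing here is PostgreSQL code; lists stand in for btree pages, and `heapq.merge` stands in for the sequential merge of the sorted spool with the existing index):

```python
# Toy model of the proposal: instead of inserting each new key into the
# sorted index as it arrives (the 8.3 per-row method), append keys to a
# spool, sort the spool once at the end of loading, and merge it with
# the existing sorted index into a new index in one sequential pass.
import bisect
from heapq import merge

existing_index = list(range(0, 20, 2))       # already sorted, like a btree
incoming = [15, 3, 11, 7, 19, 1]             # rows arriving via COPY

# per-row method: insert each key into place as it arrives
per_row = list(existing_index)
for key in incoming:
    bisect.insort(per_row, key)              # O(n) shift per insert

# proposed method: spool, sort once, merge once (like _bt_spool + rebuild)
spool = sorted(incoming)                     # one sort at end of COPY
merged = list(merge(existing_index, spool))  # single sequential pass

assert merged == per_row                     # same resulting index
```

Both methods yield the same index; the difference is that the merge touches each page once, sequentially, instead of doing a random descent and possible page split per row.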
ITAGAKI Takahiro wrote:

> The basic concept is to spool incoming data and merge the spool with the
> existing indexes into a new index at the end of data loading. It is
> 5-10 times faster than the per-row index insertion used in 8.3.

Please see
http://thread.gmane.org/gmane.comp.db.postgresql.general/102370/focus=102901

--
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> > The basic concept is to spool incoming data and merge the spool with the
> > existing indexes into a new index at the end of data loading. It is
> > 5-10 times faster than the per-row index insertion used in 8.3.
>
> Please see
> http://thread.gmane.org/gmane.comp.db.postgresql.general/102370/focus=102901

Yeah, a BEFORE INSERT FOR EACH ROW trigger is one of the problems.
I think it is enough to disallow bulkloading if there are any BEFORE
INSERT triggers. It is not a serious limitation, because DBAs often
disable triggers during bulkloading for performance.

>> You could work around this if the indexscan code knew to go search in the
>> list of pending insertions, but that's pretty ugly and possibly slow too.

I heard this is used in the Falcon storage engine in MySQL, so it does
not seem to be such an unrealistic approach.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Itagaki-san,

> Alvaro Herrera <alvherre@commandprompt.com> wrote:
> > > The basic concept is to spool incoming data and merge the spool with the
> > > existing indexes into a new index at the end of data loading. It is
> > > 5-10 times faster than the per-row index insertion used in 8.3.

Thanks so much for doing this. For one thing, it will vastly improve
PostgreSQL's ability to run industry-standard benchmarks, as well as
making dump/reload much less painful.

> I heard this is used in the Falcon storage engine in MySQL, so it does
> not seem to be such an unrealistic approach.

I don't think we want to copy any spec from Falcon ... it's a year (or
more) behind schedule.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
On Thu, 2008-02-21 at 13:26 +0900, ITAGAKI Takahiro wrote:

> This is a proposal for fast data loading using batch update of indexes, for 8.4.
> It is part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and
> I'd like to integrate it so that it cooperates with other parts of postgres.
>
> The basic concept is to spool incoming data and merge the spool with the
> existing indexes into a new index at the end of data loading. It is
> 5-10 times faster than the per-row index insertion used in 8.3.
>
> One of the problems is locking: index building in bulkload is closer to
> REINDEX than to INSERT, so we need an ACCESS EXCLUSIVE lock during it.
> Bulkloading is not an upward-compatible method, so I'm thinking about
> adding a new "WITH LOCK" option to the COPY command:
>
>     COPY tbl FROM 'datafile' WITH LOCK;

I'm very excited to see these concepts going into COPY.

One of the reasons I hadn't wanted to pursue earlier ideas to use LOCK
is that taking a lock prevents running in parallel, which may ultimately
prevent further performance gains.

Is there a way of doing this that will allow multiple concurrent COPYs?

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> wrote:

> One of the reasons I hadn't wanted to pursue earlier ideas to use LOCK
> is that taking a lock prevents running in parallel, which may ultimately
> prevent further performance gains.
>
> Is there a way of doing this that will allow multiple concurrent COPYs?

I think it has the same difficulty as parallel queries. It requires
tighter communication among COPY threads, whether we use a multi-process
or a multi-thread model.

We have independent concurrent COPYs now; COPYs are not aware of each
other because there is no intermediate state during COPY. However, COPY
will have "phases" if we use bulk index builds. Therefore, we will need
to join COPY threads and pass their working memory between threads.

Here is a possible multi-threaded workload:

A. For each row:
   1. Parse the incoming data.
   2. Add the row to the heap.
   3. Spool index entries to each index spooler.
B. Wait for all threads.
C. Merge the spools and the corresponding existing indexes into new ones.

Phase A can run concurrently, just as now. A1 and A2 are independent
jobs. We could have a shared spooler or a per-thread spooler.

Phase B is needed to build the indexes all at once; otherwise it would
be double work.

Phase C can run concurrently for each index. One thread is responsible
for building one index. It merges the existing index with one shared
spool, or with multiple spools if we use per-thread spoolers.

One of the issues is how to pass or share spoolers between COPY threads.
Another is how to make it transaction-safe: if one of the threads fails
to build its index, all threads should be rolled back. I'm not sure how
to do this yet...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
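The A/B/C phases above can be sketched in Python threads (an illustrative model only, assuming per-thread spoolers and one merge per index; none of these names are PostgreSQL APIs):

```python
# Phase A: loader threads spool sorted index entries.
# Phase B: a barrier joins the loaders.
# Phase C: the existing sorted index is merged with all spools at once.
import threading
from heapq import merge

NUM_LOADERS = 2
rows = [[("k%03d" % i, i) for i in range(0, 100, 2)],   # loader 0's input
        [("k%03d" % i, i) for i in range(1, 100, 2)]]   # loader 1's input

spools = [[] for _ in range(NUM_LOADERS)]               # per-thread spoolers
barrier = threading.Barrier(NUM_LOADERS)                # Phase B join point

def loader(tid):
    for key, tup in rows[tid]:                          # Phase A, per row
        spools[tid].append((key, tup))                  # spool, don't insert
    spools[tid].sort()                                  # spool sorted once
    barrier.wait()                                      # Phase B

threads = [threading.Thread(target=loader, args=(t,)) for t in range(NUM_LOADERS)]
for t in threads: t.start()
for t in threads: t.join()

# Phase C: one sequential merge of the existing index with every spool.
existing_index = [("a%03d" % i, -i) for i in range(10)]
new_index = list(merge(existing_index, *spools))
```

In a real implementation Phase C would run one such merge per index, in parallel, and the hard parts the mail points out (sharing spoolers across backends, rolling back all threads on failure) are exactly what this sketch glosses over.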
On Tue, 2008-02-26 at 15:19 +0900, ITAGAKI Takahiro wrote:

> Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > One of the reasons I hadn't wanted to pursue earlier ideas to use LOCK
> > is that taking a lock prevents running in parallel, which may ultimately
> > prevent further performance gains.
> >
> > Is there a way of doing this that will allow multiple concurrent COPYs?
>
> I think it has the same difficulty as parallel queries. It requires
> tighter communication among COPY threads, whether we use a multi-process
> or a multi-thread model.
>
> We have independent concurrent COPYs now; COPYs are not aware of each
> other because there is no intermediate state during COPY. However, COPY
> will have "phases" if we use bulk index builds. Therefore, we will need
> to join COPY threads and pass their working memory between threads.

The use case for this seems to be:

* an existing table - since we have indexes
* large indexes - larger than shared_buffers, since we are worried about
  index inserts causing random I/O
* a table that can be LOCKed for periods of time
* load performance is critical

I'm worried that the last two items are often mutually exclusive for
many people, making this look like a very narrow use case.

My experience at Teradata with complicated load mechanisms is that they
take a long time to write, are bug-prone, and have serious restrictions
on how and when we can use them. Even very large data warehouses
frequently use a trickle loader or a "normal SQL" mechanism, because the
business requirement for immediate access to data is just as important
as the need for load speed. Faster loading is a requirement for most
people, however. (Dimitri Fontaine is working on parallel COPY
statements for pgloader.)

So I feel we must try very hard to avoid the LOCK. The LOCK is only
required because we defer the inserts into unique indexes, yes?
Perhaps we might get good performance by making the inserts into the
unique index immediately, but deferring the inserts into the other
indexes? Unique index inserts tend to go into the rightmost blocks,
which as a result are almost always in memory; so all the random I/O is
caused by the non-unique indexes.

I very much like the idea of index merging, or put another way: batch
index inserts. How big does a batch of index inserts have to be for us
to gain benefit from this technique? Would it be possible to just buffer
the index inserts inside the indexam module, so that we perform a batch
of index inserts every N rows? Maybe use work_mem? Or specify a batch
size as a parameter on COPY? Do we really need to pass data between COPY
sessions to gain the maximum benefit? It feels like there is a way that
is faster in many cases but simpler than the fully parallel route
described.

The SQL standard allows us to define a table as GENERATED ALWAYS AS
IDENTITY, in which case we would be able to defer the unique index
inserts as well, since we know the values are already unique. Unique
index values arriving from outside the database might be checked against
each other as a batch first, before being added to the index. For
example, we could defer the unique index checks until we have filled a
whole block, then check that all values on the block are unique with
respect to each other, and then insert into the unique index.

Some other aspects to this, slightly OT:

* variable load speed - it is often a requirement to have the data load
  slow down at busy times and speed up again when less busy; maybe
  automatically, but definitely manually. If we avoid locks and large
  batches, we should eventually be able to do this.

* driving UPDATEs and DELETEs - we often want to perform more than just
  INSERTs. If we over-specialise the code for INSERTing data, we may
  miss opportunities to improve the overall data maintenance workload.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
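The buffered batch-insert idea, including the batch-local uniqueness check described above, can be sketched as follows (an illustrative model, not PostgreSQL code; the class and parameter names are hypothetical):

```python
# Sketch of batching inserts inside the index AM: entries accumulate in
# a buffer; every batch_size rows the batch is sorted, checked against
# itself for duplicates (adjacent equal keys after sorting), checked
# against the existing index, and merged into the index in one pass.
from bisect import bisect_left
from heapq import merge

class BufferedUniqueIndex:
    def __init__(self, batch_size=4):
        self.entries = []            # the "index": kept sorted
        self.buffer = []             # pending inserts
        self.batch_size = batch_size # cf. every-N-rows / work_mem limit

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        batch = sorted(self.buffer)
        # batch-local uniqueness: duplicates are adjacent after sorting
        for a, b in zip(batch, batch[1:]):
            if a == b:
                raise ValueError("duplicate key in batch: %r" % (a,))
        # uniqueness against the existing index, then one merge pass
        for k in batch:
            i = bisect_left(self.entries, k)
            if i < len(self.entries) and self.entries[i] == k:
                raise ValueError("duplicate key in index: %r" % (k,))
        self.entries = list(merge(self.entries, batch))
        self.buffer = []

idx = BufferedUniqueIndex(batch_size=4)
for k in [5, 3, 9, 1, 7, 2, 8, 6]:
    idx.insert(k)
idx.flush()
```

The point of the sketch is that the uniqueness check costs one sort plus one adjacent-pair scan per batch, instead of one index descent per row.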
Simon Riggs <simon@2ndquadrant.com> wrote:

> The LOCK is only required because we defer the inserts into unique
> indexes, yes?

No, not as far as the present pg_bulkload is concerned. It creates a new
relfilenode, like REINDEX; therefore an access exclusive lock is needed.
When there are violations of unique constraints, all of the loading is
rolled back at the end of loading.

BTW, why does REINDEX require an access exclusive lock? Read-only
queries are forbidden during the operation now, but I feel they would be
OK, because REINDEX only reads existing tuples. Can we do REINDEX
holding only a shared lock on the index?

> I very much like the idea of index merging, or put another way: batch
> index inserts. How big does a batch of index inserts have to be for us
> to gain benefit from this technique?

Hmm, we might need to know *why* COPY with indexes is slow. If the major
cost is searching for the position to insert, batch inserts will work
well. However, if the cost is index page splits and the random I/O that
follows, batch insertion cannot solve the problem; a "rebuild" is still
required.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

> BTW, why does REINDEX require an access exclusive lock? Read-only
> queries are forbidden during the operation now, but I feel they would be
> OK, because REINDEX only reads existing tuples. Can we do REINDEX
> holding only a shared lock on the index?

No. When you commit the reindex, the old copy of the index will
instantaneously disappear; it will not do for someone to be actively
scanning that copy.

			regards, tom lane
2008/2/29, Tom Lane <tgl@sss.pgh.pa.us>:

> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > BTW, why does REINDEX require an access exclusive lock? Read-only
> > queries are forbidden during the operation now, but I feel they would be
> > OK, because REINDEX only reads existing tuples. Can we do REINDEX
> > holding only a shared lock on the index?
>
> No. When you commit the reindex, the old copy of the index will
> instantaneously disappear; it will not do for someone to be actively
> scanning that copy.

Can a shared lock be taken at first, and then, when the new index is
ready, upgraded to an exclusive one in order to delete the old index?

Markus

--
Markus Bertheau
Blog: http://www.bluetwanger.de/blog/
"Markus Bertheau" <mbertheau.pg@googlemail.com> writes:

> 2008/2/29, Tom Lane <tgl@sss.pgh.pa.us>:
> > No. When you commit the reindex, the old copy of the index will
> > instantaneously disappear; it will not do for someone to be actively
> > scanning that copy.
>
> Can a shared lock be taken at first, and then, when the new index is
> ready, upgraded to an exclusive one in order to delete the old index?

You could try, but lock upgrades are generally a recipe for increasing
your risk of deadlock failure.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > Can we do REINDEX holding only a shared lock on the index?
>
> No. When you commit the reindex, the old copy of the index will
> instantaneously disappear; it will not do for someone to be actively
> scanning that copy.

Hmm... Is it OK if the index does *not* instantaneously disappear?
We could keep the old copy for a while and remove it after all
transactions that could see it have finished. These would be "pending
truncate entries", something like the pending unlink entries in 8.3.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
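The "pending truncate entries" idea amounts to deferring the unlink of the old index file until its last reader is gone. A minimal sketch, with purely hypothetical names (real scan tracking would go through snapshots/locks, not a counter):

```python
# Toy model of deferred unlink: a replaced index file is queued as
# "pending" and only removed once no scan is still using it.
class PendingUnlinkManager:
    def __init__(self):
        self.scanners = {}        # filenode -> number of active scans
        self.pending = set()      # filenodes replaced but not yet unlinked
        self.unlinked = []        # files actually removed, in order

    def begin_scan(self, filenode):
        self.scanners[filenode] = self.scanners.get(filenode, 0) + 1

    def end_scan(self, filenode):
        self.scanners[filenode] -= 1
        self._reap()

    def replace(self, filenode):
        # REINDEX swapped in a new relfilenode; the old one becomes pending
        self.pending.add(filenode)
        self._reap()

    def _reap(self):
        for fn in list(self.pending):
            if self.scanners.get(fn, 0) == 0:
                self.pending.discard(fn)
                self.unlinked.append(fn)  # now safe to remove the file

mgr = PendingUnlinkManager()
mgr.begin_scan(1234)          # a reader is still scanning the old index
mgr.replace(1234)             # REINDEX commits; unlink must wait
mgr.end_scan(1234)            # last reader finishes; unlink happens
```

This matches the mail's framing: the reindex commits under a shared lock, and only the file removal waits for the old transactions to finish.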
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Can we do REINDEX holding only a shared lock on the index?
> >
> > No. When you commit the reindex, the old copy of the index will
> > instantaneously disappear; it will not do for someone to be actively
> > scanning that copy.
>
> Hmm... Is it OK if the index does *not* instantaneously disappear?

It's not impossible, but I really question whether it'd be worth the
complexity. There was something very closely related just yesterday
about whether DROP INDEX has to take an exclusive lock ...

			regards, tom lane
Added to TODO:

        o Allow COPY FROM to create index entries in bulk

          http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> This is a proposal for fast data loading using batch update of indexes, for 8.4.
> It is part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and
> I'd like to integrate it so that it cooperates with other parts of postgres.
>
> The basic concept is to spool incoming data and merge the spool with the
> existing indexes into a new index at the end of data loading. It is
> 5-10 times faster than the per-row index insertion used in 8.3.
>
> One of the problems is locking: index building in bulkload is closer to
> REINDEX than to INSERT, so we need an ACCESS EXCLUSIVE lock during it.
> Bulkloading is not an upward-compatible method, so I'm thinking about
> adding a new "WITH LOCK" option to the COPY command:
>
>     COPY tbl FROM 'datafile' WITH LOCK;
>
> If the LOCK option is specified, the behavior of COPY changes as follows:
>
> 1. Lock the target table in ACCESS EXCLUSIVE mode instead of ROW EXCLUSIVE.
> 2. Prepare a spooler (BTSpool) for each index.
> 3. For each new row, put index entries into the spools (_bt_spool)
>    instead of calling index_insert.
> 4. At the end of COPY, merge the spools and the existing indexes into new
>    index files. The relfilenode of each index is changed, like REINDEX.
>
> However, there might be better interfaces for bulk index creation.
> For example, if we want to use it with pgloader, we might need a
> "bulkload mode" for indexes. pgloader commits every 10000 rows,
> so the index spooler must stay alive until the end of the session,
> across transactions (or until the end of the transaction, across
> subtransactions).
>
> I'm working toward the simple "COPY WITH LOCK" approach for now,
> but if there are better ideas, I want to use them.
> Advice and suggestions welcome.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

--
Bruce Momjian <bruce@momjian.us>          http://momjian.us
EnterpriseDB                              http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
On Tue, 2008-02-26 at 09:08 +0000, Simon Riggs wrote:

> I very much like the idea of index merging, or put another way: batch
> index inserts. How big does a batch of index inserts have to be for us
> to gain benefit from this technique? Would it be possible to just buffer
> the index inserts inside the indexam module, so that we perform a batch
> of index inserts every N rows? Maybe use work_mem? Or specify a batch
> size as a parameter on COPY?

Itagaki,

I think the index merging idea is still useful even if we do not build a
whole new index. ISTM we can do this without locking the table, too.

I understand it is most efficient when you rebuild the index by merging
the old index with the incoming data, but there do seem to be other
problems associated with doing that. Your idea still has a great future,
IMHO.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com