Re: Support for REINDEX CONCURRENTLY - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Support for REINDEX CONCURRENTLY |
Date | |
Msg-id | 20130306085039.GI13803@alap2.anarazel.de Whole thread Raw |
In response to | Re: Support for REINDEX CONCURRENTLY (Michael Paquier <michael.paquier@gmail.com>) |
Responses |
Re: Support for REINDEX CONCURRENTLY
|
List | pgsql-hackers |
On 2013-03-06 13:21:27 +0900, Michael Paquier wrote: > Please find attached updated patch realigned with your comments. You can > find my answers inline... > The only thing that needs clarification is the comment about > UNIQUE_CHECK_YES/UNIQUE_CHECK_NO. Except that all the other things are > corrected or adapted to what you wanted. I am also including now tests for > matviews. > > On Wed, Mar 6, 2013 at 1:49 AM, Andres Freund <andres@2ndquadrant.com>wrote: > > > On 2013-03-05 22:35:16 +0900, Michael Paquier wrote: > > > > > > + for (count = 0; count < num_indexes; count++) > > > > > + index_insert(toastidxs[count], t_values, > > t_isnull, > > > > > + &(toasttup->t_self), > > > > > + toastrel, > > > > > + > > > > toastidxs[count]->rd_index->indisunique ? > > > > > + UNIQUE_CHECK_YES : > > > > UNIQUE_CHECK_NO); > > > > > > > > The indisunique check looks like a copy & pasto to me, albeit not > > > > yours... > > > > > > > Yes it is the same for all the indexes normally, but it looks more solid > > to > > > me to do that as it is. So unchanged. > > > > Hm, if the toast indexes aren't unique anymore loads of stuff would be > > broken. Anyway, not your "fault". > > > I definitely cannot understand where you are going here. Could you be more > explicit? Why could this be a problem? Without my patch a similar check is > used for toast indexes. There's no problem. I just dislike the pointless check which caters for a situation that doesn't exist... Forget it, sorry. > > > > > + if (count == 0) > > > > > + snprintf(NewToastName, > > > > NAMEDATALEN, "pg_toast_%u_index", > > > > > + OIDOldHeap); > > > > > + else > > > > > + snprintf(NewToastName, > > > > NAMEDATALEN, "pg_toast_%u_index_cct%d", > > > > > + OIDOldHeap, > > > > count); > > > > > + RenameRelationInternal(lfirst_oid(lc), > > > > > + > > > > NewToastName); > > > > > + count++; > > > > > + } > > > > > > > > Hm. It seems wrong that this layer needs to know about _cct. > > > > > > > Any other idea? For the time being I removed cct and added only a suffix > > > based on the index number... > > > > Hm. It seems like throwing an error would be sufficient, that path is > > only entered for shared catalogs, right? Having multiple toast indexes > > would be a bug. > > > Don't think so. Even if now those APIs are used only for catalog tables, I > do not believe that this function has been designed to be used only with > shared catalogs. Removing the cct suffix makes sense though... Forget what I said. > > > > > + /* > > > > > + * Index is considered as a constraint if it is PRIMARY KEY or > > > > EXCLUSION. > > > > > + */ > > > > > + isconstraint = indexRelation->rd_index->indisprimary || > > > > > + indexRelation->rd_index->indisexclusion; > > > > > > > > unique constraints aren't mattering here? > > > > > > > No they are not. Unique indexes are not counted as constraints in the > > case > > > of index_create. Previous versions of the patch did that but there are > > > issues with unique indexes using expressions. > > > > Hm. index_create's comment says: > > * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION > > constraint > > > > There are unique indexes that are constraints and some that are > > not. Looking at ->indisunique is not sufficient to determine whether its > > one or not. > > > Hum... OK. I changed that using a method based on get_index_constraint for > a given index. So if the constraint Oid is invalid, it means that this > index has no constraints and its concurrent entry won't create an index in > consequence. It is more stable this way. Sounds good. Just to make that clear: To get a unique index without constraint: CREATE TABLE table_u(id int, data int); CREATE UNIQUE INDEX table_u__data ON table_u(data); To get a constraint: ALTER TABLE table_u ADD CONSTRAINT table_u__id_unique UNIQUE(id); > > > > > + /* > > > > > + * Phase 3 of REINDEX CONCURRENTLY > > > > > + * > > > > > + * During this phase the concurrent indexes catch up with the > > > > INSERT that > > > > > + * might have occurred in the parent table and are marked as > > valid > > > > once done. > > > > > + * > > > > > + * We once again wait until no transaction can have the table > > open > > > > with > > > > > + * the index marked as read-only for updates. Each index > > > > validation is done > > > > > + * with a separate transaction to avoid opening transaction > > for an > > > > > + * unnecessary too long time. > > > > > + */ > > > > > > > > Maybe I am being dumb because I have the feeling I said differently in > > > > the past, but why do we not need a WaitForMultipleVirtualLocks() here? > > > > The comment seems to say we need to do so. > > > > > > > Yes you said the contrary in a previous review. The purpose of this > > > function is to first gather the locks and then wait for everything at > > once > > > to reduce possible conflicts. > > > > you say: > > > > + * We once again wait until no transaction can have the table open > > with > > + * the index marked as read-only for updates. Each index > > validation is done > > + * with a separate transaction to avoid opening transaction for an > > + * unnecessary too long time. > > > > Which doesn't seem to be done? > > > > I read back and afaics I only referred to CacheInvalidateRelcacheByRelid > > not being necessary in this phase. Which I think is correct. > > > Regarding CacheInvalidateRelcacheByRelid at phase 3, I think that it is > needed. If we don't use it the pg_index entries will be updated but not the > cache, what is incorrect. A heap_update will cause cache invalidations to be sent. > Anyway, if I claimed otherwise, I think I was wrong: > > > > The reason - I think - we need to wait here is that otherwise its not > > guaranteed that all other backends see the index with ->isready > > set. Which means they might add tuples which are invisible to the mvcc > > snapshot passed to validate_index() (just created beforehand) which are > > not yet added to the new index because those backends think the index is > > not ready yet. > > Any flaws in that logic? > > > Not that I think. In consequence, and I think we will agree on that: I am > removing WaitForMultipleVirtualLocks and add a WaitForVirtualLock on the > parent relation for EACH index before building and validating it. I have the feeling we are talking past each other. Unless I miss something *there is no* WaitForMultipleVirtualLocks between phase 2 and 3. But one WaitForMultipleVirtualLocks for all would be totally sufficient. 20130305_2_reindex_concurrently_v17.patch: + /* we can do away with our snapshot */ + PopActiveSnapshot(); + + /* + * Commit this transaction to make the indisready update visible for + * concurrent index. + */ + CommitTransactionCommand(); + } + + + /* + * Phase 3 of REINDEX CONCURRENTLY + * + * During this phase the concurrent indexes catch up with the INSERT that + * might have occurred in the parent table and are marked as valid once done. + * + * We once again wait until no transaction can have the table open with + * the index marked as read-only for updates. Each index validation is done + * with a separate transaction to avoid opening transaction for an + * unnecessary too long time. + */ + + /* + * Perform a scan of each concurrent index with the heap, then insert + * any missing index entries. + */ + foreach(lc, concurrentIndexIds) + { + Oid indOid = lfirst_oid(lc); + Oid relOid; Thanks! Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: