Re: REINDEX CONCURRENTLY unexpectedly fails - Mailing list pgsql-bugs
From | Michael Paquier |
---|---|
Subject | Re: REINDEX CONCURRENTLY unexpectedly fails |
Date | |
Msg-id | 20200120015913.GA1471@paquier.xyz Whole thread Raw |
In response to | Re: REINDEX CONCURRENTLY unexpectedly fails (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: REINDEX CONCURRENTLY unexpectedly fails
|
List | pgsql-bugs |
On Fri, Jan 17, 2020 at 02:53:11PM +0200, Heikki Linnakangas wrote: > On 15/01/2020 03:39, Michael Paquier wrote: >> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c >> index 3e59e647e5..4139232b51 100644 >> --- a/src/backend/catalog/index.c >> +++ b/src/backend/catalog/index.c >> @@ -2016,6 +2016,13 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) >> LOCKTAG heaplocktag; >> LOCKMODE lockmode; >> + /* >> + * A relation not supporting concurrent indexing should never do >> + * a concurrent index drop or try to use a concurrent lock mode. >> + */ >> + Assert(RelationSupportsConcurrentIndexing(indexId) || >> + (!concurrent && !concurrent_lock_mode)); >> + >> /* >> * To drop an index safely, we must grab exclusive lock on its parent >> * table. Exclusive lock on the index alone is insufficient because > > Or maybe decide to do non-current drop within index_drop() itself, instead > of requiring the caller to set 'concurrent' differently for temporary > tables? A portion which stresses me with your approach (and that I actually used in the first versions of my patch upthread), is that we have some extra work related to PERFORM_DELETION_CONCURRENTLY being done in dependency.c, which becomes basically useless if you enforce the flag only in index_drop() without making sure that the root flag is set in RemoveRelations() (see for example the top of deleteOneObject()), and this causes the index drop to actually mix more concurrent and non-concurrent concepts than just the lock upgrade issue. >> +bool >> +RelationSupportsConcurrentIndexing(Oid relid) >> +{ > > Sorry to beat a dead hore, but I still don't like this function: > > * Does it take a table OID or index OID? (Answer: both) Yes, the persistency is inherited. The function mentioned a relation, so that applied to any relkind actually. Perhaps the function should have made that clearer with a assertion using a relkind check or such. But the original sounded pretty clear to me. > * There's a hidden assumption that if RelationSupportsConcurrentIndexing() > returns false, then it's OK to upgrade the lock. That's true today, but if > we added other conditions when RelationSupportsConcurrentIndexing() returned > false, there would be trouble. It seems like a bad abstraction. > > This would be better if the function was renamed to something like "Is it OK > to upgrade a CONCURRENTLY build to non-CONCURRENTLY?", but meh. I understand > that it's nice to have a place for this comment, so that it doesn't need to > be repeated in so many places. But I feel that a little bit of repetition is > better in this case. The reasoning isn't exactly the same for CREATE INDEX, > DROP INDEX, and REINDEX anyway. Okay, I see your points. So I am fine with your line of arguments. >> + /* >> + * Enforce non-concurrent build if the relation does not support this >> + * option. Do this before any use of the concurrent option is done. >> + */ >> + if (!RelationSupportsConcurrentIndexing(relationId)) >> + stmt->concurrent = false; >> + > > Is it OK to scribble on the original 'stmt' here? Doesn't seem kosher, > although it probably works fine in practice. The idea was to avoid any future issues if any code refactored in this area passes down IndexStmt to a subroutine and uses the concurrent flag. I guess that would be hard to miss if using a local variable as you do anyway.. > Do we care whether the *table* supports concurrent indexing, rather than > individual indexes? I guess that's academic, since you can't have temporary > indexes on a permanent table, or vice versa. I cared about that enough, but that's a very defensive position :) But I agree that having a check only for individual indexes would be just but fine. >> + /* >> + * Also check for active uses of the relation in the current >> + * transaction, including open scans and pending AFTER trigger >> + * events. >> + */ >> + CheckTableNotInUse(indexRel, "REINDEX"); >> + >> pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX, > > I don't understand why this is required for this patch. It seems like a good > thing to check, I think otherwise you get an error from the > CheckTableNotInUse() call in index_drop(), in phase 6 where the old indexes > are dropped. But it seems unrelated to the rest of the patch. Maybe commit > it as a separate patch? I added that per a suggestion from Andres while touching this area of the code. But you are right that it would make sense to remove it from this patch, and get that committed separately. I don't mind starting a new thread for this matter instead of having this discussing buried within this thread. Does it make sense? > @@ -943,8 +962,11 @@ DefineIndex(Oid relationId, > /* > * A valid stmt->oldNode implies that we already have a built form of the > * index. The caller should also decline any index build. > + * > + * FIXME: should this check 'concurrent' or 'stmt->concurrent'? I don't > + * quite understand the conditions here. > */ > - Assert(!OidIsValid(stmt->oldNode) || (skip_build && !stmt->concurrent)); > + Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent)); [ .. thinks .. ] It seems to me that this should be using the local variable. SKIP_BUILD is used in some cases for ALTER TABLE, CIC and REINDEX CONCURRENTLY (for the creation of the duplicate index entry). oldNode gets used in ALTER TABLE when attempting to reuse an existing index when changing a column type for example. > - if (concurrent) > + /* > + * Obtain the current persistence of the existing table. We already hold > + * lock on it. > + */ > + heapRel = table_open(heapOid, NoLock); > + persistence = heapRel->rd_rel->relpersistence; > + table_close(heapRel, NoLock); Why not just using get_rel_persistence() here, as done in ReindexMultipleTables()? > + /* This function shouldn't be called for temporary relations. */ > + if (indexRel->rd_rel->relpersistence == RELPERSISTENCE_TEMP) > + elog(ERROR, "cannot reindex a temporary table concurrently"); You are right that an elog() is better than an assertion here as this uses a catalog data. > I came up with the attached version. It seems a bit more clear to me. I'm > not 100% wedded to this, though, so if you want to proceed based on your > version instead, feel free. The docs and the tests are unchanged. Except for the part with index_drop() where I would rather do the decision-making for the concurrent behavior in RemoveRelations() rather than index_drop() (as I did in v4), what you have here looks fine to me. Would you prefer wrapping up this stuff yourself or should I? This needs a backpatch down to 9.4 for the CIC/DIC part. -- Michael
Attachment
pgsql-bugs by date: