Re: Support for REINDEX CONCURRENTLY - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Support for REINDEX CONCURRENTLY
Msg-id: CA+TgmoaY23ouHSo3TwVHJZuAmKjk-he05Rp4_BMk29Mf6xhFmg@mail.gmail.com
In response to: Re: Support for REINDEX CONCURRENTLY (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: Support for REINDEX CONCURRENTLY
List: pgsql-hackers
On Wed, Aug 28, 2013 at 9:02 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> During swap phase, process was waiting for transactions with older
>> snapshots than the one taken by transaction doing the swap as they
>> might hold the old index information. I think that we can get rid of
>> it thanks to the MVCC snapshots as other backends are now able to see
>> what is the correct index information to fetch.
>
> I don't see MVCC snapshots guaranteeing that. The only thing changed due
> to them is that other backends see a self-consistent picture of the
> catalog (i.e. not either, neither, or both versions of a tuple as
> earlier). It can still be out of date. And we rely on those not being
> out of date.
>
> I need to look into the patch for more details.

I agree with Andres. The only way in which the MVCC catalog snapshot patch helps is that you can now do a transactional update on a system catalog table without fearing that other backends will see the row as nonexistent or duplicated. They will see exactly one version of the row, just as you would naturally expect.

However, a backend's syscaches can still contain old versions of rows, and they can still cache older versions of some tuples alongside newer versions of other tuples. Those caches are only reloaded when shared-invalidation messages are processed, and that happens only when the backend acquires a lock on a new relation.

I have been of the opinion for some time now that the shared-invalidation code is not a particularly good design for much of what we need. Waiting for an old snapshot is often a proxy for waiting long enough that we can be sure every other backend will process the shared-invalidation message before it next uses any of the cached data that message invalidates.
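To make the staleness window concrete, here is a toy model (not PostgreSQL source; all names are invented for illustration) of a per-backend cache that drains a shared invalidation queue only at lock-acquisition time, mirroring the behavior described above:

```python
# Hypothetical model: a "backend" caches catalog rows and processes
# shared invalidations only when it acquires a lock on a relation.

class SharedInvalQueue:
    def __init__(self):
        self.messages = []          # append-only list of invalidated keys

    def send(self, key):
        self.messages.append(key)

class Backend:
    def __init__(self, shared):
        self.shared = shared
        self.cache = {}             # local "syscache": key -> row version
        self.read_upto = 0          # how far we've drained the queue

    def load(self, key, catalog):
        # Fill the cache from the authoritative catalog on a miss.
        if key not in self.cache:
            self.cache[key] = catalog[key]
        return self.cache[key]

    def lock_relation(self):
        # Invalidations are processed only here, so anything read
        # between send() and this call can be stale.
        for key in self.shared.messages[self.read_upto:]:
            self.cache.pop(key, None)
        self.read_upto = len(self.shared.messages)

catalog = {"idx": "v1"}
q = SharedInvalQueue()
b = Backend(q)

b.load("idx", catalog)              # caches v1
catalog["idx"] = "v2"               # another backend updates the row...
q.send("idx")                       # ...and broadcasts an invalidation

stale = b.load("idx", catalog)      # still "v1": message not yet processed
b.lock_relation()                   # draining the queue evicts the entry
fresh = b.load("idx", catalog)      # now "v2"
```

The window between `q.send("idx")` and `b.lock_relation()` is exactly the interval that waiting for old snapshots is meant to paper over.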
However, it would be better to be able to send invalidation messages in a way that causes them to be processed more eagerly by other backends, and that provides more specific feedback on whether or not they have actually been processed. Then we could send the invalidation messages, wait just until everyone confirms that they have been seen (which should hopefully happen quickly), and then proceed. This would probably lead to much shorter waits.

Or maybe we should have individual backends process invalidations more frequently, and try to set things up so that once an invalidation is sent, the sending backend is immediately guaranteed that it will be processed soon enough, and thus need not wait at all.

This is all pie in the sky, though. I don't have a clear idea how to design something that's an improvement over the (rather intricate) system we have today.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
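The "send, then wait only for acknowledgements" idea above could be modeled as follows. This is a hedged sketch, not PostgreSQL's sinval code; the names (`InvalBus`, `send_and_wait`, `ack`) are invented. The sender publishes an invalidation "epoch" and blocks only until every registered backend confirms it has processed up to that epoch, rather than waiting out an old snapshot:

```python
import threading
import time

class InvalBus:
    def __init__(self):
        self.cond = threading.Condition()
        self.epoch = 0            # latest invalidation message sent
        self.acked = {}           # backend id -> last epoch confirmed

    def register(self, backend_id):
        with self.cond:
            self.acked[backend_id] = 0

    def send_and_wait(self, timeout=5.0):
        with self.cond:
            self.epoch += 1
            target = self.epoch
            # True once every backend has confirmed `target`,
            # False if the timeout expires first.
            return self.cond.wait_for(
                lambda: all(e >= target for e in self.acked.values()),
                timeout=timeout)

    def ack(self, backend_id):
        # Called by a backend right after it drains its invalidation queue.
        with self.cond:
            self.acked[backend_id] = self.epoch
            self.cond.notify_all()

bus = InvalBus()
bus.register("backend-1")
bus.register("backend-2")

result = {}
sender = threading.Thread(target=lambda: result.update(ok=bus.send_and_wait()))
sender.start()

time.sleep(0.2)                   # give the send a head start
for b in ("backend-1", "backend-2"):
    bus.ack(b)                    # both backends confirm promptly
sender.join()
```

In this model the sender's wait is bounded by how quickly backends acknowledge, which is the property the proposal is after; a backend that never reaches an acknowledgement point would still force the timeout path, which is one of the open design questions.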