Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < |
Date | |
Msg-id | CACjxUsO9psWdP_B73Txy4CDZuQPatBm8KdhiSb3Y1Scu4Qa09g@mail.gmail.com Whole thread Raw |
In response to | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Re: [COMMITTERS] pgsql: Avoid extra locks in
GetSnapshotData if old_snapshot_threshold <
|
List | pgsql-hackers |
[Thanks to Robert to stepping up to keep this moving while I was down yesterday with a minor injury. I'm back on it today.] On Wed, Jun 8, 2016 at 3:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jun 8, 2016 at 4:04 PM, Kevin Grittner <kgrittn@gmail.com> wrote: >> -- connection 1 >> drop table if exists t1; >> create table t1 (c1 int not null); >> insert into t1 select generate_series(1, 1000000); >> begin transaction isolation level repeatable read; >> select 1; >> >> -- connection 2 >> insert into t2 values (1); >> delete from t1 where c1 between 200000 and 299999; >> delete from t1 where c1 = 1000000; >> vacuum analyze verbose t1; >> select pg_sleep_for('2min'); >> vacuum analyze verbose t1; -- repeat if needed until dead rows vacuumed >> >> -- connection 1 >> select c1 from t1 where c1 = 100; >> select c1 from t1 where c1 = 250000; >> >> The problem occurs when an index is built while an old snapshot >> exists which can't see the effects of early pruning/vacuuming. The >> fix prevents use of such an index until all snapshots early enough >> to have a problem have been released. > > This example doesn't seem to involve any CREATE INDEX or REINDEX > operations, so I'm a little confused. Sorry; pasto -- the index build is supposed to be on connection 2 right after the dead rows are confirmed vacuumed away: create index t1_c1 on t1 (c1); > Generally, I think I see the hazard you're concerned about: I had > failed to realize, as you mentioned upthread, that new index pages > would have an LSN of 0. So if a tuple is pruned early and then the > index is reindexed, old snapshots won't realize that data is missing. > What I'm less certain about is whether you can actually get by with > reusing ii_BrokenHotChain to handle this case. v2 and later does not do that. v1 did, but that was a more blunt instrument. > For example, note this comment: > > * However, when reindexing an existing index, we should do nothing here. > * Any HOT chains that are broken with respect to the index must predate > * the index's original creation, so there is no need to change the > * index's usability horizon. Moreover, we *must not* try to change the > * index's pg_index entry while reindexing pg_index itself, and this > * optimization nicely prevents that. > > This logic doesn't apply to the old snapshot case; there, you'd need > to distrust the index whether it was an initial build or a REINDEX, > but it doesn't look like that's what the patch does. I'm worried > there could be other places where we rely on ii_BrokenHotChain to > detect only one specific condition that isn't quite the same as what > you're trying to use it for here. Well spotted. I had used a lot of discreet calls to get that reindex logic right, but it was verbose and ugly, so I had just added the new macros in this patch and started to implement them before I knocked off for the day. At handover I was too distracted to remember where I was with it. :-( See if it looks right to you now. Attached is v3. I will commit this patch to resolve this issue tomorrow, barring any objections before then. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: