Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-hackers
| From | Kevin Grittner |
|---|---|
| Subject | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < |
| Date | |
| Msg-id | CACjxUsO9psWdP_B73Txy4CDZuQPatBm8KdhiSb3Y1Scu4Qa09g@mail.gmail.com Whole thread Raw |
| In response to | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < (Robert Haas <robertmhaas@gmail.com>) |
| Responses |
Re: Re: [COMMITTERS] pgsql: Avoid extra locks in
GetSnapshotData if old_snapshot_threshold <
|
| List | pgsql-hackers |
[Thanks to Robert to stepping up to keep this moving while I was
down yesterday with a minor injury. I'm back on it today.]
On Wed, Jun 8, 2016 at 3:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 8, 2016 at 4:04 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
>> -- connection 1
>> drop table if exists t1;
>> create table t1 (c1 int not null);
>> insert into t1 select generate_series(1, 1000000);
>> begin transaction isolation level repeatable read;
>> select 1;
>>
>> -- connection 2
>> insert into t2 values (1);
>> delete from t1 where c1 between 200000 and 299999;
>> delete from t1 where c1 = 1000000;
>> vacuum analyze verbose t1;
>> select pg_sleep_for('2min');
>> vacuum analyze verbose t1; -- repeat if needed until dead rows vacuumed
>>
>> -- connection 1
>> select c1 from t1 where c1 = 100;
>> select c1 from t1 where c1 = 250000;
>>
>> The problem occurs when an index is built while an old snapshot
>> exists which can't see the effects of early pruning/vacuuming. The
>> fix prevents use of such an index until all snapshots early enough
>> to have a problem have been released.
>
> This example doesn't seem to involve any CREATE INDEX or REINDEX
> operations, so I'm a little confused.
Sorry; pasto -- the index build is supposed to be on connection 2
right after the dead rows are confirmed vacuumed away:
create index t1_c1 on t1 (c1);
> Generally, I think I see the hazard you're concerned about: I had
> failed to realize, as you mentioned upthread, that new index pages
> would have an LSN of 0. So if a tuple is pruned early and then the
> index is reindexed, old snapshots won't realize that data is missing.
> What I'm less certain about is whether you can actually get by with
> reusing ii_BrokenHotChain to handle this case.
v2 and later does not do that. v1 did, but that was a more blunt
instrument.
> For example, note this comment:
>
> * However, when reindexing an existing index, we should do nothing here.
> * Any HOT chains that are broken with respect to the index must predate
> * the index's original creation, so there is no need to change the
> * index's usability horizon. Moreover, we *must not* try to change the
> * index's pg_index entry while reindexing pg_index itself, and this
> * optimization nicely prevents that.
>
> This logic doesn't apply to the old snapshot case; there, you'd need
> to distrust the index whether it was an initial build or a REINDEX,
> but it doesn't look like that's what the patch does. I'm worried
> there could be other places where we rely on ii_BrokenHotChain to
> detect only one specific condition that isn't quite the same as what
> you're trying to use it for here.
Well spotted. I had used a lot of discreet calls to get that
reindex logic right, but it was verbose and ugly, so I had just
added the new macros in this patch and started to implement them
before I knocked off for the day. At handover I was too distracted
to remember where I was with it. :-( See if it looks right to you
now.
Attached is v3. I will commit this patch to resolve this issue
tomorrow, barring any objections before then.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: