Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Mihail Nikalayeu
Subject Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Msg-id CADzfLwWkYi3r-CD_Bbkg-Mx0qxMBzZZFQTL2ud7yHH2KDb1hdw@mail.gmail.com
In response to Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Antonin Houska <ah@cybertec.at>)
Responses Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
List pgsql-hackers
Hello, Antonin!

> I haven't read the whole thread yet, but the effort to minimize the impact of
> C/RIC on VACUUM seems to prevail
Yes, the thread is super long, and you probably missed the annotations to
the most important emails in [0].

> Of course, I could have missed some important point, so please explain why
> this concept is broken :-) Or let me know if something needs to be explained
> more in detail. Thanks.

It looks like your idea is not broken, but... it is actually an almost
1:1 match to the idea used in the "full" version of the patch.
Explanations are available in [1] and [2].
In [3] I reduced the patch scope to find a solution compatible with REPACK.

A few comments:

> 1. Create an empty index.
Yes, the patch does exactly the same, introducing a special lightweight AM,
STIR (Short Term Index Replacement), to collect new tuples.
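
To make the "empty index that collects new tuples" idea concrete, here is a
toy model in Python. It is only a sketch under my own assumptions, not code
from the patch, and it says nothing about how STIR is really implemented: the
function name, the dict-based "heap" and the list-based "collector" are all
hypothetical.

```python
def build_index_with_collector(heap, writes_during_scan):
    """Toy model: one heap scan plus a collector for tuples added meanwhile.

    heap               -- dict mapping tid -> key (hypothetical heap)
    writes_during_scan -- list of (tid, key) inserted while the build runs
    Returns the final index as a sorted list of (key, tid) entries.
    """
    collector = []                  # the "empty index", created before the scan
    visible_at_scan = dict(heap)    # what the single heap scan gets to see

    # Concurrent writers add the tuple to the heap and record its TID
    # in the collector, because the main index is not ready for inserts yet.
    for tid, key in writes_during_scan:
        heap[tid] = key
        collector.append(tid)

    # The single scan builds the main index from the tuples it saw.
    index = {(key, tid) for tid, key in visible_at_scan.items()}

    # Finally, tuples recorded by the collector are merged in, so nothing
    # written during the scan is lost.
    for tid in collector:
        index.add((heap[tid], tid))

    return sorted(index)
```

The point of the model is only that the heap is scanned once; everything that
arrives during the scan is picked up from the collector afterwards.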

> 4.1 Acquire (shared) content lock on the buffer.
> 4.3 Collect the root tuples of HOT chains - these and only these need to be
>     inserted into the index.
> 4.4 Unlock the buffer.

Instead of that technique, an essentially equivalent one is used: the patch
keeps a snapshot for the scan, but rotates it every few pages for a fresh
one.
This solves some of the issues with selecting live tuples that you
mentioned, without any additional logic.
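
The rotation itself can be pictured with a small Python sketch. Again, this
is a toy under stated assumptions rather than the patch's code:
`take_snapshot` is a hypothetical callback standing in for acquiring a
snapshot, and `rotate_every` is an illustrative knob for "every few pages".

```python
import itertools


def scan_with_rotating_snapshot(pages, take_snapshot, rotate_every=4):
    """Yield (snapshot, page) pairs, taking a fresh snapshot every few pages.

    No single snapshot is held for the whole scan; each one covers at most
    rotate_every pages before it is replaced.
    """
    snapshot = None
    for i, page in enumerate(pages):
        if i % rotate_every == 0:
            snapshot = take_snapshot()   # replace the old snapshot with a fresh one
        yield snapshot, page


# Usage: snapshots are modelled as increasing integers.
counter = itertools.count(1)
pairs = list(scan_with_rotating_snapshot(range(10), lambda: next(counter)))
snapshots_used = [snap for snap, _ in pairs]
# snapshots_used == [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
```

Since each snapshot lives only for a few pages, the scan never pins one old
snapshot for its entire duration, which is presumably what keeps the impact
on VACUUM small.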

> Concurrent (re)build of unique index appears to be another topic of this
> thread, but I think this approach should handle the problem too.
That is handled by a dedicated commit in the original patchset.

You know, great minds think alike :)
Interestingly, it is not the first time: in [4] Sergey also
proposed the idea of an "empty" index to collect tuples (which makes
a single scan possible).

So, it is very good news that the approach feels valid to multiple people
(also, Mathias initially introduced the idea of "fresh snapshot"~"no
snapshot").

One thing I am not happy about - it is not applicable to the REPACK case.

Best regards,
Mikhail.

[0]: https://commitfest.postgresql.org/patch/4971/
[1]: https://www.postgresql.org/message-id/CADzfLwVOcZ9mg8gOG+KXWurt=MHRcqNv3XSECYoXyM3ENrxyfQ@mail.gmail.com
[2]: https://www.postgresql.org/message-id/CADzfLwW9QczZW-E=McxcjUv0e5VMDctQNETbgao0K-SimVhFPA@mail.gmail.com
[3]: https://www.postgresql.org/message-id/CADzfLwVaV15R2rUNZmKqLKweiN3SnUBg=6_qGE_ERb7cdQUD8g@mail.gmail.com
[4]: https://www.postgresql.org/message-id/flat/CAMAof6_FY0MrNJOuBrqvQqJKiwskFvjRtgpVHf-D7A%3DKvTtYXg%40mail.gmail.com


