Re: Rework the way multixact truncations work - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: Rework the way multixact truncations work |
Date | |
Msg-id | 20151108085407.GA1097830@tornado.leadboat.com Whole thread Raw |
In response to | Re: Rework the way multixact truncations work (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Rework the way multixact truncations work
|
List | pgsql-hackers |
On Thu, Oct 29, 2015 at 08:46:52AM +0100, Andres Freund wrote: > On October 29, 2015 7:59:03 AM GMT+01:00, Noah Misch <noah@leadboat.com> wrote: > >That helps; thanks. Your design seems good. I've located only insipid > >defects. > > Great! > > > I propose to save some time by writing a patch series > >eliminating > >them, which you could hopefully review. Does that sound good? > > Yes, it does. I have pushed a stack of branches to https://github.com/nmisch/postgresql.git: mxt0-revert - reverts commits 4f627f8 and aa29c1c mxt1-disk-independent - see below mxt2-cosmetic - update already-wrong comments and formatting mxt3-main - replaces commit 4f627f8 mxt4-rm-legacy - replaces commit aa29c1c The plan is to squash each branch into one PostgreSQL commit. In addition to examining overall "git diff mxt2-cosmetic mxt3-main", I recommend reviewing itemized changes and commit log entries in "git log -p --reverse --no-merges mxt2-cosmetic..mxt3-main". In particular, when a change involves something you discussed upthread or was otherwise not obvious, I put a statement of rationale in the commit log. > + * Make sure to only attempt truncation if there's values to truncate > + * away. In normal processing values shouldn't go backwards, but there's > + * some corner cases (due to bugs) where that's possible. Which bugs are those? I would like to include more detail if available. If anything here requires careful study, it's the small mxt1-disk-independent change, which I have also attached in patch form. That patch removes the SlruScanDirCbFindEarliest() test from TruncateMultiXact(), which in turn makes multixact control decisions independent of whether TruncateMultiXact() is successful at unlinking segments. Today, one undeletable segment file can cause TruncateMultiXact() to give up on truncation completely for a span of hundreds of millions of MultiXactId. Patched multixact.c will, like CLOG, make its decisions strictly based on the content of files it expects to exist. It will no longer depend on the absence of files it hopes will not exist. To aid in explaining the change's effects, I will define some terms. A "logical wrap" occurs when no range of 2^31 integers covers the set of MultiXactId stored in xmax fields. A "file-level wrap" occurs when there exists a pair of pg_multixact/offsets segment files such that: | segno_a * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE - segno_b * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE | > 2^31 A logical wrap implies either a file-level wrap or missing visibility metadata, but a file-level wrap does not imply other consequences. The SlruScanDirCbFindEarliest() test is usually redundant with find_multixact_start(), because MultiXactIdPrecedes(oldestMXact, earliest) almost implies that find_multixact_start() will fail. The exception arises when pg_multixact/offsets files compose a file-level wrap, which can happen when TruncateMultiXact() fails to unlink segments as planned. When it does happen, the result of SlruScanDirCbFindEarliest(), and therefore the computed "earliest" value, is undefined. (This outcome is connected to our requirement to use only half the pg_clog or pg_multixact/offsets address space at any one time. The PagePrecedes callbacks for these SLRUs cease to be transitive if more than half their address space is in use.) The SlruScanDirCbFindEarliest() test can be helpful when a file-level wrap coexists with incorrect oldestMultiXactId (extant xmax values define "correct oldestMultiXactId"). If we're lucky with readdir() order, the test will block truncation so we don't delete still-needed segments. I am content to lose that, because (a) the code isn't reliable for (or even directed toward) that purpose and (b) sites running on today's latest point releases no longer have incorrect oldestMultiXactId.
Attachment
pgsql-hackers by date: