Re: POC: make mxidoff 64 bits - Mailing list pgsql-hackers
From | Maxim Orlov |
---|---|
Subject | Re: POC: make mxidoff 64 bits |
Date | |
Msg-id | CACG=ezb680eb=JXh1ns=t5eGH3h9y-uTfT4tf3Xc8t2UH2q6tQ@mail.gmail.com Whole thread Raw |
In response to | Re: POC: make mxidoff 64 bits (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
On Fri, 15 Nov 2024 at 14:06, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Hmm, so if I understand correctly, this is related to how we determine
the length of the members array, by looking at the next multixid's
offset. This is explained in GetMultiXactIdMembers:
Correct.
If we accept that, we don't need to worry about case 3 anymore. But if
we upgrade wrapped-around members files by just renaming them, there
could still be a members array where we had skipped offset 0, and
reading that after the upgrade might get confused. We could continue to
ignore a 0 XID in the members array like the comment says; I think that
would be enough. But yeah, maybe it's better to bite the bullet in
pg_upgrade and squeeze those out.
Correct. I couldn't explain this better. I'm more for the squeeze those out. Overwise, we're ending up in adding another hack in multixact, but one of the benefits from switching to 64-bits, it should make XID's logic more straight forward. After all, mxact juggling in pg_upgrade is one time inconvenience.
Does your upgrade test suite include case 3, where the next multixact's
offset is 1?
Not exactly.
simple
Latest checkpoint's NextMultiXactId: 119441
Latest checkpoint's NextMultiOffset: 5927049
offset-wrap
Latest checkpoint's NextMultiXactId: 119441
Latest checkpoint's NextMultiOffset: 5591183
multi-wrap
Latest checkpoint's NextMultiXactId: 82006
Latest checkpoint's NextMultiOffset: 7408811
offset-multi-wrap
Latest checkpoint's NextMultiXactId: 52146
Latest checkpoint's NextMultiOffset: 5591183
You want test case where NextMultiOffset will be 1?
simple
Latest checkpoint's NextMultiXactId: 119441
Latest checkpoint's NextMultiOffset: 5927049
offset-wrap
Latest checkpoint's NextMultiXactId: 119441
Latest checkpoint's NextMultiOffset: 5591183
multi-wrap
Latest checkpoint's NextMultiXactId: 82006
Latest checkpoint's NextMultiOffset: 7408811
offset-multi-wrap
Latest checkpoint's NextMultiXactId: 52146
Latest checkpoint's NextMultiOffset: 5591183
You want test case where NextMultiOffset will be 1?
Can we remove MaybeExtendOffsetSlru() now? There are a bunch of other
comments and checks that talk about binary-upgraded values too that we
can hopefully clean up now.
Yes, technically we can. But this is kinda unrelated to the offsets and will make the patch set significantly complicated, thus more complicated to review and less likely to be committed. Again, I'm not opposing the idea, I'm not sure if it is worth to do it right now.
If we are to parse the member segments in detail in upgrade anyway, I'd
be tempted to make some further changes / optimizations:
- You could leave out all locking XID members in upgrade, because
they're not relevant after upgrade any more (all the XIDs will be
committed or aborted and have released the locks; we require prepared
transactions to be completed before upgrading too). It'd be enough to
include actual UPDATE/DELETE XIDs.
- The way we determine the length of the members array by looking at the
next multixid's offset is a bit complicated. We could have one extra
flag per XID in the members to indicate "this is the last member of this
multixid". That could either to replace the current mechanism of looking
at the next offset, or be just an additional cross-check.
- Do we still like the "group" representation, with 4 bytes of flags
followed by 4 XIDs? I wonder if it'd be better to just store 5 bytes per
XID unaligned.
Not really. But I would leave it for next iteration - switching multi to 64 bit. I already have some drafts for this. In any case, we'll must do adjustments in pg_upgrade again. My goal is to move towards 64 XIDs, but with the small steps, and I plan changes in "group" representation in combination with switching multi to 64 bit. This seems a bit more appropriate in my view.
As for your optimization suggestions, I like them. I don’t against them, but I’m afraid to disrupt the clarity of thought, especially since the algorithm is not the simplest.
--
Best regards,
Maxim Orlov.
pgsql-hackers by date: