On 20/06/2025 18:45, Maxim Orlov wrote:
> Hi!
>
> I've noticed two bugs reported [0] and [1] which are both related to the
> wraparound of mxid and
> mxoff. Problems for mxid and mxoff are minor, as they require hitting
> the exact overflow limit to
> occur. But it's better to correct it.
>
> I included a test to reproduce the problem (see 0001). It is not
> intended to be committed, I
> guess. I then added a commit with a fix.
>
> There two fixes.
> 1) pg_upgrade does not consider the mxid to be in a wraparound state. In
> this case, I adjust
> the mxid value to the FirstMultiXactId. Alternatively, it might be
> conceivable to include the
> ability to set InvalidMultiXactId in pg_resetwal, but this seems
> odd to me.
I think we should allow InvalidMultiXactId in pg_resetwal. It can appear
in the control file of existing clusters, so it seems weird if you
cannot use pg_resetwal to reset to that value.
I also find it weird that we ever store InvalidMultiXactId in the
control file, and have code in all the places that read it to deal with
the possibility. It would seem much more straightforward to skip over
InvalidMultiXactId when we set 'nextMXact'. I think we should change
that in v19, but that's a separate patch.
> 2) pg_resetwall forbids to set mxoff to UINT_MAX32. I'm not sure if this
> was done on
> purpose or not, but perhaps this check can be removed.
With your patch, pg_resetwal will *always* reset nextMultiOffset, even
if you didn't specify --multixact-offset.
I propose the attached fix. It adds separate booleans to pg_resetwal to
track whether -m or -O option was given, and allows 0 for the next
multixid and UINT32_MAX for next multixact offset.
Can oldestMultiXactId ever be 0? I think not, but then again
ReadMultiXactIdRange() seems to be prepared to deal with that.
- Heikki