Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring - Mailing list pgsql-hackers
From | Naga Appani |
---|---|
Subject | Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring |
Date | |
Msg-id | CA+QeY+C3tc12wWX56yHozLYK=MmCkn1fODzFWC6-dSY+TZghTQ@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
List | pgsql-hackers |
Hi Michael, Ashutosh, Thanks a lot for the detailed reviews and feedback. Please find attached v4 of the patchset. Summary of changes in v4: - Split into two patches as suggested: 1. Expose and rename ReadMultiXactCounts() -> GetMultiXactInfo() in multixact.h with clearer comments. 2. Add pg_get_multixact_stats() as a SQL-callable function in a new file (multixactfuncs.c), with docs and tests. - Function now also returns oldestOffset for consistency. - Field names updated to num_mxids, num_members, oldest_multixact, oldest_offset. - Documentation revised to describe thresholds only in terms of member counts (disk size wording removed). - Added a minimal example in maintenance.sgml where multixact wraparound is already described. - Isolation tests are rewritten so they no longer depend on exact counts, but only on monotonic properties guaranteed while a multixact is pinned. Replies inline below: On Thu, Aug 7, 2025 at 7:35 PM Michael Paquier <michael@paquier.xyz> wrote: > > I really think that we should move the SQL function parts of multixact.c > into their own new file, exposing ReadMultiXactCounts() in multixact.h... Done. The SQL-callable code now lives in src/backend/utils/adt/multixactfuncs.c and the accessor is declared in src/include/access/multixact.h. > ReadMultiXactCounts() is also incorrectly named with your proposal to > expose oldestMultiXactId in the information returned to the caller... > So perhaps this should be named GetMultiXactInformation() or something > similar? Renamed to GetMultiXactInfo(). > The top of ReadMultiXactCounts() (or whatever its new name) should > also document the information returned across a single call. Added detailed comments about consistency under a single LWLock and the meaning of each field. > It looks inconsistent to return oldestMultiXactId if the > oldestOffsetKnown is false. What about oldestOffset itself? GetMultiXactInfo() now returns oldestOffset as well. If the oldest offset isn’t currently known, the function returns false and clears all outputs, so callers don’t see a partially valid struct. --- On Fri, Aug 8, 2025 at 4:33 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Would it be better to do that math in the function and output the result? That’s a cool idea, thanks for pointing it out. For now I have kept the SQL function focused only on exposing the raw counts (num_mxids, num_members, oldest IDs). My thought was that keeping the API lean makes it easier to maintain across versions, while leaving any derived calculations like approximate storage size to SQL or external tooling. This way the function remains simple and future-proof, while still giving users the building blocks to get the view they need. I’m happy to revisit this if others feel it would be better for the function to provide an approximate size directly — I wanted to start with the simplest surface and gather feedback first. > Now that the name of the function is changed, we need the names to > indicate that they are counts e.g. num_mxids, num_members. Adjusted. The SQL function returns: num_mxids, num_members, oldest_multixact, oldest_offset. > This file doesn't provide usage examples of other functions. This > function doesn't seem to be an exception. Earlier I thought it was fine to add an example since pg_input_error_info() also has one, so in this version I placed the example in maintenance.sgml, where we already discuss multixact wraparound. That seemed like the most natural place for it. I agree with your point about consistency, though, so I kept the style minimal and aligned with the surrounding text. > I think we should mention that the statistics may get stale as soon as > it's fetched, even with REPEATABLE READ isolation level. Added a note that values are a live snapshot and can change immediately. > In case each member starts consuming more or less space than it does > today what would be the basis of triggering wraparound? Space or > number of members? I think we should mention only that. I updated the docs to describe wraparound in terms of member counts only. The earlier mention of disk size has been dropped, since the thresholds are defined by counts. > This is the right place to elaborate the usage of this function with an > example. Expanded with a short example, while keeping it consistent with nearby entries. > ... since startup or the number of existing members? Clarified that the values reflect what’s *currently in use* (i.e., derived from next/oldest) and that NULLs are returned if the multixact subsystem has not been initialized yet. > The last two lines are not required, I think. One of its usage is > monitoring but users may find other usages. Dropped those lines. > Vacuum may clean the multixact between commit2 and check, in which > case the result won't be stable. Right, the earlier version of the test assumed stable counts, which could fail if autovacuum or background cleanup removed entries in between steps. In v4 the isolation test no longer relies on exact numbers. Instead it asserts only the monotonic properties that are guaranteed while a multixact is pinned, and avoids assumptions once locks are released. That makes the test robust against concurrent vacuum activity. --- Thanks again for the thoughtful reviews and guidance. Please let me know if you see further adjustments needed. Best regards, Naga
Attachment
pgsql-hackers by date: