Re: Add session statistics to pg_stat_database - Mailing list pgsql-hackers
From | Laurenz Albe |
---|---|
Subject | Re: Add session statistics to pg_stat_database |
Date | |
Msg-id | 2099659e5d679d070f0bb2edb573f296f3c9813f.camel@cybertec.at Whole thread Raw |
In response to | Re: Add session statistics to pg_stat_database (Magnus Hagander <magnus@hagander.net>) |
Responses |
Re: Add session statistics to pg_stat_database
|
List | pgsql-hackers |
On Tue, 2020-11-17 at 17:33 +0100, Magnus Hagander wrote: > I've taken a look as well, and here are a few short notes: Much appreciated! > * It talks about "number of connections" but "number of aborted sessions". We should probably > be consistent about talking either about connections or sessions? In particular, connections > seems wrong in this case, because it only starts counting after authentication is complete > (since otherwise we send no stats)? (This goes for both docs and actual function names) Yes, that is true. I have changed "connections" to "sessions" and renamed the new column "connections" to "session_count". I think that most people will understand a session as started after a successful connection. > * Is there a reason we're counting active and idle in transaction (including aborted), > but not fastpath? In particular, we seem to ignore fastpath -- if we don't want to single > it out specifically, it should probably be included in active? The only reason is that I didn't think of it. Fixed. > * pgstat_send_connstat() but pgstat_recv_connection(). Let's call both connstat or both > connection (I'd vote connstat)? Agreed, done. > * Is this actually a fix that's independent of the new stats? It seems in general to be > changing the behaviour of "force", which is more generic? > - !have_function_stats) > + !have_function_stats && !force) The comment right above that reads: /* Don't expend a clock check if nothing to do */ So it is just a quick exit if there is nothing to do. But with that patch we have something to do if "force" (see below) is true: Report the remaining session duration and if the session was closed normally. Thus the additional check. > * in pgstat_send_connstat() you pass the parameter "force" in as "disconnect". > That behaviour at least requires a comment saying why, I think. My understanding is > it relies on that "force" means this is a "backend is shutting down", but that is not > actually documented anywhere. Maybe the "force" parameter should actually be renamed > to indicate this is really what it means, to avoid a future mistake in the area? > But even with that, how does that turn into disconnect? "pgstat_report_stat(true)" is only called from "pgstat_beshutdown_hook()", so it is currently only called when the backend is about to exit. According the the comments the flag means that "caller wants to force stats out". I guess that the author thought that there may arise other reasons to force sending statistics in the future (commit 641912b4d from 2007). However, since that has not happened, I have renamed the flag to "disconnect" and adapted the documentation. This doesn't change the current behavior, but establishes a new rule. > * Maybe rename pgStatSessionDisconnected to pgStatSessionNormalDisconnected? > To avoid having to go back to the setting point and look it up in a comment. Long, descriptive names are a good thing. I have decided to use "pgStatSessionDisconnectedNormally", since that is even longer and seems to fit the "yes or no" category better. > I wonder if there would also be a way to count "sessions that crashed" as well. > That is,the ones that failed in a way that caused the postmaster to restart the system. > But that's information we'd have to send from the postmaster, but I'm actually unsure > if we're "allowed" to send things to the stats collector from the postmaster. > But I think it could be quite useful information to have. Maybe we can find some way > to piggyback on the fact that we're restarting the stats collector as a result? Sure, a crash count would be useful. I don't know if it is easy for the stats collector to tell the difference between a start after a backend crash and - say - starting from a base backup. Patch v6 attached. I think that that would be material for another patch, and I don't think it should go to "pg_stat_database", because a) it might be hard to tell to which database the crashed backend was attached, b) it might be a background process that doesn't belong to a database and c) if the crash were caused by - say - corruption in a shared catalog, it would be misleading. Yours, Laurenz Albe
Attachment
pgsql-hackers by date: