Re: shared-memory based stats collector - v68 - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: shared-memory based stats collector - v68 |
Date | |
Msg-id | CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com Whole thread Raw |
In response to | Re: shared-memory based stats collector - v66 (Andres Freund <andres@anarazel.de>) |
Responses |
Re: shared-memory based stats collector - v68
|
List | pgsql-hackers |
On Mon, Apr 4, 2022 at 4:16 PM Andres Freund <andres@anarazel.de> wrote: > Please take a look! A few superficial comments: > [PATCH v68 01/31] pgstat: consistent function header formatting. > [PATCH v68 02/31] pgstat: remove some superflous comments from pgstat.h. +1 > [PATCH v68 03/31] dshash: revise sequential scan support. Logic looks good. That is, lock-0-and-ensure_valid_bucket_pointers()-only-once makes sense. Just some comment trivia: + * dshash_seq_term needs to be called when a scan finished. The caller may + * delete returned elements midst of a scan by using dshash_delete_current() + * if exclusive = true. s/scan finished/scan is finished/ s/midst of/during/ (or /in the middle of/, ...) > [PATCH v68 04/31] dsm: allow use in single user mode. LGTM. + Assert(IsUnderPostmaster || !IsPostmasterEnvironment); (Not this patch's fault, but I wish we had a more explicit way to say "am single user".) > [PATCH v68 05/31] pgstat: stats collector references in comments LGTM. I could think of some alternative suggested names for this subsystem, but don't think it would be helpful at this juncture so I will refrain :-) > [PATCH v68 06/31] pgstat: add pgstat_copy_relation_stats(). > [PATCH v68 07/31] pgstat: move transactional code into pgstat_xact.c. LGTM. > [PATCH v68 08/31] pgstat: introduce PgStat_Kind enum. +#define PGSTAT_KIND_FIRST PGSTAT_KIND_DATABASE +#define PGSTAT_KIND_LAST PGSTAT_KIND_WAL +#define PGSTAT_NUM_KINDS (PGSTAT_KIND_LAST + 1) It's a little confusing that PGSTAT_NUM_KINDS isn't really the number of kinds, because there is no kind 0. For the two users of it... maybe just use pgstat_kind_infos[] = {...}, and global_valid[PGSTAT_KIND_LAST + 1]? > [PATCH v68 10/31] pgstat: scaffolding for transactional stats creation / drop. + /* + * Dropping the statistics for objects that dropped transactionally itself + * needs to be transactional. ... Hard to parse. How about: "Objects are dropped transactionally, so related statistics need to be dropped transactionally too." > [PATCH v68 13/31] pgstat: store statistics in shared memory. + * Single-writer stats use the changecount mechanism to achieve low-overhead + * writes - they're obviously performance critical than reads. Check the + * definition of struct PgBackendStatus for some explanation of the + * changecount mechanism. Missing word "more" after obviously? + /* + * Whenever the for a dropped stats entry could not be freed (because + * backends still have references), this is incremented, causing backends + * to run pgstat_gc_entry_refs(), allowing that memory to be reclaimed. + */ + pg_atomic_uint64 gc_count; Whenever the ...? Would it be better to call this variable gc_request_count? + * Initialize refcount to 1, marking it as valid / not tdroped. The entry s/tdroped/dropped/ + * further if a longer lived references is needed. s/references/reference/ + /* + * There are legitimate cases where the old stats entry might not + * yet have been dropped by the time it's reused. The easiest case + * are replication slot stats. But oid wraparound can lead to + * other cases as well. We just reset the stats to their plain + * state. + */ + shheader = pgstat_reinit_entry(kind, shhashent); This whole comment is repeated in pgstat_reinit_entry and its caller. + /* + * XXX: Might be worth adding some frobbing of the allocation before + * freeing, to make it easier to detect use-after-free style bugs. + */ + dsa_free(pgStatLocal.dsa, pdsa); FWIW dsa_free() clobbers memory in assert builds, just like pfree(). +static Size +pgstat_dsa_init_size(void) +{ + Size sz; + + /* + * The dshash header / initial buckets array needs to fit into "plain" + * shared memory, but it's beneficial to not need dsm segments + * immediately. A size of 256kB seems works well and is not + * disproportional compared to other constant sized shared memory + * allocations. NB: To avoid DSMs further, the user can configure + * min_dynamic_shared_memory. + */ + sz = 256 * 1024; It kinda bothers me that the memory reserved by min_dynamic_shared_memory might eventually fill up with stats, and not be available for temporary use by parallel queries (which can benefit more from fast acquire/release on DSMs, and probably also huge pages, or maybe not...), and that's hard to diagnose. + * (4) turn off the idle-in-transaction, idle-session and + * idle-state-update timeouts if active. We do this before step (5) so s/idle-state-/idle-stats-/ + /* + * Some of the pending stats may have not been flushed due to lock + * contention. If we have such pending stats here, let the caller know + * the retry interval. + */ + if (partial_flush) + { I think it's better for a comment that is outside the block to say "If some of the pending...". Or the comment should be inside the blocks. +static void +pgstat_build_snapshot(void) +{ ... + dshash_seq_init(&hstat, pgStatLocal.shared_hash, false); + while ((p = dshash_seq_next(&hstat)) != NULL) + { ... + entry->data = MemoryContextAlloc(pgStatLocal.snapshot.context, ... + } + dshash_seq_term(&hstat); Doesn't allocation failure leave the shared hash table locked? > PATCH v68 16/31] pgstat: add pg_stat_exists_stat() for easier testing. pg_stat_exists_stat() is a weird name, ... would it be better as pg_stat_object_exists()? > [PATCH v68 28/31] pgstat: update docs. + Determines the behaviour when cumulative statistics are accessed AFAIK our manual is written in en_US, so s/behaviour/behavior/. + memory. When set to <literal>cache</literal>, the first access to + statistics for an object caches those statistics until the end of the + transaction / until <function>pg_stat_clear_snapshot()</function> is s|/|unless| + <literal>none</literal> is most suitable for monitoring solutions. If I'd change "solutions" to "tools" or maybe "systems". + When using the accumulated statistics views and functions to monitor collected data, it is important Did you intend to write "accumulated" instead of "cumulative" here? + You can invoke <function>pg_stat_clear_snapshot</function>() to discard the + current transaction's statistics snapshot / cache (if any). The next use I'd change s|/ cache|or cached values|. I think "/" like this is an informal thing.
pgsql-hackers by date: