Flush some statistics within running transactions - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Flush some statistics within running transactions
Msg-id aWTVEycKj7Qh/SXH@ip-10-97-1-34.eu-west-3.compute.internal
List pgsql-hackers
Hi hackers,

Long running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal.

This patch series introduces the ability to $SUBJECT (suggested in [1]) to:

- improve monitoring of long running transactions
- avoid missing places where we should flush statistics (like the one fixed in
039549d70f6)

The patch series is made of 3 sub-patches:

0001: Add pgstat_report_anytime_stat() for periodic stats flushing

It introduces pgstat_report_anytime_stat(), which flushes non-transactional
statistics even inside active transactions. A new timeout handler fires every
second to call this function, so that stats become visible in a timely manner
without waiting for transaction completion.

Implementation details:

- Add PgStat_FlushBehavior enum to classify stats kinds:
  * FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
  * FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries

- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats() to accept
a boolean anytime_only parameter:
   * When false: flushes all stats (existing behavior)
   * When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY
     stats

- Register ANYTIME_STATS_UPDATE_TIMEOUT that fires every 1 second, calling
pgstat_report_anytime_stat(false)
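
A minimal sketch of the anytime_only filtering described above. Only the
PgStat_FlushBehavior enum and its two values come from the patch description;
DemoPendingEntry and demo_flush_pending_entries() are hypothetical,
simplified stand-ins and not the actual PostgreSQL structures or signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flush behavior of a stats kind (names taken from the patch description). */
typedef enum PgStat_FlushBehavior
{
    FLUSH_ANYTIME,          /* can always be flushed (WAL, IO, ...) */
    FLUSH_AT_TXN_BOUNDARY   /* must wait for a transaction boundary */
} PgStat_FlushBehavior;

/* Hypothetical, simplified stand-in for one pending stats entry. */
typedef struct DemoPendingEntry
{
    PgStat_FlushBehavior behavior;
    long        pending;    /* backend-local count not yet flushed */
    long        shared;     /* value visible to monitoring views */
} DemoPendingEntry;

/*
 * Flush pending entries. With anytime_only = false every entry is flushed
 * (the existing behavior). With anytime_only = true, FLUSH_AT_TXN_BOUNDARY
 * entries are skipped and stay pending. Returns the number of entries
 * that were flushed.
 */
static int
demo_flush_pending_entries(DemoPendingEntry *entries, size_t n,
                           bool anytime_only)
{
    int         flushed = 0;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        DemoPendingEntry *e = &entries[i];

        if (anytime_only && e->behavior == FLUSH_AT_TXN_BOUNDARY)
            continue;       /* left for the end of the transaction */

        e->shared += e->pending;
        e->pending = 0;
        flushed++;
    }

    return flushed;
}
```

In this sketch, a call with anytime_only = true inside a transaction makes the
FLUSH_ANYTIME counts visible right away, and a later call with
anytime_only = false at the transaction boundary picks up what was skipped.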

Remarks:

- The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate flushing.

- The 1 second flush interval is currently hardcoded, but we could imagine
increasing it or making it configurable. I ran some benchmarks and did not notice
any performance regression, even with a large number of pending entries.
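
To illustrate the two remarks above, here is a rough sketch of a configurable
interval plus a force flag that bypasses it. This is not how the patch works:
the patch relies on a timeout handler and never passes force=true. The
elapsed-time check, the variables, and the meaning given to force are all
assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical setting: flush interval in ms (the patch hardcodes 1 second). */
static int64_t demo_anytime_flush_interval_ms = 1000;

/* Hypothetical bookkeeping: time of the last flush and number of flushes. */
static int64_t demo_last_flush_ms = 0;
static int  demo_flush_count = 0;

/*
 * Flush FLUSH_ANYTIME stats if the interval has elapsed, or unconditionally
 * when force is true (an assumed meaning for the reserved parameter).
 * Returns true if a flush happened.
 */
static bool
demo_report_anytime_stat(int64_t now_ms, bool force)
{
    assert(now_ms >= demo_last_flush_ms);   /* time must not go backwards */

    if (!force &&
        now_ms - demo_last_flush_ms < demo_anytime_flush_interval_ms)
        return false;       /* too early, keep accumulating */

    /* A real implementation would flush the FLUSH_ANYTIME stats here. */
    demo_flush_count++;
    demo_last_flush_ms = now_ms;
    return true;
}
```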

0002: Remove useless calls to flush some stats

Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to flush some stats. Those calls were in place because
before 0001 stats were flushed only at transaction boundaries.

Remarks:

- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for checkpointer and bgworker for example because they don't
have a flush callback to call
- it can't be done for auxiliary processes (walsummarizer for example) because they
currently do not register the new timeout handler
- we may want to improve the current behavior to "fix" the two limitations above

0003: Add FLUSH_MIXED support and implement it for RELATION stats

This patch extends the non-transactional stats infrastructure to support statistics
kinds with mixed transaction behavior: some fields are transactional (e.g., tuple
inserts/updates/deletes) while others are non-transactional (e.g., sequential scans,
blocks read, ...).

It introduces FLUSH_MIXED as a third flush behavior type, alongside FLUSH_ANYTIME
and FLUSH_AT_TXN_BOUNDARY. For FLUSH_MIXED kinds, a new flush_anytime_cb callback
enables partial flushing of only the non-transactional fields during running
transactions.

Some tests are also added.

Implementation details:

- Add FLUSH_MIXED to PgStat_FlushBehavior enum
- Add flush_anytime_cb to PgStat_KindInfo for partial flushing callback
- Update pgstat_flush_pending_entries() to call flush_anytime_cb for
  FLUSH_MIXED entries when in anytime_only mode
- Keep FLUSH_MIXED entries in the pending list after partial flush, as
  transactional fields still need to be flushed at transaction boundary
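
A minimal sketch of how the pending list could be walked with FLUSH_MIXED
entries. The Demo* types, the DEMO_FLUSH_* values (which mirror the three
PgStat_FlushBehavior values), and the array-based pending list are
hypothetical simplifications, not the patch's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three flush behaviors. */
typedef enum DemoFlushBehavior
{
    DEMO_FLUSH_ANYTIME,
    DEMO_FLUSH_AT_TXN_BOUNDARY,
    DEMO_FLUSH_MIXED
} DemoFlushBehavior;

/* Hypothetical entry with one non-transactional and one transactional count. */
typedef struct DemoEntry
{
    DemoFlushBehavior behavior;
    int         nontxn_pending;
    int         txn_pending;
    int         shared;     /* total visible to monitoring views */
} DemoEntry;

/* Partial flush: only the non-transactional part becomes visible. */
static void
demo_flush_anytime_cb(DemoEntry *e)
{
    e->shared += e->nontxn_pending;
    e->nontxn_pending = 0;
}

/* Full flush: everything still pending becomes visible. */
static void
demo_flush_cb(DemoEntry *e)
{
    e->shared += e->nontxn_pending + e->txn_pending;
    e->nontxn_pending = 0;
    e->txn_pending = 0;
}

/*
 * Walk the pending list, compacting it in place. In anytime_only mode,
 * FLUSH_MIXED entries are partially flushed and kept, and
 * FLUSH_AT_TXN_BOUNDARY entries are kept untouched. Returns the new length.
 */
static size_t
demo_flush_pending(DemoEntry **pending, size_t n, bool anytime_only)
{
    size_t      kept = 0;

    assert(pending != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        DemoEntry  *e = pending[i];

        if (!anytime_only || e->behavior == DEMO_FLUSH_ANYTIME)
        {
            demo_flush_cb(e);           /* fully flushed, drop from list */
            continue;
        }

        if (e->behavior == DEMO_FLUSH_MIXED)
            demo_flush_anytime_cb(e);   /* partial flush, stays pending */

        pending[kept++] = e;
    }

    return kept;
}
```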

RELATION stats make use of FLUSH_MIXED:

- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_MIXED
- Implement pgstat_relation_flush_anytime_cb() to flush only read-related
  stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
  blocks_hit
- Clear these fields after flushing to prevent double counting when
  pgstat_relation_flush_cb() runs at transaction commit
- Transactional stats (tuples_inserted, tuples_updated, tuples_deleted,
  live_tuples, dead_tuples) remain pending until transaction boundary
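
The double-counting concern can be sketched as follows. The field names
follow the list above, but DemoRelCounts and both functions are hypothetical
simplifications (live_tuples and dead_tuples are left out for brevity), and
they are not the real pgstat_relation_flush_anytime_cb() or
pgstat_relation_flush_cb().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified per-relation counters. */
typedef struct DemoRelCounts
{
    /* read-related, safe to flush at any time */
    int64_t     numscans;
    int64_t     tuples_returned;
    int64_t     tuples_fetched;
    int64_t     blocks_fetched;
    int64_t     blocks_hit;
    /* transactional, flushed only at a transaction boundary */
    int64_t     tuples_inserted;
    int64_t     tuples_updated;
    int64_t     tuples_deleted;
} DemoRelCounts;

/* Partial flush: add the read-related counts to shared, then clear them. */
static void
demo_relation_flush_anytime(DemoRelCounts *pending, DemoRelCounts *shared)
{
    assert(pending != NULL && shared != NULL);

    shared->numscans += pending->numscans;
    shared->tuples_returned += pending->tuples_returned;
    shared->tuples_fetched += pending->tuples_fetched;
    shared->blocks_fetched += pending->blocks_fetched;
    shared->blocks_hit += pending->blocks_hit;

    /* Clearing is what prevents the full flush from counting these twice. */
    pending->numscans = 0;
    pending->tuples_returned = 0;
    pending->tuples_fetched = 0;
    pending->blocks_fetched = 0;
    pending->blocks_hit = 0;
}

/* Full flush at a transaction boundary: flush whatever is still pending. */
static void
demo_relation_flush(DemoRelCounts *pending, DemoRelCounts *shared)
{
    /* Only read-related counts accumulated since the last partial flush. */
    demo_relation_flush_anytime(pending, shared);

    shared->tuples_inserted += pending->tuples_inserted;
    shared->tuples_updated += pending->tuples_updated;
    shared->tuples_deleted += pending->tuples_deleted;

    memset(pending, 0, sizeof(*pending));
}
```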

Remark:

We could also imagine adding a new flush_anytime_static_cb() callback for
future FLUSH_MIXED fixed amount stats.

[1]: https://postgr.es/m/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4pw7

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
