Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
Date | |
Msg-id | CALj2ACXsmvEDsHtADoQeDE-NVyBh4A-D80NmNanTUc37g=FQ-Q@mail.gmail.com Whole thread Raw |
In response to | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) (Nitin Jadhav <nitinjadhavpostgres@gmail.com>) |
Responses |
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
|
List | pgsql-hackers |
On Wed, Mar 2, 2022 at 4:45 PM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote: > > Also, how about special phases for SyncPostCheckpoint(), > > SyncPreCheckpoint(), InvalidateObsoleteReplicationSlots(), > > PreallocXlogFiles() (it currently pre-allocates only 1 WAL file, but > > it might be increase in future (?)), TruncateSUBTRANS()? > > SyncPreCheckpoint() is just incrementing a counter and > PreallocXlogFiles() currently pre-allocates only 1 WAL file. I feel > there is no need to add any phases for these as of now. We can add in > the future if necessary. Added phases for SyncPostCheckpoint(), > InvalidateObsoleteReplicationSlots() and TruncateSUBTRANS(). Okay. > > 10) I'm not sure if it's discussed, how about adding the number of > > snapshot/mapping files so far the checkpoint has processed in file > > processing while loops of > > CheckPointSnapBuild/CheckPointLogicalRewriteHeap? Sometimes, there can > > be many logical snapshot or mapping files and users may be interested > > in knowing the so-far-processed-file-count. > > I had thought about this while sharing the v1 patch and mentioned my > views upthread. I feel it won't give meaningful progress information > (It can be treated as statistics). Hence not included. Thoughts? Okay. If there are any complaints about it we can always add them later. > > > > As mentioned upthread, there can be multiple backends that request a > > > > checkpoint, so unless we want to store an array of pid we should store a number > > > > of backend that are waiting for a new checkpoint. > > > > > > Yeah, you are right. Let's not go that path and store an array of > > > pids. I don't see a strong use-case with the pid of the process > > > requesting checkpoint. If required, we can add it later once the > > > pg_stat_progress_checkpoint view gets in. > > > > I don't think that's really necessary to give the pid list. > > > > If you requested a new checkpoint, it doesn't matter if it's only your backend > > that triggered it, another backend or a few other dozen, the result will be the > > same and you have the information that the request has been seen. We could > > store just a bool for that but having a number instead also gives a bit more > > information and may allow you to detect some broken logic on your client code > > if it keeps increasing. > > It's a good metric to show in the view but the information is not > readily available. Additional code is required to calculate the number > of requests. Is it worth doing that? I feel this can be added later if > required. Yes, we can always add it later if required. > Please find the v4 patch attached and share your thoughts. I reviewed v4 patch, here are my comments: 1) Can we convert below into pgstat_progress_update_multi_param, just to avoid function calls? pgstat_progress_update_param(PROGRESS_CHECKPOINT_LSN, checkPoint.redo); pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE, 2) Why are we not having special phase for CheckPointReplicationOrigin as it does good bunch of work (writing to disk, XLogFlush, durable_rename) especially when max_replication_slots is large? 3) I don't think "requested" is necessary here as it doesn't add any value or it's not a checkpoint kind or such, you can remove it. 4) s:/'recycling old XLOG files'/'recycling old WAL files' + WHEN 16 THEN 'recycling old XLOG files' 5) Can we place CREATE VIEW pg_stat_progress_checkpoint AS definition next to pg_stat_progress_copy in system_view.sql? It looks like all the progress reporting views are next to each other. 6) How about shutdown and end-of-recovery checkpoint? Are you planning to have an ereport_startup_progress mechanism as 0002? 7) I think you don't need to call checkpoint_progress_start and pgstat_progress_update_param, any other progress reporting function for shutdown and end-of-recovery checkpoint right? 8) Not for all kinds of checkpoints right? pg_stat_progress_checkpoint can't show progress report for shutdown and end-of-recovery checkpoint, I think you need to specify that here in wal.sgml and checkpoint.sgml. + command <command>CHECKPOINT</command>. The checkpointer process running the + checkpoint will report its progress in the + <structname>pg_stat_progress_checkpoint</structname> view. See + <xref linkend="checkpoint-progress-reporting"/> for details. 9) Can you add a test case for pg_stat_progress_checkpoint view? I think it's good to add one. See, below for reference: -- Add a trigger to catch and print the contents of the catalog view -- pg_stat_progress_copy during data insertion. This allows to test -- the validation of some progress reports for COPY FROM where the trigger -- would fire. create function notice_after_tab_progress_reporting() returns trigger AS $$ declare report record; 10) Typo: it's not "is happens" + The checkpoint is happens without delays. 11) Can you be specific what are those "some operations" that forced a checkpoint? May be like, basebackup, createdb or something? + The checkpoint is started because some operation forced a checkpoint. 12) Can you be a bit elobartive here who waits? Something like the backend that requested checkpoint will wait until it's completion .... + Wait for completion before returning. 13) "removing unneeded or flushing needed logical rewrite mapping files" + The checkpointer process is currently removing/flushing the logical 14) "old WAL files" + The checkpointer process is currently recycling old XLOG files. Regards, Bharath Rupireddy.
pgsql-hackers by date: