Home > mailing lists

Re: pg_stat_io for the startup process - Mailing list pgsql-hackers

From	Kyotaro Horiguchi
Subject	Re: pg_stat_io for the startup process
Date	April 28, 2023 02:15:51
Msg-id	20230428.111551.498122818064927621.horikyota.ntt@gmail.com Whole thread Raw
In response to	Re: pg_stat_io for the startup process (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: pg_stat_io for the startup process
List	pgsql-hackers

Tree view

At Thu, 27 Apr 2023 17:30:40 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in 
> On Wed, Apr 26, 2023 at 2:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> > Hi all,
> >
> > Robert Haas <robertmhaas@gmail.com>, 26 Nis 2023 Çar, 15:34 tarihinde şunu yazdı:
> >>
> >> On Wed, Apr 26, 2023 at 5:47 AM Kyotaro Horiguchi
> >> <horikyota.ntt@gmail.com> wrote:
> >> > 3. When should we call pgstat_report_stats on the startup process?
> >> >
> >> > During recovery, I think we can call pgstat_report_stats() (or a
> >> > subset of it) right before invoking WaitLatch and at segment
> >> > boundaries.
> >>
> >> I think this kind of idea is worth exploring. Andres mentioned timers,
> >> but it seems to me that something where we just do it at certain
> >> "convenient" points might be good enough and simpler. I'd much rather
> >> have statistics that were up to date as of the last time we finished a
> >> segment than have nothing at all.
> >
> >
> > I created a rough prototype of a timer-based approach for comparison. Please see attached.
> >
> > Registered a new timeout named "STARTUP_STAT_FLUSH_TIMEOUT", The timeout's
> > handler sets a flag indicating that io stats need to be flushed.
> > HandleStartupProcInterrupts checks the flag to flush stats. It's enabled if
> > any WAL record is read (in the main loop of PerformWalRecovery). And it's
> > disabled before WaitLatch to avoid unnecessary wakeups if the startup process
> > decides to sleep.
> 
> I was reviewing this and found that if I remove the calls to
> disable_startup_stat_flush_timeout() the number of calls to
> pgstat_report_wal() on a briefly active and then idle standby are about
> 8 in 100 seconds whereas with timer disablement, the calls over the same
> period are about 40. I would have thought that disabling the timer would
> have caused us to try and flush the stats less often.
> 
> With the calls to disable_startup_stat_flush_timeout(), we do have much
> fewer calls to setitimer(), of course.
> 
> Given the below suggestion by Andres, I tried doing some traces of the
> various approaches.
> 
> > > +       disable_startup_stat_flush_timeout();
> > > +
> > >         (void) WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
> > >                         WL_LATCH_SET | WL_TIMEOUT |
> > >                         WL_EXIT_ON_PM_DEATH,
> >
> > I think always disabling the timer here isn't quite right - we want to wake up
> > *once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
> > stats before waiting - potentially for a long time - for WAL. One way would be
> > just explicitly report before the WaitLatch().
> >
> > Another approach, I think better, would be to not use enable_timeout_every(),
> > and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
> > called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do so
> > at the end of WaitForWALToBecomeAvailable().  That way we also wouldn't
> > repeatedly fire between calls to HandleStartupProcInterrupts().
> 
> After a quick example implementation of this, I found that it seemed to
> try and flush the stats less often on an idle standby (good) than using
> enable_timeout_every().

Just rearming with the full-interval will work that way. Our existing
strategy for this is seen in PostgresMain().

   stats_timeout = pgstat_report_stat(false);
   if (stats_timeout > 0)
   {
      if (!get_timeout_active(BLAH_TIMEOUT))
        enable_timeout_after(BLAH_TIMEOUT, stats_timeout);
   }
   else
   {
       if (get_timeout_active(BLAH_TIMEOUT))
         disable_timeout(BLAH_TIMEOUT, false);
   }
   WaitLatch();

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

From: "Yu Shi (Fujitsu)"
Date: 28 April 2023, 02:05:39
Subject: RE: Fix a test case in 035_standby_logical_decoding.pl

From: Kyotaro Horiguchi
Date: 28 April 2023, 02:43:48
Subject: Re: pg_stat_io for the startup process

Re: pg_stat_io for the startup process - Mailing list pgsql-hackers

Previous

Next