Thread: "Time of latest checkpoint" stays too old on both master and slave

"Time of latest checkpoint" stays too old on both master and slave

From

rihad

Date:

30 June 2019, 06:49:27

Current time is 10:44. pg_controldata shows the following on both the 
master and the slave server, which use streaming replication:

Time of latest checkpoint:            Sun Jun 30 07:49:18 2019

So it was almost 3 hours ago. There are always some heavy writes, and a 
new WAL file is created in the pg_wal/ directory every few minutes. 
checkpoint_timeout is 20min, so a checkpoint should have been triggered 
long ago.

checkpoint_timeout = 20min #5min                # range 30s-1d
max_wal_size = 8GB
min_wal_size = 80MB
checkpoint_completion_target = 0.9
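
A quick way to sanity-check a reading like this is to print the checkpoint timestamp next to the host's own clock and compare both with a clock known to be good. The sketch below only extracts the timestamp from pg_controldata output. The function name latest_checkpoint_time is made up for illustration, and $PGDATA is assumed to point at the data directory.

```shell
# Hypothetical helper: read pg_controldata output on stdin and print only
# the value of the "Time of latest checkpoint" field.
latest_checkpoint_time() {
    sed -n 's/^Time of latest checkpoint:[[:space:]]*//p'
}

# Example use (assumes pg_controldata is on PATH and PGDATA is set):
#   pg_controldata "$PGDATA" | latest_checkpoint_time
#   date
```

If the checkpoint time looks hours old while WAL is being written steadily, comparing the output of date against another machine is a cheap first step before digging into the checkpoint settings.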


hot_standby is enabled on the slave, hot_standby_feedback is off so as 
not to bloat the master, and max_standby_streaming_delay is 30min.

I've been experiencing this long delay since the upgrade (via 
dump/restore) from PG 9.6 to 11.4.


Thanks for any tips.

Re: "Time of latest checkpoint" stays too old on both master and slave

From

rihad

Date:

30 June 2019, 06:59:06

Damn. Sorry, and please disregard my post. The master server had the 
wrong time. Not wrong TZ, simply wrong time.

$ date
Sun Jun 30 08:34:52 +04 2019

while it's currently 10:58

Re: "Time of latest checkpoint" stays too old on both master and slave

From

rihad

Date:

30 June 2019, 07:07:10

On 06/30/2019 10:59 AM, rihad wrote:
> Damn. Sorry, and please disregard my post. The master server had the 
> wrong time. Not wrong TZ, simply wrong time.
>
> $ date
> Sun Jun 30 08:34:52 +04 2019
>
> while it's currently 10:58

There's a weird problem, even when the time is initially set by openntpd 
it keeps lagging by one second every few seconds:


$ sudo /usr/local/etc/rc.d/openntpd restart
Performing sanity check on openntpd configuration:
configuration OK
Stopping openntpd.
Waiting for PIDS: 85893.
Performing sanity check on openntpd configuration:
configuration OK
Starting openntpd.
$ ssh good-server date; date
Sun Jun 30 11:04:17 +04 2019
Sun Jun 30 11:04:17 +04 2019
$ ssh good-server date; date
Sun Jun 30 11:04:25 +04 2019
Sun Jun 30 11:04:24 +04 2019
$ ssh good-server date; date
Sun Jun 30 11:04:32 +04 2019
Sun Jun 30 11:04:31 +04 2019
$ ssh good-server date; date
Sun Jun 30 11:04:39 +04 2019
Sun Jun 30 11:04:37 +04 2019
$ ssh good-server date; date
Sun Jun 30 11:04:48 +04 2019
Sun Jun 30 11:04:45 +04 2019
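
The drift visible in the transcript above can be put into numbers by comparing epoch seconds instead of formatted dates. This is only a sketch: clock_offset and drift_permille are made-up helper names, good-server is the reference host from the transcript, and `date +%s` is assumed to be available (it is on FreeBSD and GNU systems, though POSIX does not require it).

```shell
# Hypothetical helper: local clock minus reference clock, in whole seconds.
# $1 = reference epoch seconds, $2 = local epoch seconds.
clock_offset() {
    echo $(( $2 - $1 ))
}

# Hypothetical helper: drift in parts per thousand, given how much the
# offset changed ($1, seconds) over an elapsed interval ($2, seconds).
# Shell integer division truncates toward zero.
drift_permille() {
    echo $(( $1 * 1000 / $2 ))
}

# Example use (the ssh round trip adds a small error):
#   clock_offset "$(ssh good-server date +%s)" "$(date +%s)"
```

With the readings above, the local clock fell 3 seconds behind over 31 seconds of reference time, so `drift_permille -3 31` prints -96, meaning the clock is running roughly 10% slow.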


Really weird. But this isn't a PG problem at all, probably just a server 
glitch. Sorry again.

Re: "Time of latest checkpoint" stays too old on both master and slave

From

Andrew Gierth

Date:

30 June 2019, 17:45:03

>>>>> "rihad" == rihad  <rihad@mail.ru> writes:

 rihad> There's a weird problem, even when the time is initially set by
 rihad> openntpd it keeps lagging by one second every few seconds:

 rihad> $ sudo /usr/local/etc/rc.d/openntpd restart

What OS is this?

I've seen this kind of thing with FreeBSD where the kernel timecounter
source has been chosen badly (i.e. choosing TSC when the TSC isn't
actually invariant enough). Forcing TSC not to be used fixes it. The
configuration I've especially noticed it on is when running in a VM with
a single virtual CPU.

-- 
Andrew (irc:RhodiumToad)

Re: "Time of latest checkpoint" stays too old on both master and slave

From

rihad

Date:

30 June 2019, 17:55:09

On 06/30/2019 09:45 PM, Andrew Gierth wrote:
>>>>>> "rihad" == rihad  <rihad@mail.ru> writes:
>   rihad> There's a weird problem, even when the time is initially set by
>   rihad> openntpd it keeps lagging by one second every few seconds:
>
>   rihad> $ sudo /usr/local/etc/rc.d/openntpd restart
>
> What OS is this?
>
> I've seen this kind of thing with FreeBSD where the kernel timecounter
> source has been chosen badly (i.e. choosing TSC when the TSC isn't
> actually invariant enough). Forcing TSC not to be used fixes it. The
> configuration I've especially noticed it on is when running in a VM with
> a single virtual CPU.
>

Exactly. You're right. It's on FreeBSD 11.2. After some googling earlier 
I set kern.timecounter.hardware=HPET, which solved the problem. The 
default value, TSC-low, seems to misbehave on this box, although it 
works on others running the same FreeBSD version.
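
For reference, the change described above can be inspected and applied with sysctl. The following is a sketch rather than a recipe: timecounter_candidates is a made-up helper that assumes kern.timecounter.choice lists its entries in the form name(quality), and whether HPET is the right choice depends on the machine.

```shell
# Hypothetical helper: read the value of kern.timecounter.choice on stdin
# and print "quality name" pairs, highest quality first.
timecounter_candidates() {
    tr ' ' '\n' |
    sed -n 's/^\([^(]*\)(\(-\{0,1\}[0-9][0-9]*\))$/\2 \1/p' |
    sort -rn
}

# Example use on FreeBSD (changing the value needs root):
#   sysctl -n kern.timecounter.choice | timecounter_candidates
#   sysctl kern.timecounter.hardware          # show the current source
#   sysctl kern.timecounter.hardware=HPET     # switch at runtime
# To keep the setting across reboots, add this line to /etc/sysctl.conf:
#   kern.timecounter.hardware=HPET
```

After switching, repeating the date comparison against a known-good host for a minute or two should show whether the drift has stopped.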