Re: Add Information during standby recovery conflicts - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Add Information during standby recovery conflicts |
Date | |
Msg-id | 48a1c817-cbde-33c4-2672-13ef37e1d946@amazon.com Whole thread Raw |
In response to | Re: Add Information during standby recovery conflicts (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Add Information during standby recovery conflicts
|
List | pgsql-hackers |
Hi, On 11/25/20 2:20 PM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi, >> >> On 11/17/20 4:44 PM, Fujii Masao wrote: >>> Thanks for updating the patch! Here are review comments. >>> >>> + Controls whether a log message is produced when the startup >>> process >>> + is waiting longer than <varname>deadlock_timeout</varname> >>> + for recovery conflicts. >>> >>> But a log message can be produced also when the backend is waiting >>> for recovery conflict. Right? If yes, this description needs to be >>> corrected. >> Thanks for looking at it! >> >> I don't think so, only the startup process should write those new log >> messages. >> >> What makes you think that would not be the case? >> >>> >>> + for recovery conflicts. This is useful in determining if >>> recovery >>> + conflicts prevents the recovery from applying WAL. >>> >>> "prevents" should be "prevent"? >> Indeed: fixed in the new attached patch. >> >>> >>> + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, >>> &usecs); >>> + msecs = secs * 1000 + usecs / 1000; >>> >>> GetCurrentTimestamp() is basically called before LogRecoveryConflict() >>> is called. So isn't it better to avoid calling GetCurrentTimestamp() >>> newly in >>> LogRecoveryConflict() and to reuse the timestamp that we got? >>> It's helpful to avoid the waste of cycles. >>> >> good catch! fixed in the new attached patch. >> >>> + while (VirtualTransactionIdIsValid(*vxids)) >>> + { >>> + PGPROC *proc = >>> BackendIdGetProc(vxids->backendId); >>> >>> BackendIdGetProc() can return NULL if the backend is not active >>> at that moment. This case should be handled. >>> >> handled in the new attached patch. >>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >>> + reasonDesc = gettext_noop("recovery is still >>> waiting recovery conflict on buffer pin"); >>> >>> It's natural to use "waiting for recovery" rather than "waiting >>> recovery"? >>> >> I would be tempted to say so, the new patch makes use of "waiting for". >>> + /* Also, set deadlock timeout for logging purpose if >>> necessary */ >>> + if (log_recovery_conflict_waits) >>> + { >>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>> + timeouts[cnt].type = TMPARAM_AFTER; >>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>> + cnt++; >>> + } >>> >>> This needs to be executed only when the message has not been logged yet. >>> Right? >>> >> good catch: fixed in the new attached patch. >> > Thank you for updating the patch! Here are review comments on the > latest version patch. > > + while (VirtualTransactionIdIsValid(*vxids)) > + { > + PGPROC *proc = BackendIdGetProc(vxids->backendId); > + > + if (proc) > + { > + if (nprocs == 0) > + appendStringInfo(&buf, "%d", proc->pid); > + else > + appendStringInfo(&buf, ", %d", proc->pid); > + > + nprocs++; > + vxids++; > + } > + } > > We need to increment vxids even if *proc is null. Otherwise, the loop won't end. My bad, that's fixed. > > --- > + TimestampTz cur_ts = GetCurrentTimestamp();; Fixed > > There is an extra semi-colon. > > --- > int max_standby_streaming_delay = 30 * 1000; > +bool log_recovery_conflict_waits = false; > +bool logged_lock_conflict = false; > > > + if (log_recovery_conflict_waits && !logged_lock_conflict) > + { > + timeouts[cnt].id = STANDBY_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > + cnt++; > + } > > Can we pass a bool indicating if a timeout may be needed for recovery > conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock() > instead of using a static variable? Yeah that makes more sense, specially as we already have logged_recovery_conflict at our disposal. New patch version attached. Thanks! Bertrand
Attachment
pgsql-hackers by date: