Re: Race condition in SyncRepGetSyncStandbysPriority - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Race condition in SyncRepGetSyncStandbysPriority |
Date | |
Msg-id | CA+fd4k4oJZYmi8+U=_8_CH3Lwk2ZQoeV2rmGkYtvoAnHXsQsTQ@mail.gmail.com Whole thread Raw |
In response to | Re: Race condition in SyncRepGetSyncStandbysPriority (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Race condition in SyncRepGetSyncStandbysPriority
|
List | pgsql-hackers |
On Sat, 18 Apr 2020 at 00:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > > At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > >> I agree that it might be worth considering the removal of am_sync for > >> the master branch or v14. But I think that it should not be > >> back-patched. > > > Ah! Agreed. > > Yeah, that's not necessary to fix the bug. I'd be inclined to leave > it for v14 at this point. > > I don't much like the patch Fujii-san posted, though. An important part > of the problem, IMO, is that SyncRepGetSyncStandbysPriority is too > complicated and it's unclear what dependencies it has on the set of > priorities in shared memory being consistent. His patch does not improve > that situation; if anything it makes it worse. > > If we're concerned about not breaking ABI in the back branches, what > I propose we do about that is just leave SyncRepGetSyncStandbys in > place but not used by the core code, and remove it only in HEAD. > We can do an absolutely minimal fix for the assertion failure, in > case anybody is calling that code, by just dropping the Assert and > letting SyncRepGetSyncStandbys return NIL if it falls out. (Or we > could let it return the incomplete list, which'd be the behavior > you get today in a non-assert build.) +1 > > Also, I realized while re-reading my patch that Kyotaro-san is onto > something about the is_sync_standby flag not being necessary: instead > we can just have the new function SyncRepGetCandidateStandbys return > a reduced count. I'd initially believed that it was necessary for > that function to return the rejected candidate walsenders along with > the accepted ones, but that was a misunderstanding. I still don't > want its API spec to say anything about ordering of the result array, > but we don't need to. > > So that leads me to the attached. I propose applying this to the > back branches except for the rearrangement of WALSnd field order. > In HEAD, I'd remove SyncRepGetSyncStandbys and subroutines altogether. > + /* Quick out if not even configured to be synchronous */ + if (SyncRepConfig == NULL) + return false; I felt strange a bit that we do the above check in SyncRepGetSyncRecPtr() because SyncRepReleaseWaiters() which is the only caller says the following before calling it: /* * We're a potential sync standby. Release waiters if there are enough * sync standbys and we are considered as sync. */ LWLockAcquire(SyncRepLock, LW_EXCLUSIVE); Can we either change it to an assertion, move it to before acquiring SyncRepLock in SyncRepReleaseWaiters or just remove it? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: