Re: [HACKERS] some review comments on logical rep code - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [HACKERS] some review comments on logical rep code |
Date | |
Msg-id | CAD21AoDQFxeJ==n3o3ce1Wzp3gdxFU+pJYQJnuOH9FhHJ58dKw@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] some review comments on logical rep code (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: [HACKERS] some review comments on logical rep code
|
List | pgsql-hackers |
On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoB3L50jWwuyaGLxJzmrmXmYJAOBHzDPR=FKwjw29mxj0Q@mail.gmail.com> >> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> > Hi, >> > >> > Though I've read only a part of the logical rep code yet, I'd like to >> > share some (relatively minor) review comments that I got so far. >> >> It seems nobody is working on dealing with these review comments, so >> I've attached patches. Since there are lots of review comment I >> numbered each review comment. The prefix of patches I attached is >> corresponding to review comment number. >> Thank you for reviewing. >> 1. >> > >> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >> > value from the argument, instead of DatumGetObjectId(). >> >> Attached 001 patch fixes it. > > Hmm. I looked at the launcher side and found that it assigns bare > integer to bgw_main_arg. It also should use Int32GetDatum. Yeah, I agreed. Will fix it. > >> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid, >> Oid relid) >> { >> int slot; > ... >> for (slot = 0; slot < max_logical_replication_workers; slot++) > ... >> bgw.bgw_main_arg = slot; > > > >> 2. >> > >> > No one resets on_commit_launcher_wakeup flag to false. >> >> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >> up the launcher process. > > It is omitting the abort case. Maybe it should be > AtEOXact_ApplyLauncher(bool commit)? Should we wake up the launcher process even when abort? > >> 3. >> > >> > ApplyLauncherWakeup() should be static function. >> >> Attached 003 patch fixes it (and also fixes #5 review comment). >> >> 4. >> > >> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. >> >> This is also reported by Horiguchi-san on another thread[1], and patch exists. >> >> 5. >> > >> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >> >> I also guess it's necessary. This change is included in #3 patch. > > IsBackendPid() is not free, I suppose that just ignoring failure > with ESRCH is enough. After thought, since LogicalRepCtx->launcher_pid could be 0 should, we check if launcher_pid != 0? > >> 6. >> > >> > Normal statements that the walsender for logical rep runs are logged >> > by log_replication_commands. So if log_statement is also set, >> > those commands are logged twice. >> >> Attached 006 patch fixes it by adding "-c log_statement = none" to >> connection parameter of libpqrcv if logical = true. > > The control by log_replication_commands is performed on > walsender, so this also shuld be done on the same place. Anyway > log_statement is irrelevant to walsender. Patch 006 emits all logs executed by the apply worker as a log of log_replication_command but perhaps you're right. It would be better to leave the log of log_statement if the command executed by the apply worker is a SQL command. Will fix. > >> 7. >> > >> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits >> > an error. The callback function to reset it needs to be registered >> > via on_shmem_exit(). >> >> Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid. >> >> 8. >> > >> > Typo: "an subscriber" should be "a subscriber" in some places. >> >> It seems that there is no longer these typo. >> >> 9. >> > /* Get conninfo */ >> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, >> > tup, >> > Anum_pg_subscription_subconninfo, >> > &isnull); >> > Assert(!isnull); >> > >> > This assertion makes me think that pg_subscription.subconnifo should >> > have NOT NULL constraint. Also subslotname and subpublications >> > should have NOT NULL constraint because of the same reason. >> >> Agreed. Attached 009 patch add NOT NULL constraint to all other >> columns of pg_subscription. pg_subscription.subsynccommit should have >> it as well. > > I'm not sure the policy here, but I suppose that the assertions > there are still required irrelevantly from the nun-nullness of > the attribute. You're right. Will fix it. >> 10. >> > >> > SpinLockAcquire(&MyLogicalRepWorker->relmutex); >> > MyLogicalRepWorker->relstate = >> > GetSubscriptionRelState(MyLogicalRepWorker->subid, >> > MyLogicalRepWorker->relid, >> > &MyLogicalRepWorker->relstate_lsn, >> > false); >> > SpinLockRelease(&MyLogicalRepWorker->relmutex); >> > >> > Non-"short-term" function like GetSubscriptionRelState() should not >> > be called while spinlock is being held. >> > >> >> One option is adding new LWLock for the relation state change but this >> lock is used at many locations where the operation actually doesn't >> take time. I think that the discussion would be needed to get >> consensus, so patch for it is not attached. > > From the point of view of time, I agree that it doesn't seem to > harm. Bt I thing it as more significant problem that > potentially-throwable function is called within the mutex > region. It potentially causes a kind of dead lock. (That said, > theoretically dosn't occur and I'm not sure what happens by the > dead lock..) > > >> [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro@lab.ntt.co.jp > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pgsql-hackers by date: