Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Msg-id: CAA4eK1LGmZsevrqJra0V4O8oBU_eKyzm2VMpSAYQaDgC6n4fkA@mail.gmail.com
In response to: Re: Perform streaming logical transactions by background workers and parallel apply (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Fri, Jan 6, 2023 at 11:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jan 6, 2023 at 9:37 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thursday, January 5, 2023 7:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > Thanks, I have started another thread[1]
> >
> > Attach the parallel apply patch set here again. I didn't change the patch set,
> > attach it here just to let the CFbot keep testing it.
>
> I have completed the review and some basic testing and it mostly looks
> fine to me. Here is my last set of comments/suggestions.
>
> 1.
> /*
>  * Don't start a new parallel worker if user has set skiplsn as it's
>  * possible that they want to skip the streaming transaction. For
>  * streaming transactions, we need to serialize the transaction to a file
>  * so that we can get the last LSN of the transaction to judge whether to
>  * skip before starting to apply the change.
>  */
> if (!XLogRecPtrIsInvalid(MySubscription->skiplsn))
>     return false;
>
> I think this is fine to block parallelism in this case, but it is also
> possible to make it less restrictive: only if the first lsn of the
> transaction is <= skiplsn is it possible that the final_lsn might match
> skiplsn; otherwise that is not possible. If we want, we can allow
> parallelism in that case.
>
> I understand that currently we do not have the first_lsn of the
> transaction in the stream start message, but I think that should be easy
> to add. Although I am not sure if it is worth it, it's good to at least
> make a note of it.
>

Yeah, I also don't think sending an extra eight bytes with the
stream_start message is worth it. But it is fine to mention the same in
the comments.
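For illustration, here is a rough sketch of the less restrictive check
described in point 1. It assumes a hypothetical first_lsn value supplied
with the stream_start message, which the current protocol does not send,
so this is a sketch of the idea rather than the patch's behaviour:

/*
 * Hypothetical relaxation of the quoted skiplsn check (first_lsn is
 * assumed to be carried in the stream_start message; it is not part of
 * the current protocol). The transaction's final_lsn can only equal
 * skiplsn when first_lsn <= skiplsn, so parallel apply is safe in every
 * other case.
 */
if (!XLogRecPtrIsInvalid(MySubscription->skiplsn) &&
    first_lsn <= MySubscription->skiplsn)
    return false;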
> 2.
>
> + * XXX Additionally, we also stop the worker if the leader apply worker
> + * serialize part of the transaction data due to a send timeout. This is
> + * because the message could be partially written to the queue and there
> + * is no way to clean the queue other than resending the message until it
> + * succeeds. Instead of trying to send the data which anyway would have
> + * been serialized and then letting the parallel apply worker deal with
> + * the spurious message, we stop the worker.
> + */
> + if (winfo->serialize_changes ||
> +     list_length(ParallelApplyWorkerPool) >
> +     (max_parallel_apply_workers_per_subscription / 2))
>
> IMHO this reason (XXX Additionally, we also stop the worker if the
> leader apply worker serialize part of the transaction data due to a
> send timeout) for stopping the worker looks a bit hackish to me. It
> may be a rare case, so I am not talking about the performance, but the
> reasoning behind stopping is not good. Ideally we should be able to
> clean up the message queue and reuse the worker.
>

TBH, I don't know of a better way to deal with this with the current
infrastructure. I think we can do this as a separate enhancement in the
future.

> 3.
> + else if (shmq_res == SHM_MQ_WOULD_BLOCK)
> + {
> +     /* Replay the changes from the file, if any. */
> +     if (pa_has_spooled_message_pending())
> +     {
> +         pa_spooled_messages();
> +     }
>
> I think we do not need this pa_has_spooled_message_pending() function.
> This function just calls pa_get_fileset_state(), which acquires the
> mutex and gets the filestate; then, if the filestate is not FS_EMPTY, we
> call pa_spooled_messages(), which again calls pa_get_fileset_state() and
> again acquires the mutex. So when the state is FS_SERIALIZE_IN_PROGRESS
> it will frequently call pa_get_fileset_state() twice in a row, and I
> think we can easily achieve the same behavior with just one call.
>

This is just to keep the code easy to follow. As this would be a rare
case, I thought it better to give preference to code clarity.

--
With Regards,
Amit Kapila.
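For reference, a rough sketch of the single-fetch variant suggested in
point 3. The extra parameter to pa_spooled_messages() and the
PartialFileSetState type name are assumptions made for the sketch, not
the committed code; as noted above, the patch keeps the separate helper
for readability:

else if (shmq_res == SHM_MQ_WOULD_BLOCK)
{
    /* Fetch the fileset state once instead of twice. */
    PartialFileSetState fileset_state = pa_get_fileset_state();

    /* Replay the changes from the file, if any. */
    if (fileset_state != FS_EMPTY)
        pa_spooled_messages(fileset_state); /* assumes an added parameter */
}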