RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
From | houzj.fnst@fujitsu.com |
---|---|
Subject | RE: Perform streaming logical transactions by background workers and parallel apply |
Date | |
Msg-id | OS0PR01MB5716C9F2C77BD3E403A713A994C29@OS0PR01MB5716.jpnprd01.prod.outlook.com |
In response to | Re: Perform streaming logical transactions by background workers and parallel apply (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Perform streaming logical transactions by background workers and parallel apply
|
List | pgsql-hackers |
On Friday, January 13, 2023 1:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Jan 12, 2023 at 9:34 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thursday, January 12, 2023 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jan 12, 2023 at 4:21 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Thu, Jan 12, 2023 at 10:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Jan 12, 2023 at 9:54 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > doc/src/sgml/monitoring.sgml
> > > > > >
> > > > > > 5. pg_stat_subscription
> > > > > >
> > > > > > @@ -3198,11 +3198,22 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
> > > > > >
> > > > > > <row>
> > > > > > <entry role="catalog_table_entry"><para role="column_definition">
> > > > > > + <structfield>apply_leader_pid</structfield> <type>integer</type>
> > > > > > + </para>
> > > > > > + <para>
> > > > > > + Process ID of the leader apply worker, if this process is a apply
> > > > > > + parallel worker. NULL if this process is a leader apply worker or a
> > > > > > + synchronization worker.
> > > > > > + </para></entry>
> > > > > > + </row>
> > > > > > +
> > > > > > + <row>
> > > > > > + <entry role="catalog_table_entry"><para role="column_definition">
> > > > > > <structfield>relid</structfield> <type>oid</type>
> > > > > > </para>
> > > > > > <para>
> > > > > > OID of the relation that the worker is synchronizing; null for the
> > > > > > - main apply worker
> > > > > > + main apply worker and the parallel apply worker
> > > > > > </para></entry>
> > > > > > </row>
> > > > > >
> > > > > > 5a.
> > > > > >
> > > > > > (Same as general comment #1 about terminology)
> > > > > >
> > > > > > "apply_leader_pid" --> "leader_apply_pid"
> > > > > >
> > > > >
> > > > > How about naming this as just leader_pid? I think it could be
> > > > > helpful in the future if we decide to parallelize initial sync
> > > > > (aka parallel copy) because then we could use this for the
> > > > > leader PID of parallel sync workers as well.
> > > > >
> > > > > --
> > > >
> > > > I still prefer leader_apply_pid.
> > > > leader_pid does not tell which 'operation' it belongs to. 'apply'
> > > > gives the clarity that it is apply related process.
> > > >
> > >
> > > But then do you suggest that tomorrow if we allow parallel sync
> > > workers then we have a separate column leader_sync_pid? I think that
> > > doesn't sound like a good idea and moreover one can refer to docs
> > > for clarification.
> >
> > I agree that leader_pid would be better not only for future parallel
> > copy sync feature, but also it's more consistent with the leader_pid
> > column in pg_stat_activity.
> >
> > And here is the version patch which addressed Peter's comments and
> > renamed all the related stuff to leader_pid.
>
> Here are two comments on v79-0003 patch.

Thanks for the comments.

>
> +    /* Force to serialize messages if stream_serialize_threshold is reached. */
> +    if (stream_serialize_threshold != -1 &&
> +        (stream_serialize_threshold == 0 ||
> +         stream_serialize_threshold < parallel_stream_nchunks))
> +    {
> +        parallel_stream_nchunks = 0;
> +        return false;
> +    }
>
> I think it would be better if we show the log message ""logical
> replication apply worker will serialize the remaining changes of
> remote transaction %u to a file" even in stream_serialize_threshold
> case.

Agreed and changed.

>
> IIUC parallel_stream_nchunks won't be reset if pa_send_data() failed
> due to the timeout.

Changed.

Best Regards,
Hou zj
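
For readers following the two review comments on v79-0003, here is a minimal standalone sketch of the behavior under discussion. It is not the patch's code. The names stream_serialize_threshold and parallel_stream_nchunks come from the quoted hunk; the function sketch_send_chunk(), the SendResult enum, and the point at which the counter is incremented are all assumptions made for illustration. The idea shown is that the counter is reset on both paths that fall back to serializing, the forced-threshold path and the send-timeout path.

```c
#include <stdbool.h>

/* Stand-ins for the variables in the quoted hunk (not PostgreSQL source). */
static int stream_serialize_threshold = -1; /* -1: forced serialization off */
static int parallel_stream_nchunks = 0;     /* chunks sent since last reset */

/* Hypothetical result of trying to hand a chunk to the parallel worker. */
typedef enum
{
    SEND_OK,
    SEND_TIMEOUT
} SendResult;

/*
 * Hypothetical helper: returns true if the chunk went to the parallel
 * apply worker, false if the caller should serialize the remaining
 * changes to a file instead.
 */
static bool
sketch_send_chunk(SendResult simulated_result)
{
    /* Same condition as in the quoted hunk. */
    bool forced = stream_serialize_threshold != -1 &&
        (stream_serialize_threshold == 0 ||
         stream_serialize_threshold < parallel_stream_nchunks);

    /*
     * Reset the counter on both fallback paths, so that a timeout does not
     * leave a stale count behind (the second review comment).
     */
    if (forced || simulated_result == SEND_TIMEOUT)
    {
        parallel_stream_nchunks = 0;
        return false;
    }

    /* Assumption: the counter advances once per successfully sent chunk. */
    parallel_stream_nchunks++;
    return true;
}
```

With the threshold set to 2 in this sketch, three chunks are sent before the fourth call falls back to serializing, because the condition uses a strict less-than comparison against the running count.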