Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers
From | Euler Taveira |
---|---|
Subject | Re: Time delayed LR (WAS Re: logical replication restrictions) |
Date | |
Msg-id | ea766b46-c46c-494d-adc7-b6666e9424e9@app.fastmail.com Whole thread Raw |
In response to | RE: Time delayed LR (WAS Re: logical replication restrictions) ("Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com>) |
Responses |
Re: Time delayed LR (WAS Re: logical replication restrictions)
RE: Time delayed LR (WAS Re: logical replication restrictions) Re: Time delayed LR (WAS Re: logical replication restrictions) |
List | pgsql-hackers |
On Sun, Jan 22, 2023, at 9:42 AM, Takamichi Osumi (Fujitsu) wrote:
On Saturday, January 21, 2023 3:36 AM I wrote:> Kindly have a look at the patch v18.I've conducted some refactoring for v18.Now the latest patch should be tidier andthe comments would be clearer and more aligned as a whole.Attached the updated patch v19.
[I haven't been following this thread for a long time...]
Good to know that you keep improving this patch. I have a few suggestions that
were easier to provide a patch on top of your latest patch than to provide an
inline suggestions.
There are a few documentation polishing. Let me comment some of them above.
- The length of time (ms) to delay the application of changes.
+ Total time spent delaying the application of changes, in milliseconds
I don't remember if I suggested this description for catalog but IMO the
suggestion reads better for me.
- For time-delayed logical replication (i.e. when the subscription is
- created with parameter min_apply_delay > 0), the apply worker sends a
- Standby Status Update message to the publisher with a period of
- <literal>wal_receiver_status_interval</literal>. Make sure to set
- <literal>wal_receiver_status_interval</literal> less than the
- <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
- walsender will repeatedly terminate due to the timeout errors. If
- <literal>wal_receiver_status_interval</literal> is set to zero, the apply
- worker doesn't send any feedback messages during the subscriber's
- <literal>min_apply_delay</literal> period. See
- <xref linkend="sql-createsubscription"/> for details.
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> interval.
I removed the parenthesis explanation about time-delayed logical replication.
If you are reading the documentation and does not know what it means you should
(a) read the logical replication chapter or (b) check the glossary (maybe a new
entry should be added). I also removed the Standby status Update message but it
is a low level detail; let's refer to it as feedback message as the other
sentences do. I changed "literal" to "varname" that's the correct tag for
parameters. I replace "period" with "interval" that was the previous
terminology. IMO we should be uniform, use one or the other.
- The subscriber replication can be instructed to lag behind the publisher
- side changes by specifying the <literal>min_apply_delay</literal>
- subscription parameter. See <xref linkend="sql-createsubscription"/> for
- details.
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
This feature refers to a specific subscription, hence, "logical replication
subscription" instead of "subscriber replication".
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
Peter S referred to this missing piece of code too.
-int
+static int
defGetMinApplyDelay(DefElem *def)
{
It seems you forgot static keyword.
- elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
- xid, (long long) MySubscription->minapplydelay, diffms);
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
int64 should use format modifier INT64_FORMAT.
- (long) wal_receiver_status_interval * 1000,
+ wal_receiver_status_interval * 1000L,
Cast is not required. I added a suffix to the constant.
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");
It is better to use a string to represent the yes/no option.
- gettext_noop("Min apply delay (ms)"));
+ gettext_noop("Min apply delay"));
I don't know if it was discussed but we don't add units to headers. When I
think about this parameter representation (internal and external), I decided to
use the previous code because it provides a unit for external representation. I
understand that using the same representation as recovery_min_apply_delay is
good but the current code does not handle the external representation
accordingly. (recovery_min_apply_delay uses the GUC machinery to adds the unit
but for min_apply_delay, it doesn't).
# Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
'logical_decoding_mode = immediate');
$node_publisher->reload;
Fix configuration file name.
Maybe tests should do a better job. I think check_apply_delay_time is fragile
because it does not guarantee that time is not shifted. Time-delayed
replication is a subscriber feature and to check its correctness it should
check the logs.
# Note that we cannot call check_apply_delay_log() here because there is a
# possibility that the delay is skipped. The event happens when the WAL
# replication between publisher and subscriber is delayed due to a mechanical
# problem. The log output will be checked later - substantial delay-time case.
If you might not use the logs for it, it should adjust the min_apply_delay, no?
It does not exercise the min_apply_delay vs parallel streaming mode.
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
Is this code correct? I also didn't like this message. "cannot enable streaming
= parallel mode for subscription with min_apply_delay" is far from a good error
message. How about refer parallelism to "parallel streaming mode".
Attachment
pgsql-hackers by date: