Thread: Report bytes and transactions actually sent downtream

Report bytes and transactions actually sent downtream

From: Ashutosh Bapat

Hi All,
In a recent logical replication issue, there were multiple replication
slots involved, each using a different publication. Thus the amount of
data that was replicated through each slot was expected to be
different. However, total_bytes and total_txns were, as expected, the
same for all the replication slots. One of the slots started lagging,
and we were trying to figure out whether it was the WAL sender that was
slowing down or the consumer (in this case Debezium). The lagging slot
then showed lower total_txns and total_bytes than the other slots,
giving the impression that the WAL sender was processing the data
slowly. Had pg_stat_replication_slots reported the amount of data
actually sent downstream, it would have been easier to compare it with
the amount of data received by the consumer and thus pinpoint the
bottleneck.

Here's a patch that does this. It adds two columns to
pg_stat_replication_slots:
    - sent_txns: the total number of transactions sent downstream.
    - sent_bytes: the total number of bytes sent downstream in data messages.
sent_bytes includes only the bytes sent as part of 'd' messages; it
does not include, for example, keepalive or CopyDone messages. Those
are very few and can be ignored, but if others feel they are important
to include, we can make that change.

Plugins may choose not to send an empty transaction downstream, so it's
better to increment the sent_txns counter in the plugin code when it
actually sends a BEGIN message, for example in pgoutput_send_begin()
and pg_output_begin(). This means that every plugin will need to be
modified to increment the counter for it to be reported correctly.

I first thought of incrementing sent_bytes in OutputPluginWrite()
which is a central function for all logical replication message
writes. But that calls LogicalDecodingContext::write() which may
further add bytes to the message e.g. WalSndWriteData() and
LogicalOutputWrite(). So it's better to increment the counter in
implementations of LogicalDecodingContext::write(), so that we count
the exact number of bytes. These implementations are within core code
so they won't miss updating sent_bytes.
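
A rough sketch of why the counting point matters, under simplified
assumptions: the central entry point sees only the payload length, while
the transport-specific write callback knows the final size on the wire.
ToyContext, the toy_* functions, and the 25-byte TOY_HEADER_LEN are
illustrative assumptions, not the actual PostgreSQL code or framing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEADER_LEN 25   /* assumed framing overhead, illustration only */

typedef struct ToyContext ToyContext;
typedef size_t (*toy_write_fn) (ToyContext *ctx, size_t payload_len);

struct ToyContext
{
    toy_write_fn write;         /* transport-specific write callback */
    int64_t      sent_bytes;    /* bytes actually put on the wire */
};

/* Transport A: adds a fixed-size header around the payload. */
size_t
toy_framed_write(ToyContext *ctx, size_t payload_len)
{
    size_t      wire_len = TOY_HEADER_LEN + payload_len;

    /* Count here, where the final message size is known. */
    ctx->sent_bytes += (int64_t) wire_len;
    return wire_len;
}

/* Transport B: passes the payload through unchanged. */
size_t
toy_plain_write(ToyContext *ctx, size_t payload_len)
{
    ctx->sent_bytes += (int64_t) payload_len;
    return payload_len;
}

/* Central entry point: knows only the payload length, so it does not count. */
size_t
toy_output_plugin_write(ToyContext *ctx, size_t payload_len)
{
    assert(ctx != NULL && ctx->write != NULL);
    return ctx->write(ctx, payload_len);
}
```

Had the counter been incremented in toy_output_plugin_write(), the
framed transport would have under-reported every message by the header
length.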

I think we should rename total_txns and total_bytes to reordered_txns
and reordered_bytes respectively, and also update the documentation
accordingly to make better sense of those numbers. But these patches
do not contain that change. If others feel the same way, I will
provide a patch with that change.

-- 
Best Wishes,
Ashutosh Bapat

Re: Report bytes and transactions actually sent downtream

From: Amit Kapila

On Mon, Jun 30, 2025 at 3:24 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi All,
> In a recent logical replication issue, there were multiple replication
> slots involved, each using a different publication. Thus the amount of
> data that was replicated through each slot was expected to be
> different. However, total_bytes and total_txns were, as expected, the
> same for all the replication slots. One of the slots started lagging,
> and we were trying to figure out whether it was the WAL sender that was
> slowing down or the consumer (in this case Debezium). The lagging slot
> then showed lower total_txns and total_bytes than the other slots,
> giving the impression that the WAL sender was processing the data
> slowly. Had pg_stat_replication_slots reported the amount of data
> actually sent downstream, it would have been easier to compare it with
> the amount of data received by the consumer and thus pinpoint the
> bottleneck.
>
> Here's a patch that does this. It adds two columns to
> pg_stat_replication_slots:
>     - sent_txns: the total number of transactions sent downstream.
>     - sent_bytes: the total number of bytes sent downstream in data messages.
> sent_bytes includes only the bytes sent as part of 'd' messages; it
> does not include, for example, keepalive or CopyDone messages. Those
> are very few and can be ignored, but if others feel they are important
> to include, we can make that change.
>
> Plugins may choose not to send an empty transaction downstream, so it's
> better to increment the sent_txns counter in the plugin code when it
> actually sends a BEGIN message, for example in pgoutput_send_begin()
> and pg_output_begin(). This means that every plugin will need to be
> modified to increment the counter for it to be reported correctly.
>

What if some plugin doesn't implement it, or does it incorrectly?
Users will then complain that the PG view is showing incorrect values.
Shouldn't plugin-specific stats be shown separately? For example, one
may be interested in how much data the plugin has filtered out, either
because it was not published or because something like row_filter
caused it to skip sending such data.

--
With Regards,
Amit Kapila.