Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Date
Msg-id CAHGQGwF5SX=PA+xANQ0fme4XnZAFEB=XNerheesOX+0EqBU6wg@mail.gmail.com
Whole thread Raw
In response to Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers
On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Thanks for the patch. Here are my comments on v4.

Thanks for the review!


> 1 - 0001
> ```
> +       /*
> +        * Save the last flushed position as the replication start point. On
> +        * reconnect, replication resumes from there to avoid re-sending flushed
> +        * data.
> +        */
> +       startpos = output_fsync_lsn;
> ```
>
> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so
shouldwe only update startpos after fsync()? 

Maybe not, but I might be missing something. Could you clarify what
concrete scenario would be problematic with the current code?


> 2 - 0001
> ```
> +               if (outfd != -1 && strcmp(outfile, "-") != 0)
> +                       OutputFsync(feGetCurrentTimestamp());
> ```
>
> Do we need to check return code of OutputFsync()? I checked this file, the only caller that doesn’t check return code
ofOutputFsync() has a comment for why: 
> ```
>         /* no need to jump to error on failure here, we're finishing anyway */
>         OutputFsync(t);
> ```
>
> I saw 0004 has changed OutputFsync to return nothing. So, it’s ok to not adding a comment here. But I just feel that,
ifwe want to make the commit self-contained, it would be better to add a comment here, but that’s not a strong opinion. 

Yeah, I think that making the patch "self-contained" in that sense isn't
really worth the extra effort.


> 3 - 0001
>
> No a direct issue of this patch. I noticed that, everywhere that calls OutputFsync(), it checks outfd != -1 &&
strcmp(outfile,"-") != 0. However, two places miss the check: 

The 0001 patch updates pg_recvlogical to call OutputFsync() before
restarting replication. That call is guarded by strcmp(outfile, "-") != 0.
Your comment makes me reconsider this: when "--file -" is used,
OutputFsync() would be skipped, so startpos would not be updated before
restarting replication. That can lead to duplicate output on stdout,
which is clearly problematic. For this reason, I removed that check
in the latest 0001 patch.

In other places where we check strcmp(outfile, "-") != 0, such as:

if (outfd != -1 && output_reopen && strcmp(outfile, "-") != 0)
{
now = feGetCurrentTimestamp();
OutputFsync(now);
close(outfd);
outfd = -1;
}

on second thought, the check seems necessary for close(outfd) and
"outfd = -1", but not for calling OutputFsync(). If that understanding is
correct, it might make sense to adjust where the check is applied.
However, I think that should be handled in a separate patch.


> 4 - 0002
> ```
> +       croak "timed out waiting for match: $regexp”;
> ```
>
> Is it more helpful to include filename in the error message?

OK, I've updated the message to include the filename.


> 5 - 0003
> ```
> +my ($stdout, $stderr);
> +my $recv = IPC::Run::start(
> +       [@pg_recvlogical_cmd],
> +       '>' => \$stdout,
> +       '2>' => \$stderr);
> ```
>
> $stdout and $stderr are never used.

Yes, but I'm fine with keeping them as they are.

I've attached the updated version of the patches upthread.

Regards,

--
Fujii Masao



pgsql-hackers by date:

Previous
From: Marcos Magueta
Date:
Subject: Re: WIP - xmlvalidate implementation from TODO list
Next
From: Andreas Karlsson
Date:
Subject: Re: Proposal - Enabling btree_gist by default