Re: Logical replication failed with SSL SYSCALL error - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Logical replication failed with SSL SYSCALL error |
Date | |
Msg-id | CALDaNm3Yabfvm1=Wef1u8cHO517uRdzMr3eKD9SJShQvpftsJg@mail.gmail.com |
In response to | Re: Logical replication failed with SSL SYSCALL error (shaurya jain <12345shaurya@gmail.com>) |
Responses |
Re: Logical replication failed with SSL SYSCALL error
|
List | pgsql-hackers |
On Wed, 19 Apr 2023 at 17:26, shaurya jain <12345shaurya@gmail.com> wrote:
>
> Hi Team,
>
> Could you please help me with this, It's urgent for the production environment.
>
> On Wed, Apr 19, 2023 at 3:44 PM shaurya jain <12345shaurya@gmail.com> wrote:
>>
>> Hi Team,
>>
>> Could you please help, It's urgent for the production env?
>>
>> On Sun, Apr 16, 2023 at 2:40 AM shaurya jain <12345shaurya@gmail.com> wrote:
>>>
>>> Hi Team,
>>>
>>> Postgres Version:- 13.8
>>> Issue:- Logical replication failing with SSL SYSCALL error
>>> Priority:- High
>>>
>>> We are migrating our database through logical replications, and all of sudden below error pops up in the source and target logs which leads us to nowhere.
>>>
>>> Logs from Source:-
>>> LOG: could not send data to client: Connection reset by peer
>>> STATEMENT: COPY public.test TO STDOUT
>>> FATAL: connection to client lost
>>> STATEMENT: COPY public.test TO STDOUT
>>>
>>> Logs from Target:-
>>> 2023-04-15 19:07:02 UTC::@:[1250]:ERROR: could not receive data from WAL stream: SSL SYSCALL error: Connection timed out
>>> 2023-04-15 19:07:02 UTC::@:[1250]:CONTEXT: COPY test, line 365326932
>>> 2023-04-15 19:07:03 UTC::@:[505]:LOG: background worker "logical replication worker" (PID 1250) exited with exit code 1
>>> 2023-04-15 19:07:03 UTC::@:[7155]:LOG: logical replication table synchronization worker for subscription "sub_tables_2_180", table "test" has started
>>> 2023-04-15 19:12:05 UTC:10.144.19.34(33276):postgres@webadmit_staging:[7112]:WARNING: there is no transaction in progress
>>> 2023-04-15 19:14:08 UTC:10.144.19.34(33324):postgres@webadmit_staging:[6052]:LOG: could not receive data from client: Connection reset by peer
>>> 2023-04-15 19:17:23 UTC::@:[2112]:ERROR: could not receive data from WAL stream: SSL SYSCALL error: Connection timed out
>>> 2023-04-15 19:17:23 UTC::@:[1089]:ERROR: could not receive data from WAL stream: SSL SYSCALL error: Connection timed out
>>> 2023-04-15 19:17:23 UTC::@:[2556]:ERROR: could not receive data from WAL stream: SSL SYSCALL error: Connection timed out
>>> 2023-04-15 19:17:23 UTC::@:[505]:LOG: background worker "logical replication worker" (PID 2556) exited with exit code 1
>>> 2023-04-15 19:17:23 UTC::@:[505]:LOG: background worker "logical replication worker" (PID 2112) exited with exit code 1
>>> 2023-04-15 19:17:23 UTC::@:[505]:LOG: background worker "logical replication worker" (PID 1089) exited with exit code 1
>>> 2023-04-15 19:17:23 UTC::@:[7287]:LOG: logical replication apply worker for subscription "sub_tables_2_180" has started
>>> 2023-04-15 19:17:23 UTC::@:[7288]:LOG: logical replication apply worker for subscription "sub_tables_3_192" has started
>>> 2023-04-15 19:17:23 UTC::@:[7289]:LOG: logical replication apply worker for subscription "sub_tables_1_180" has started
>>>
>>> Just after this error, all other replication slots get disabled for some time and come back online along with COPY command with the new PID in pg_stat_activity.
>>>
>>> I have a few queries regarding this:-
>>>
>>> The exact reason for disconnection (Few articles claim memory and few network)

This might be caused by a network failure. Did you notice any network
instability? Could you check the TCP settings, in particular
tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count?
With these settings, a keepalive probe is sent once the connection has
been idle for tcp_keepalives_idle seconds. If the peer does not respond
within tcp_keepalives_interval seconds, the probe is sent again, and the
connection is considered gone after tcp_keepalives_count unanswered
probes.

>>> Will it lead to data inconsistency?

It will not lead to inconsistency. In case of failure, the failed
transaction will be rolled back.

>>> Does this new PID COPY command again migrate the whole data of the test table once again?

Yes, in case of a failure the whole table data will be copied again.

Regards,
Vignesh
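
As a rough illustration of the keepalive settings discussed above, here is a minimal Python sketch. It assumes the usual behaviour where the first probe goes out after tcp_keepalives_idle seconds of inactivity and the connection is dropped after tcp_keepalives_count unanswered probes spaced tcp_keepalives_interval seconds apart, so a dead peer is noticed after roughly idle + interval * count seconds. The function names and the values 60/10/6 are hypothetical and only for illustration, not recommendations; a value of 0 (use the operating system default) is not handled. The second helper formats the matching libpq connection parameters (keepalives, keepalives_idle, keepalives_interval, keepalives_count).

```python
def keepalive_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate seconds of silence before a dead connection is dropped.

    Assumption: first probe after `idle` seconds, then `count` unanswered
    probes, each waited on for `interval` seconds.
    """
    if min(idle, interval, count) <= 0:
        # 0 means "use the OS default", which this sketch cannot know.
        raise ValueError("explicit positive values are required")
    return idle + interval * count


def keepalive_conninfo(idle: int, interval: int, count: int) -> str:
    """Format the client-side libpq keepalive parameters as a conninfo fragment."""
    return (
        f"keepalives=1 keepalives_idle={idle} "
        f"keepalives_interval={interval} keepalives_count={count}"
    )


if __name__ == "__main__":
    # Hypothetical example values, not tuning advice.
    idle, interval, count = 60, 10, 6
    print(keepalive_detection_seconds(idle, interval, count))
    print(keepalive_conninfo(idle, interval, count))
```

With the example values, a silently broken connection would be detected after about 120 seconds. The actual timing depends on the operating system's TCP implementation.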