Thread: slave stops replica
Hello,
I had a master/slave replication setup, and yesterday the slave server's disk filled up on /var, where the archived WAL files copied from the master are stored. After trying to figure out why pg_archivecleanup doesn't work, the SA removed all WAL files on the slave except the current ones (the August 25 WAL files), which freed up /var to 504G available. The SA then patched the OS, which included rebooting the server. After the database started up following the reboot, I found that the database/instance on the slave is not in sync with the master. The row counts from queries against these tables on the slave do not match those on the master.
I tried creating a test table on the master and didn't see the slave pick it up. I then found that /var is full again.
What should I do to make the slave stream from the master again without rebuilding the slave, or must I rebuild it? Why is my pg_archivecleanup not working? All the WAL files from Nov 2020 to August 25 had been kept.
Bach-Nga
On Thu, 2021-08-26 at 21:09 +0000, Pepe TD Vo wrote:
> I had master slave replica and yesterday the slave server's device was full on /var
> where the archive copy from master. After trying to figure out why the
> pg_archivecleanup doesn't work, SA removed all wal files except the current
>
> I tried to create a test table on master and didn't see it pick up from slave.
> I found out again /var is full.

If you have deleted WAL that has not yet been applied to the standby, replication
is broken and you have to build the standby from scratch.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Can we delete the old WALs? Is there a retention setting I can tune on the server, as in Oracle, to keep only 7 days of WAL? And why doesn't pg_archivecleanup work?
Bach-Nga
On Friday, August 27, 2021, 03:29:54 AM EDT, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2021-08-26 at 21:09 +0000, Pepe TD Vo wrote:
> I had master slave replica and yesterday the slave server's device was full on /var
> where the archive copy from master. After trying to figure out why the
> pg_archivecleanup doesn't work, SA removed all wal files except the current
>
> I tried to create a test table on master and didn't see it pick up from slave.
> I found out again /var is full.
If you have deleted WAL that has not yet been applied to the standby, replication
is broken and you have to build the standby from scratch.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Fri, 2021-08-27 at 10:56 +0000, Pepe TD Vo wrote:
> > > I had master slave replica and yesterday the slave server's device was full on /var
> > > where the archive copy from master. After trying to figure out why the
> > > pg_archivecleanup doesn't work, SA removed all wal files except the current
> > >
> > > I tried to create a test table on master and didn't see it pick up from slave.
> > > I found out again /var is full.
> >
> > If you have deleted WAL that has not yet been applied to the standby, replication
> > is broken and you have to build the standby from scratch.
>
> Can we delete the old WALs?

Only the ones you don't need for replication.
Of course, if you rebuild replication from scratch, you can delete the WAL archives.

> May I know what is the retention to perform tuning on the server like Oracle to
> keep only 7 days and why pg_archivecleanup doesn't work?

I guess that pg_archivecleanup doesn't delete anything because it is only called once
replication has processed a WAL segment and doesn't need it any more.

If you are missing a WAL segment because you deleted it, replication gets stuck
at that point and will wait indefinitely for that WAL segment.
So nothing is processed, and nothing is deleted.

A gap in the WAL will stop and break replication, as I said.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
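
To check whether an archive directory has the kind of gap described above, a small script can compare consecutive WAL segment file names. The following is only a minimal sketch and not part of the original thread. It assumes the default 16 MB WAL segment size (256 segments per "log" number) and uncompressed archive files with plain 24-character hexadecimal names; the function names `segment_number` and `find_gaps` are made up for illustration.

```python
import re

# Assumption: default 16 MB wal_segment_size, i.e. 256 segments per log number.
SEGMENTS_PER_LOG = 256

# Plain WAL segment names are 24 uppercase hex digits:
# 8 for the timeline, 8 for the log number, 8 for the segment within the log.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def segment_number(name):
    """Return (timeline, sequential segment number) for a WAL file name."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_gaps(names):
    """Return (previous, next) name pairs with missing segments in between.

    Names that are not plain WAL segments (.history, .backup, .partial,
    compressed files) are ignored. Only gaps within one timeline are reported.
    """
    wal = sorted(n for n in names if WAL_NAME.match(n))
    gaps = []
    for prev, cur in zip(wal, wal[1:]):
        tl_prev, no_prev = segment_number(prev)
        tl_cur, no_cur = segment_number(cur)
        if tl_prev == tl_cur and no_cur - no_prev > 1:
            gaps.append((prev, cur))
    return gaps
```

Calling `find_gaps(os.listdir(archive_dir))` on the standby's archive directory (after `import os`, with `archive_dir` being wherever the master's WAL is copied to) returns an empty list when the segments are contiguous. A non-empty result means some segments between the listed pairs are missing; whether the standby still needs them depends on how far it has already replayed.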