Re: Trimming transaction logs after extended WAL archive failures - Mailing list pgsql-general
From | Steven Schlansker |
---|---|
Subject | Re: Trimming transaction logs after extended WAL archive failures |
Date | |
Msg-id | 85ED84E1-663B-49CC-8EC1-5C69929B46B5@likeness.com |
In response to | Re: Trimming transaction logs after extended WAL archive failures (Adrian Klaver <adrian.klaver@aklaver.com>) |
Responses | Re: Trimming transaction logs after extended WAL archive failures |
List | pgsql-general |

On Mar 25, 2014, at 4:45 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> On 03/25/2014 04:17 PM, Steven Schlansker wrote:
>>
>> On Mar 25, 2014, at 4:02 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>>> On 03/25/2014 03:54 PM, Steven Schlansker wrote:
>>>>
>>>> On Mar 25, 2014, at 3:52 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>>
>>>>> On 03/25/2014 01:56 PM, Steven Schlansker wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> I have a Postgres 9.3.3 database machine. Due to some intelligent work on the part of someone who shall remain nameless, the WAL archive command included a ‘> /dev/null 2>&1’ which masked archive failures until the disk entirely filled with 400GB of pg_xlog entries.
>>>>>>
>>>>>> I have fixed the archive command and can see WAL segments being shipped off of the server, however the xlog remains at a stable size and is not shrinking. In fact, it’s still growing at a (much slower) rate.
>>>>>
>>>>> So what is wal_keep_segments set at in postgresql.conf?
>>>>>
>>>>
>>>> 5000. There are currently about 18000 WAL segments in pg_xlog.
>>>
>>> I guess what I should have also asked previously is what exactly are you doing, are you streaming as well as archiving?
>>
>> Yes, we have both enabled. Here’s some hopefully relevant configuration stanzas and information:
>>
>
>>
>> I have verified that WAL segments are being archived to the archive destination, and that the slave is connected and receiving segments.
>
> Some more questions, what happens when things begin to dawn on me:)
>
> You said the disk filled up entirely with log files yet currently the number(size) of logs is growing.

It’s holding stable now. I tried to vacuum up to clean some space which turned out to generate more pg_xlog activity than it saved space, and (I assume) the archiver fell behind and that was the source of the growing log. There haven’t been any new segments since I stopped doing that.
>
> So did you grow the disk, move the logs or find some way to reduce the number?

I used tune2fs to use some of the “reserved” filesystem space temporarily. I was too scared to move log segments away, this is a production database.

>
> What happened to the server when the disk filled up?

Postgresql PANICed due to failed writes.

Mar 25 22:46:41 prd-db1a postgres[18995]: [12-1] db=checkin,user=postgres PANIC: could not write to file "pg_xlog/xlogtemp.18995": No space left on device

> In other words do the log entries at the time show it recovered gracefully?

The database is currently up and running, although I do not have much time until it fails again, there are only a few precious GB free.

> If not what did you do to get it running again?
>

tune2fs and restarted postgres

> The concern being that the server is actually fully recovered.

I believe it is. Our production site is back up and running seemingly normally, the postgres log has no obvious complaining.
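
For readers hitting the same problem: the root cause described in this thread was an archive command whose output was redirected to /dev/null, so failures went unnoticed. One way to avoid that is a small wrapper that runs the real copy command, writes any failure to its own stderr, and passes the exit status through unchanged. The following is only a minimal sketch under assumptions; the function name `run_archive` and the idea of wrapping the command in Python are illustrative and not taken from the thread. As far as I know, PostgreSQL treats a nonzero exit status from the archive command as a failure and retries the segment later, and stderr output ends up in the server log when the logging collector is enabled.

```python
import subprocess
import sys


def run_archive(cmd):
    """Run an archive command given as an argv list.

    Hypothetical helper: reports a failure on stderr instead of discarding
    it, and returns the command's exit status unchanged.
    """
    if not cmd:
        sys.stderr.write("archive wrapper: no command given\n")
        return 2

    # Capture output instead of sending it to /dev/null.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    if result.returncode != 0:
        message = result.stderr.decode(errors="replace").strip()
        # Surface the failure so it is visible to whoever reads the log.
        sys.stderr.write(
            "archive failed (exit %d): %s\n" % (result.returncode, message)
        )

    # A nonzero status must reach the caller so the failure is not masked.
    return result.returncode


# A wrapper script built on this would typically end with:
#     sys.exit(run_archive(sys.argv[1:]))
```

The design choice here is simply that the wrapper never turns a failure into a success: it adds logging but leaves the exit status alone.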