Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG - Mailing list pgsql-admin
From: Mark Kirkwood
Subject: Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG
Msg-id: 45c0bedc-9ce8-b99b-80df-94a1180fbc88@catalyst.net.nz
In response to: Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG (Stephen Frost <sfrost@snowman.net>)
Responses: Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG
List: pgsql-admin
Stephen,

On 03/11/17 00:11, Stephen Frost wrote:
>
> Sure, that'll work much of the time, but that's about like saying that
> PG could run without fsync being enabled much of the time and everything
> will be ok. Both are accurate, but hopefully you'll agree that PG
> really should always be run with fsync enabled.

It is completely different - this is a 'straw man' argument, and just serves to confuse this discussion.

>
>> Also, if what you are suggesting were actually the case, almost
>> everyone's streaming replication (and/or log shipping) would be
>> broken all the time.
>
> No, again, this isn't an argument about if it'll work most of the time
> or not, it's about if it's correct. PG without fsync will work most of
> the time too, but that doesn't mean it's actually correct.

No - it is pointing out that if your argument were correct, then we should be seeing the above side effects. We are not, which is significant.

The crux of your argument seems to concern the synchronization between pg_basebackup finishing and being sure you have the required archive logs. Now, just so we are all clear: when pg_basebackup ends it essentially calls do_pg_stop_backup (from xlog.c), which ensures that all required WAL files are archived - or, to be precise, makes sure archive_command has been run successfully for each required WAL file.

Your entire argument seems to be about whether said WAL is fsync'ed to disk, and how this is supposedly impossible to ensure in a shell script. Actually, it is quite simple. Suppose your archive command is:

    rsync ... targetserver:/disk

There are several ways to get that to sync:

    rsync ... targetserver:/disk && ssh targetserver sync

Alternatively, amend vm.dirty_bytes on targetserver to be < 16MB (one WAL segment), or mount /disk with the sync option! So it is clearly *possible* (see the sketch at the end of this message). However, I think you are obsessing over the minutiae of fsync to a single server/disk when there are much more important (read: likely to happen) problems to consider.

For me, the critical consideration is not 'are the WAL files there *right now*?', but 'will they be there tomorrow when I need them for a restore?'. Next is 'will they be the same/undamaged when I read them tomorrow?'.

This is why I'm *not* obsessing about fsyncing - make where you store these WAL files *reliable*: either via proxying/ip splitting so you send stuff to more than one server (if we are still thinking server + disk = backup solution), or use a distributed object store (Swift, S3, etc.) that handles that for you - and, in addition, checksums and heals any individual node data corruption as well.

>> With respect to 'If I would like to develop etc etc..' - err, all I
>> was doing in this thread was helping the original poster make his
>> stuff a bit better - I'll continue to do that.
>
> Ignoring the basic requirements which I outlined isn't helping him get
> to a reliable backup system.

Actually, I was helping him get a *reliable* backup system; I think you misunderstood how Swift changes the picture compared to a single server/single disk design.

regards

Mark
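P.S. For concreteness, here is a minimal sketch of the kind of archive_command wrapper described above. The host name, paths and script location are hypothetical, and it only covers the single-target case - not the replicated storage I would actually recommend:

    #!/bin/sh
    #
    # Hypothetical archive_command wrapper: copy a WAL segment to an
    # archive host, then force it to stable storage there before
    # reporting success back to the PostgreSQL archiver.
    #
    # In postgresql.conf:
    #   archive_command = '/usr/local/bin/archive_wal.sh %p %f'

    set -e

    WAL_PATH="$1"               # %p: segment path, relative to the data dir
    WAL_FILE="$2"               # %f: segment file name
    TARGET="targetserver"
    ARCHIVE_DIR="/disk/wal_archive"

    # Copy the segment. A non-zero exit (propagated by set -e) makes
    # the archiver keep the segment and retry it later.
    rsync -a "$WAL_PATH" "$TARGET:$ARCHIVE_DIR/$WAL_FILE"

    # Flush the target's dirty pages so the segment survives a crash
    # or power loss on the archive host. sync(1) is coarse; mounting
    # /disk with the sync option or lowering vm.dirty_bytes achieves
    # the same end without the extra ssh round trip.
    ssh "$TARGET" sync

The archiver treats any non-zero exit as failure, so the segment is never considered archived unless both the copy and the flush succeeded.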