Re: Online enabling of checksums - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Online enabling of checksums |
Date | |
Msg-id | 81bfd944-2f44-39f6-a4de-7b3876ac1455@2ndquadrant.com Whole thread Raw |
In response to | Re: Online enabling of checksums (Magnus Hagander <magnus@hagander.net>) |
Responses |
Re: Online enabling of checksums
|
List | pgsql-hackers |
On 03/02/2018 11:01 PM, Magnus Hagander wrote: > On Fri, Mar 2, 2018 at 5:50 PM, Tomas Vondra > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote: > > > > On 03/02/2018 02:35 PM, Magnus Hagander wrote: > > > > > > On Wed, Feb 28, 2018 at 6:06 PM, Robert Haas <robertmhaas@gmail.com <mailto:robertmhaas@gmail.com> > > <mailto:robertmhaas@gmail.com <mailto:robertmhaas@gmail.com>>> wrote: > > > > On Sun, Feb 25, 2018 at 9:54 AM, Magnus Hagander > > <magnus@hagander.net <mailto:magnus@hagander.net> > <mailto:magnus@hagander.net <mailto:magnus@hagander.net>>> wrote: > > > Also if that wasn't clear -- we only do the full page write if there isn't > > > already a checksum on the page and that checksum is correct. > > > > Hmm. > > > > Suppose that on the master there is a checksum on the page and that > > checksum is correct, but on the standby the page contents differ in > > some way that we don't always WAL-log, like as to hint bits, and there > > the checksum is incorrect. Then you'll enable checksums when the > > standby still has some pages without valid checksums, and disaster > > will ensue. > > > > I think this could be hard to prevent if checksums are turned on and > > off multiple times. > > > > > > Do we ever make hintbit changes on the standby for example? If so, it > > would definitely cause problems. I didn't realize we did, actually... > > > > I don't think we do. SetHintBits does TransactionIdIsValid(xid) and > AFAIK that can't be true on a standby. > > > I guess we could get there even if we don't by: > > * All checksums are correct > > * Checkums are disabled (which replicates) > > * Non-WAL logged change on the master, which updates checksum but does > > *not* replicate > > * Checksums re-enabled > > * Worker sees the checksum as correct, and thus does not force a full > > page write. > > * Worker completes and flips checksums on which replicates. At this > > point, if the replica reads the page, boom. > > > > Maybe. > > My understanding of Robert's example is that you can start with an > instance that has wal_log_hints=off, and so pages on master/standby may > not be 100% identical. Then we do the online checksum thing, and the > standby may get pages with incorrect checksums. > > > No, in that case the master will issue full page writes for *all* pages, > since they dind't hvae a checksum. The current patch only avoids doing > that if the checksum on the master is correct, which it isn't when you > start from checksums=off. So this particular problem only shows up if > you iterate between off/on/off multiple times. > Hmmm, OK. So we need to have a valid checksum on a page, disable checksums, set some hint bits on the page (which won't be WAL-logged), enable checksums again and still get a valid checksum even with the new hint bits? That's possible, albeit unlikely. > > > I guess we have to remove that optimisation. It's definitely a > > bummer, but I don't think it's an absolute dealbreaker. > > > > I agree it's not a deal-breaker. Or at least I don't see why it should > be - any other maintenance activity on the database (freezing etc.) will > also generate full-page writes. > > > Yes. > > > > The good thing is the throttling also limits the amount of WAL, so it's > possible to prevent generating too many checkpoints etc. > > I suggest we simply: > > 1) set the checksums to in-progress > 2) wait for a checkpoint > 3) use the regular logic for full-pages (i.e. first change after > checkpoint does a FPW) > > > This is very close to what it does now, except it does not wait for a > checkpoint in #2. Why does it need that? > To guarantee that the page has a FPW with all the hint bits, before we start messing with the checksums (or that setting the checksum itself triggers a FPW). > > BTW speaking of checkpoints, I see ChecksumHelperLauncherMain does > > RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | \ > CHECKPOINT_IMMEDIATE); > > I'm rather unhappy about that - immediate checkpoints have massive > impact on production systems, so we try not doing them (That's one of > the reasons why CREATE DATABASE is somewhat painful). It usually > requires a bit of thinking about when to do such commands. But in this > case it's unpredictable when exactly the checksumming completes, so it > may easily be in the middle of peak activity. > > Why not to simply wait for regular spread checkpoint, the way > pg_basebackup does it? > > > Actually, that was my original idea. I changed it for testing, and shuld > go change it back. > OK > > > > We could say that we keep the optimisation if wal_level=minimal for > > example, because then we know there is no replica. But I doubt > > that's worth it? > > > > If it doesn't require a lot of code, why not? But I don't really see > much point in doing that. > > > Yeah, I doubt there are a lot of people using "minimal" these days, not > since we changed the default. > Yeah. Although as I said, it depends on how much code would be needed to enable that optimization (I guess not much). If someone is running with wal_level=minimal intentionally, why not to help them. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: