Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure - Mailing list pgsql-hackers
From | David Christensen |
---|---|
Subject | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure |
Date | |
Msg-id | 39130985-8BDD-46CD-8CB6-B8B9E90CC3B7@endpoint.com |
In response to | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure (Jim Nasby <Jim.Nasby@BlueTreble.com>) |
Responses | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure |
List | pgsql-hackers |
> On Feb 19, 2017, at 8:14 PM, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>
> On 2/19/17 11:02 AM, David Christensen wrote:
>> My design notes for the patch were submitted to the list with little comment; see: https://www.postgresql.org/message-id/1E6E64E9-634B-43F4-8AA2-CD85AD92D2F8%40endpoint.com
>>
>> I have since added the WAL logging of checksum states, however I’d be glad to take feedback on the other proposed approaches (particularly the system catalog changes + the concept of a checksum cycle).
>
> A couple notes:
>
> - AFAIK unlogged tables get checksummed today if checksums are enabled; the same should hold true if someone enables checksums on the whole cluster.

Agreed; AFAIK this should already be working if it’s using the PageIsVerified() API, since I just effectively modified the logic there, depending on state.

> - Shared relations should be handled as well; you don't mention them.

I agree that they should be handled as well; I thought I had mentioned it later in the design doc, though TBH I’m not sure if there is more involved than just visiting the global relations in pg_class. In addition we need to visit all forks of each relation. I’m not certain if loading those into shared_buffers would be sufficient; I think so, but I’d be glad to be informed otherwise. (As long as they’re already using the PageIsVerified() API call they get this logic for free.)

> - If an entire cluster is going to be considered as checksummed, then even databases that don't allow connections would need to get enabled.

Yeah, the workaround for now would be to require "datallowconn" to be set to 't' for all databases before proceeding, unless there’s a way to connect to those databases internally that bypasses that check. Open to ideas, though for a first pass it seems like the "datallowconn" approach is the least amount of work.

> I like the idea of revalidation, but I'd suggest leaving that off of the first pass.

Yeah, agreed.

> It might be easier on a first pass to look at supporting per-database checksums (in this case, essentially treating shared catalogs as their own database). All normal backends do per-database stuff (such as setting current_database) during startup anyway. That doesn't really help for things like recovery and replication though. :/ And there's still the question of SLRUs (or are those not checksum'd today??).

So you’re suggesting that the data_checksums GUC get set per-database context, so once it’s fully enabled in the specific database it is treated as in the enforcing state, even if the rest of the cluster hasn’t completed? Hmm, might think on that a bit, but it seems pretty straightforward.

What issues do you see affecting replication and recovery specifically (other than the entire cluster not being complete)? Since the checksum changes are WAL-logged, it seems you’d be no worse for the wear on a standby if you had to change things.

> BTW, it occurs to me that this is related to the problem we have with trying to make changes that break page binary compatibility. If we had a method for handling that it would probably be useful for enabling checksums as well. You'd essentially treat an un-checksum'd page as if it was an "old page version". The biggest problem there is dealing with the potential that the new page needs to be larger than the old one was, but maybe there's some useful progress to be had in this area before tackling the "page too small" problem.

I agree it’s very similar; my issue is I don’t want to have to postpone handling a specific case for some future infrastructure.
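To make the "depending on state" piece I mentioned above a bit more concrete, the check in the page-read path is roughly the following shape. The enum, the helper name, and how the state gets looked up are illustrative only, not the literal patch code:

    #include "postgres.h"

    #include "storage/bufpage.h"
    #include "storage/checksum.h"

    /* Hypothetical cluster-wide checksum states; names are illustrative. */
    typedef enum ChecksumState
    {
        CHECKSUM_DISABLED,
        CHECKSUM_ENABLING,          /* checksum cycle still in progress */
        CHECKSUM_ENFORCING
    } ChecksumState;

    /*
     * Sketch of a state-aware verification step: while a checksum cycle is
     * still in progress, a page that fails verification is tolerated (it
     * may simply predate the cycle); once we reach the enforcing state a
     * mismatch is a hard failure, as with initdb-time checksums today.
     */
    static bool
    page_checksum_ok(Page page, BlockNumber blkno, ChecksumState state)
    {
        PageHeader  phdr = (PageHeader) page;

        if (state == CHECKSUM_DISABLED)
            return true;            /* nothing to verify */

        if (pg_checksum_page((char *) page, blkno) == phdr->pd_checksum)
            return true;            /* checksum matches */

        /* mismatch: only fatal once the whole cluster is enforcing */
        return (state != CHECKSUM_ENFORCING);
    }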
That being said, what I could see being a general case is the piece which basically walks all pages in the cluster; as long as the page checksum/format validation happens at Page read time, we could do page version checks/conversions at the same time, and the only special code we’d need is to keep track of which files still need to be visited and how to minimize the impact on the cluster via some sort of throttling/leveling. (It’s also similar to vacuum in that way, however we have been going out of our way to make vacuum smart enough to *avoid* visiting every page, so I think it is a different enough use case that we can’t tie the two systems together, nor do I feel like taking that project on.)

We could call the checksum_cycle something else (page_walk_cycle? bikeshed time!) and basically have the infrastructure to initiate online/gradual conversion/processing of all pages for free.

Thoughts?

David
--
David Christensen
End Point Corporation
david@endpoint.com
785-727-1171
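P.S. For the page-walking piece, here is a very rough sketch of the per-relation loop I’m picturing, with naive throttling. Everything here is illustrative rather than actual patch code (the function is hypothetical, and WAL/full-page-write considerations are glossed over entirely):

    #include "postgres.h"

    #include "miscadmin.h"
    #include "storage/bufmgr.h"
    #include "storage/smgr.h"
    #include "utils/rel.h"

    /*
     * Hypothetical sketch: visit every block of every fork of one relation,
     * letting the normal buffer-read path (and hence PageIsVerified) look
     * at each page, then dirty the buffer so it eventually gets rewritten
     * with a checksum.  Throttling is just a naive sleep every N blocks;
     * real code would want vacuum-style cost accounting and proper WAL
     * handling.
     */
    static void
    checksum_walk_relation(Relation rel, int blocks_per_nap, long nap_usecs)
    {
        ForkNumber  fork;

        RelationOpenSmgr(rel);

        for (fork = MAIN_FORKNUM; fork <= MAX_FORKNUM; fork++)
        {
            BlockNumber nblocks;
            BlockNumber blkno;

            if (!smgrexists(rel->rd_smgr, fork))
                continue;

            nblocks = RelationGetNumberOfBlocksInFork(rel, fork);

            for (blkno = 0; blkno < nblocks; blkno++)
            {
                Buffer      buf;

                CHECK_FOR_INTERRUPTS();

                /* reading the buffer runs the usual page verification */
                buf = ReadBufferExtended(rel, fork, blkno, RBM_NORMAL, NULL);

                LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
                MarkBufferDirty(buf);   /* force a rewrite with a checksum */
                UnlockReleaseBuffer(buf);

                /* crude leveling: nap every blocks_per_nap blocks */
                if (nap_usecs > 0 && blocks_per_nap > 0 &&
                    (blkno + 1) % blocks_per_nap == 0)
                    pg_usleep(nap_usecs);
            }
        }
    }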