Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure - Mailing list pgsql-hackers
From | David Christensen |
---|---|
Subject | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure |
Date | |
Msg-id | 39130985-8BDD-46CD-8CB6-B8B9E90CC3B7@endpoint.com |
In response to | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure (Jim Nasby <Jim.Nasby@BlueTreble.com>) |
Responses | Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure |
List | pgsql-hackers |
> On Feb 19, 2017, at 8:14 PM, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>
> On 2/19/17 11:02 AM, David Christensen wrote:
>> My design notes for the patch were submitted to the list with little comment; see: https://www.postgresql.org/message-id/1E6E64E9-634B-43F4-8AA2-CD85AD92D2F8%40endpoint.com
>>
>> I have since added the WAL logging of checksum states, however I’d be glad to take feedback on the other proposed approaches (particularly the system catalog changes + the concept of a checksum cycle).
>
> A couple notes:
>
> - AFAIK unlogged tables get checksummed today if checksums are enabled; the same should hold true if someone enables checksums on the whole cluster.

Agreed; AFAIK this should already be working if it’s using the PageIsVerified() API, since I just effectively modified the logic there, depending on state.

> - Shared relations should be handled as well; you don't mention them.

I agree that they should be handled as well; I thought I had mentioned it later in the design doc, though TBH I’m not sure if there is more involved than just visiting the global relations in pg_class. In addition we need to visit all forks of each relation. I’m not certain if loading those into shared_buffers would be sufficient; I think so, but I’d be glad to be informed otherwise. (As long as they’re already using the PageIsVerified() API call they get this logic for free.)

> - If an entire cluster is going to be considered as checksummed, then even databases that don't allow connections would need to get enabled.

Yeah, the workaround for now would be to require "datallowconn" to be set to 't' for all databases before proceeding, unless there’s a way to connect to those databases internally that bypasses that check. Open to ideas, though for a first pass it seems like the "datallowconn" approach is the least amount of work.

> I like the idea of revalidation, but I'd suggest leaving that off of the first pass.

Yeah, agreed.

> It might be easier on a first pass to look at supporting per-database checksums (in this case, essentially treating shared catalogs as their own database). All normal backends do per-database stuff (such as setting current_database) during startup anyway. That doesn't really help for things like recovery and replication though. :/ And there's still the question of SLRUs (or are those not checksum'd today??).

So you’re suggesting that the data_checksums GUC get set per-database context, so once it’s fully enabled in the specific database it is treated as in the enforcing state, even if the rest of the cluster hasn’t completed? Hmm, might think on that a bit, but it seems pretty straightforward.

What issues do you see affecting replication and recovery specifically (other than the entire cluster not being complete)? Since the checksum changes are WAL-logged, it seems you’d be no worse for the wear on a standby if you had to change things.

> BTW, it occurs to me that this is related to the problem we have with trying to make changes that break page binary compatibility. If we had a method for handling that it would probably be useful for enabling checksums as well. You'd essentially treat an un-checksum'd page as if it was an "old page version". The biggest problem there is dealing with the potential that the new page needs to be larger than the old one was, but maybe there's some useful progress to be had in this area before tackling the "page too small" problem.

I agree it’s very similar; my issue is I don’t want to have to postpone handling a specific case for some future infrastructure.
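To make the "depending on state" piece I mentioned above a bit more concrete, the check in the page-read path is roughly the following shape. The enum, the helper name, and how the state gets looked up are illustrative only, not the literal patch code:

    #include "postgres.h"

    #include "storage/bufpage.h"
    #include "storage/checksum.h"

    /* Hypothetical cluster-wide checksum states; names are illustrative. */
    typedef enum ChecksumState
    {
        CHECKSUM_DISABLED,
        CHECKSUM_ENABLING,          /* checksum cycle still in progress */
        CHECKSUM_ENFORCING
    } ChecksumState;

    /*
     * Sketch of a state-aware verification step: while a checksum cycle is
     * still in progress, a page that fails verification is tolerated (it
     * may simply predate the cycle); once we reach the enforcing state a
     * mismatch is a hard failure, as with initdb-time checksums today.
     */
    static bool
    page_checksum_ok(Page page, BlockNumber blkno, ChecksumState state)
    {
        PageHeader  phdr = (PageHeader) page;

        if (state == CHECKSUM_DISABLED)
            return true;            /* nothing to verify */

        if (pg_checksum_page((char *) page, blkno) == phdr->pd_checksum)
            return true;            /* checksum matches */

        /* mismatch: only fatal once the whole cluster is enforcing */
        return (state != CHECKSUM_ENFORCING);
    }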
That being said, what I could see being a general case is the piece which basically walks all pages in the cluster; as long as the page checksum/format validation happens at Page read time, we could do page version checks/conversions at the same time, and the only special code we’d need is to keep track of which files still need to be visited and how to minimize the impact on the cluster via some sort of throttling/leveling. (It’s also similar to vacuum in that way, however we have been going out of our way to make vacuum smart enough to *avoid* visiting every page, so I think it is a different enough use case that we can’t tie the two systems together, nor do I feel like taking that project on.)

We could call the checksum_cycle something else (page_walk_cycle? bikeshed time!) and basically have the infrastructure to initiate online/gradual conversion/processing of all pages for free.

Thoughts?

David
--
David Christensen
End Point Corporation
david@endpoint.com
785-727-1171
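P.S. For the page-walking piece, here is a very rough sketch of the per-relation loop I’m picturing, with naive throttling. Everything here is illustrative rather than actual patch code (the function is hypothetical, and WAL/full-page-write considerations are glossed over entirely):

    #include "postgres.h"

    #include "miscadmin.h"
    #include "storage/bufmgr.h"
    #include "storage/smgr.h"
    #include "utils/rel.h"

    /*
     * Hypothetical sketch: visit every block of every fork of one relation,
     * letting the normal buffer-read path (and hence PageIsVerified) look
     * at each page, then dirty the buffer so it eventually gets rewritten
     * with a checksum.  Throttling is just a naive sleep every N blocks;
     * real code would want vacuum-style cost accounting and proper WAL
     * handling.
     */
    static void
    checksum_walk_relation(Relation rel, int blocks_per_nap, long nap_usecs)
    {
        ForkNumber  fork;

        RelationOpenSmgr(rel);

        for (fork = MAIN_FORKNUM; fork <= MAX_FORKNUM; fork++)
        {
            BlockNumber nblocks;
            BlockNumber blkno;

            if (!smgrexists(rel->rd_smgr, fork))
                continue;

            nblocks = RelationGetNumberOfBlocksInFork(rel, fork);

            for (blkno = 0; blkno < nblocks; blkno++)
            {
                Buffer      buf;

                CHECK_FOR_INTERRUPTS();

                /* reading the buffer runs the usual page verification */
                buf = ReadBufferExtended(rel, fork, blkno, RBM_NORMAL, NULL);

                LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
                MarkBufferDirty(buf);   /* force a rewrite with a checksum */
                UnlockReleaseBuffer(buf);

                /* crude leveling: nap every blocks_per_nap blocks */
                if (nap_usecs > 0 && blocks_per_nap > 0 &&
                    (blkno + 1) % blocks_per_nap == 0)
                    pg_usleep(nap_usecs);
            }
        }
    }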