Checksums, state of play - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Checksums, state of play |
Date | |
Msg-id | CA+U5nMJYJXzFiTBwaa94W5WWymWBCXThotRNZUMGs_cR+Gt6zw@mail.gmail.com |
Responses | Re: Checksums, state of play |
List | pgsql-hackers |
To avoid any confusion as to where this proposed feature is now, I'd like to summarise my understanding, make proposals and also request clear feedback on them.

Checksums have a number of outstanding objections.

1. We don't need them because there will be something better in a later release. I don't think anybody disagrees that a better solution is possible in the future; doubts have been expressed as to what will be required and when that is likely to happen. Opinions differ. We can and should do something now unless there is reason not to.

2. Turning checksums on/off/on/off in rapid succession can cause false positive reports of checksum failure if crashes occur and are ignored. That may bring the feature and PostgreSQL into disrepute. This can be resolved, if desired, by having a two-stage enabling process where we must issue a command that scans every block in the database before checksum checking begins. VACUUM is easily modified to the task; we just need to agree that it is suitable and agree on syntax. A suggestion is VACUUM ENABLE CHECKSUMS; others are possible.

3. Pages with checksums set need to have a version marking to show that they are a later version of the page layout. That version number needs to be extensible to many later versions. Pages of multiple versions need to exist within the server to allow simple upgrades and migration.

4. Checksums that are dependent upon a bit setting on the block are somewhat fragile. Requests have been made to add bits in certain positions and also to remove them again. No set of bits seems to please everyone.

(3) and (4) are in conflict with each other, but there is a solution. We mark the block with a version number, but we don't make the checking dependent upon the version number. We simply avoid making any checks until the command to scan all blocks is complete, per point (2). That way we need to use 1 flag bit to mark the new version and zero flag bits to indicate checks should happen.
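To make the combined proposal for (2), (3) and (4) concrete, here is a minimal sketch in Python. It is an illustration only, not the patch. The names `Cluster`, `Page`, `ChecksumState` and `FLAG_HAS_CHECKSUM` are hypothetical, and the truncated CRC-32 merely stands in for whatever 16-bit checksum is eventually chosen. The point it shows is that verification depends only on a server-wide state that is set once the full scan completes, never on a bit stored in the block itself.

```python
import zlib
from enum import Enum

FLAG_HAS_CHECKSUM = 0x01  # hypothetical: the single flag bit marking the new page version


class ChecksumState(Enum):
    OFF = "off"            # no checksums written, none checked
    ENABLING = "enabling"  # stage 1: scan in progress, checksums written but not checked
    ON = "on"              # stage 2: every block has been scanned, checks are active


def checksum16(payload: bytes) -> int:
    # Assumption: a CRC-32 truncated to 16 bits, used here only as a stand-in.
    return zlib.crc32(payload) & 0xFFFF


class Page:
    def __init__(self, payload: bytes):
        self.payload = payload
        self.flags = 0
        self.checksum = 0

    def stamp(self) -> None:
        # Mark the page as the new version and store its checksum.
        self.flags |= FLAG_HAS_CHECKSUM
        self.checksum = checksum16(self.payload)


class Cluster:
    def __init__(self, pages):
        self.pages = pages
        self.state = ChecksumState.OFF

    def write_page(self, page: Page, payload: bytes) -> None:
        page.payload = payload
        if self.state is not ChecksumState.OFF:
            page.stamp()

    def enable_checksums(self) -> None:
        # Stand-in for the suggested VACUUM ENABLE CHECKSUMS: visit every block first.
        self.state = ChecksumState.ENABLING
        for page in self.pages:
            page.stamp()
        # Only after the whole scan succeeds does checking begin.
        self.state = ChecksumState.ON

    def read_page(self, page: Page) -> bytes:
        # The decision to check uses zero bits from the block itself.
        if self.state is ChecksumState.ON:
            if page.checksum != checksum16(page.payload):
                raise ValueError("checksum failure")
        return page.payload
```

In this sketch, a crash in the middle of `enable_checksums` leaves the state at `ENABLING`, so no block is checked and no false positive can be reported; the scan simply has to be run again.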
(Various other permutations of solutions for (2), (3), (4) have been discussed and may also still be open.)

5. The part of the page header that can be used as a checksum has been disputed. The 16 bits dedicated to a version number seem to be the least useful consecutive 2 bytes of data in the page header, which makes them the obvious candidate. The checksum can't be fewer than 16 bits because that wouldn't be an effective checksum for database blocks. We might prefer 32 bits, but that would require use of some other parts of the page header and possibly splitting the checksum into two parts. Splitting the checksum into 2 parts will make the code more complex and fragile.

6. Performance impacts. Measured to be a small regression.

7. Hint bit setting requires WAL logging. The worst case for that would be setting hints on newly loaded tables. Work has been done on other patches to remove that case. If those don't fly, this would be a cost paid by those that wish to take advantage of this feature.

If there are other points I've missed for whatever reason, please add them here again for clarity.

My own assessment of the above is that the checksum feature can be added to 9.2, as long as we agree the changes above, then proceed to implement them, and no further serious problems emerge.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services