Re: Online enabling of checksums - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Online enabling of checksums
Date	September 30, 2018 11:48:36
Msg-id	e78bb22b-3f22-9f17-b9d8-7d76829cee43@2ndquadrant.com
In response to	Re: Online enabling of checksums (Stephen Frost <sfrost@snowman.net>)
Responses	Re: Online enabling of checksums
List	pgsql-hackers


On 09/29/2018 06:51 PM, Stephen Frost wrote:
> Greetings,
> 
> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>> On 09/29/2018 02:19 PM, Stephen Frost wrote:
>>> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>>>> While looking at the online checksum verification patch (which I guess
>>>> will get committed before this one), it occurred to me that disabling
>>>> checksums may need to be more elaborate, to protect against someone
>>>> using the stale flag value (instead of simply switching to "off"
>>>> assuming that's fine).
>>>>
>>>> The signals etc. seem good enough for our internal stuff, but what if
>>>> someone uses the flag in a different way? E.g. the online checksum
>>>> verification runs as an independent process (i.e. not a backend) and
>>>> reads the control file to find out if the checksums are enabled or not.
>>>> So if we just switch from "on" to "off" that will break.
>>>>
>>>> Of course, we may also say "Don't disable checksums while online
>>>> verification is running!" but that's not ideal.
>>>
>>> I'm not really sure what else we could say here..?  I don't particularly
>>> see an issue with telling people that if they disable checksums while
>>> they're running a tool that's checking the checksums that they're going
>>> to get odd results.
>>
>> I don't know, to be honest. I was merely looking at the online
>> verification patch and realized that if someone disables checksums it
>> won't notice it (because it only reads the flag once, at the very
>> beginning) and will likely produce bogus errors.
>>
>> Although, maybe it won't - it now uses a checkpoint LSN, so that might
>> fix it. The checkpoint LSN is read from the same controlfile as the
>> flag, so we know the checksums were enabled during that checkpoint. So
>> if we ignore failures with a newer LSN, that should do the trick, no?
>>
>> So perhaps that's the right "protocol" to handle this?
> 
> I certainly don't think we need to do anything more.
> 

Not sure I agree. I'm not suggesting we absolutely have to write a huge
amount of code to deal with this issue, but I hope we agree we need to
at least understand the issue so that we can put warnings into the docs.

FWIW pg_basebackup (in the default "verify checksums" mode) has this
issue too AFAICS, and it seems rather unfriendly to just start reporting
checksum errors during a backup in that case.

But as I mentioned, maybe there's no problem at all and using the
checkpoint LSN deals with it automatically.
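
To make the checkpoint LSN idea concrete, here is a minimal sketch of the
rule in Python. It is not PostgreSQL code: ControlFileSnapshot, Page and
classify are hypothetical names, and LSNs are modelled as plain integers.
It assumes the tool reads the flag and the checkpoint LSN from the control
file once, at startup, and that each page carries its own LSN.

```python
from dataclasses import dataclass


@dataclass
class ControlFileSnapshot:
    """Hypothetical one-time read of the control file at tool startup."""
    checksums_enabled: bool  # flag value at the time of the read
    checkpoint_lsn: int      # checkpoint LSN read from the same control file


@dataclass
class Page:
    """Hypothetical view of a data page as seen by the verifier."""
    lsn: int           # LSN stored in the page header
    checksum_ok: bool  # whether the stored checksum matched


def classify(page: Page, snap: ControlFileSnapshot) -> str:
    """Decide how to report one page under the checkpoint-LSN rule."""
    if not snap.checksums_enabled:
        # Checksums were off when the tool started; nothing to verify.
        return "skipped"
    if page.checksum_ok:
        return "ok"
    if page.lsn > snap.checkpoint_lsn:
        # Page was modified after the checkpoint we know about, so the
        # flag may have changed since; do not report this as corruption.
        return "ignored"
    # Page is no newer than the checkpoint, when checksums were known on.
    return "failure"
```

With a snapshot taken at checkpoint LSN 1000, a failing page with LSN 900
is reported as a failure, while a failing page with LSN 1500 is ignored.
One caveat of this sketch: a page whose header is itself damaged may carry
a bogus LSN, so the rule is not airtight.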


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
