Re: Online checksums verification in the backend - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Online checksums verification in the backend |
Date | |
Msg-id | CA+fd4k5_gxSrcjXoxBk_47vB+xYGKzvcLGtwaLCbcBmW3v5oGw@mail.gmail.com Whole thread Raw |
In response to | Re: Online checksums verification in the backend (Julien Rouhaud <rjuju123@gmail.com>) |
Responses |
Re: Online checksums verification in the backend
|
List | pgsql-hackers |
On Wed, 11 Mar 2020 at 16:18, Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Tue, Dec 10, 2019 at 11:12:34AM +0100, Julien Rouhaud wrote: > > On Tue, Dec 10, 2019 at 3:26 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Mon, Dec 09, 2019 at 07:02:43PM +0100, Julien Rouhaud wrote: > > > > On Mon, Dec 9, 2019 at 5:21 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > >> Some people might prefer notices, because you can get those while the > > > >> thing is still running, rather than a result set, which you will only > > > >> see when the query finishes. Other people might prefer an SRF, because > > > >> they want to have the data in structured form so that they can > > > >> postprocess it. Not sure what you mean by "more globally." > > > > > > > > I meant having the results available system-wide, not only to the > > > > caller. I think that emitting a log/notice level should always be > > > > done on top on whatever other communication facility we're using. > > > > > > The problem of notice and logs is that they tend to be ignored. Now I > > > don't see no problems either in adding something into the logs which > > > can be found later on for parsing on top of a SRF returned by the > > > caller which includes all the corruption details, say with pgbadger > > > or your friendly neighborhood grep. I think that any backend function > > > should also make sure to call pgstat_report_checksum_failure() to > > > report a report visible at database-level in the catalogs, so as it is > > > possible to use that as a cheap high-level warning. The details of > > > the failures could always be dug from the logs or the result of the > > > function itself after finding out that something is wrong in > > > pg_stat_database. > > > > I agree that adding extra information in the logs and calling > > pgstat_report_checksum_failure is a must do, and I changed that > > locally. However, I doubt that the logs is the right place to find > > the details of corrupted blocks. There's no guarantee that the file > > will be accessible to the DBA, nor that the content won't get > > truncated by the time it's needed. I really think that corruption is > > important enough to justify more specific location. > > > The cfbot reported a build failure, so here's a rebased v2 which also contains > the pg_stat_report_failure() call and extra log info. In addition to comments from Michael-san, here are my comments: 1. + if (!is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_SERVER_FILES)) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("only superuser or a member of the pg_read_server_files role may use this function"))); + + if (!DataChecksumsEnabled()) + elog(ERROR, "Data checksums are not enabled"); I think it's better to reverse the order of the above checks. 2. +#define CRF_COLS 5 /* Number of output arguments in the SRF */ Should it be SRF_COLS? 3. +static void +check_delay_point(void) +{ + /* Always check for interrupts */ + CHECK_FOR_INTERRUPTS(); + + /* Nap if appropriate */ + if (!InterruptPending && VacuumCostBalance >= VacuumCostLimit) + { + int msec; + + msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit; + if (msec > VacuumCostDelay * 4) + msec = VacuumCostDelay * 4; + + pg_usleep(msec * 1000L); + + VacuumCostBalance = 0; + + /* Might have gotten an interrupt while sleeping */ + CHECK_FOR_INTERRUPTS(); + } +} Even if we use vacuum delay for this function, I think we need to set VacuumDelayActive and return if it's false, or it's better to just return if VacuumCostDelay == 0. 4. +static void +check_all_relations(TupleDesc tupdesc, Tuplestorestate *tupstore, + ForkNumber forknum) I also agree with Michael-san to remove this function. Instead we can check all relations by: select pg_check_relation(oid) from pg_class; 6. Other typos s/dirted/dirtied/ s/explictly/explicitly/ Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: