Re: WAL consistency check facility - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: WAL consistency check facility |
Date | |
Msg-id | CAB7nPqQTLGvn_XePjS07kZMMw46kS6S7LfsTocK+gLpTN0bcZw@mail.gmail.com Whole thread Raw |
In response to | Re: WAL consistency check facility (Kuntal Ghosh <kuntalghosh.2007@gmail.com>) |
Responses |
Re: WAL consistency check facility
|
List | pgsql-hackers |
On Wed, Nov 2, 2016 at 11:30 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > On Wed, Nov 2, 2016 at 1:11 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: >> Rest of the suggestions are well-taken. I'll update the patch accordingly. > I've updated the last submitted patch(v10) with the following changes: > - added a block level flag BKPIMAGE_APPLY to distinguish backup image > blocks which needs to be restored during replay. > - at present, hash index operations are not WAL-logged. Hence, I've removed > the consistency check option for hash indices. It can be added later. Both make sense. > - Another suggestion was to remove wal_consistency from PostgresNode.pm > because small buildfarm machines may suffer on it. Although I've no > experience in this matter, I would like to be certain that nothings breaks > in recovery tests after some modifications. An extra idea that I have here would be to extend the TAP tests to accept an environment variable that would be used to specify extra options when starting Postgres instances. Buildfarm machines could use it. + /* + * Remember that, if WAL consistency check is enabled for the current rmid, + * we always include backup image with the WAL record. But, during redo we + * restore the backup block only if needs_backup is set. + */ + if (needs_backup) + bimg.bimg_info |= BKPIMAGE_APPLY; + + You should be careful about extra newlines and noise in the code. - /* If it's a full-page image, restore it. */ - if (XLogRecHasBlockImage(record, block_id)) + /* If full-page image should be restored, do it. */ + if (XLogRecBlockImageApply(record, block_id)) Hm. It seems to me that this modification is incorrect. If the page does not need to be applied, aka if it needs to be used for consistency checks, what should be done is more something like the following in XLogReadBufferForRedoExtended: if (Apply(record, block_id)) return; if (HasImage) { //do what needs to be done with an image } else { //do default things } XLogRecBlockImageApply() should only check for BKP_APPLY and not imply HasImage(). This will be more flexible when for example using it for assertions. With this patch the comments on top of XLogReadBufferForRedo are wrong. A block is not unconditionally applied. +#define XLogRecBlockImageApply(decoder, block_id) \ + (XLogRecHasBlockImage(decoder, block_id) && \ + (((decoder)->blocks[block_id].bimg_info & BKPIMAGE_APPLY) > 0)) Maybe != 0? That's the most common practice in the code. It would be more consistent to have a boolean flag to treat BKPIMAGE_APPLY in xlogreader.c. Have a look at has_image for example. This will as well reduce dependencies on the header xlog_record.h + /* + * For a speculative tuple, the content of t_ctid is conflicting + * between the backup page and current page. Hence, we set it + * to the current block number and current offset. + */ + if (HeapTupleHeaderIsSpeculative(page_htup)) + ItemPointerSet(&page_htup->t_ctid, blkno, off); In the set of masking functions this is the only portion of the code depending on blkno. But isn't that actually a bug in speculative inserts? Andres (added now in CC), Peter, could you provide some input regarding that? The masking functions should not prevent the detection of future errors, and this code is making me uneasy. -- Michael
pgsql-hackers by date: