Re: new heapcheck contrib module - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: new heapcheck contrib module |
Date | |
Msg-id | CA+TgmoZT3kCOaKq6=m5wHGKh_9XQDbCC1LZXL6w_+NMB-n1+mw@mail.gmail.com Whole thread Raw |
In response to | Re: new heapcheck contrib module (Mark Dilger <mark.dilger@enterprisedb.com>) |
Responses |
Re: new heapcheck contrib module
Re: new heapcheck contrib module Re: new heapcheck contrib module |
List | pgsql-hackers |
On Mon, Jul 27, 2020 at 1:02 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > Not at all! I appreciate all the reviews. Reviewing 0002, reading through verify_heapam.c: +typedef enum SkipPages +{ + SKIP_ALL_FROZEN_PAGES, + SKIP_ALL_VISIBLE_PAGES, + SKIP_PAGES_NONE +} SkipPages; This looks inconsistent. Maybe just start them all with SKIP_PAGES_. + if (PG_ARGISNULL(0)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("missing required parameter for 'rel'"))); This doesn't look much like other error messages in the code. Do something like git grep -A4 PG_ARGISNULL | grep -A3 ereport and study the comparables. + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("unrecognized parameter for 'skip': %s", skip), + errhint("please choose from 'all-visible', 'all-frozen', or 'none'"))); Same problem. Check pg_prewarm's handling of the prewarm type, or EXPLAIN's handling of the FORMAT option, or similar examples. Read the message style guidelines concerning punctuation of hint and detail messages. + * Bugs in pg_upgrade are reported (see commands/vacuum.c circa line 1572) + * to have sometimes rendered the oldest xid value for a database invalid. + * It seems unwise to report rows as corrupt for failing to be newer than + * a value which itself may be corrupt. We instead use the oldest xid for + * the entire cluster, which must be at least as old as the oldest xid for + * our database. This kind of reference to another comment will not age well; line numbers and files change a lot. But I think the right thing to do here is just rely on relfrozenxid and relminmxid. If the table is inconsistent with those, then something needs fixing. datfrozenxid and the cluster-wide value can look out for themselves. The corruption detector shouldn't be trying to work around any bugs in setting relfrozenxid itself; such problems are arguably precisely what we're here to find. +/* + * confess + * + * Return a message about corruption, including information + * about where in the relation the corruption was found. + * + * The msg argument is pfree'd by this function. + */ +static void +confess(HeapCheckContext *ctx, char *msg) Contrary to what the comments say, the function doesn't return a message about corruption or anything else. It returns void. I don't really like the name, either. I get that it's probably inspired by Perl, but I think it should be given a less-clever name like report_corruption() or something. + * corrupted table from using workmem worth of memory building up the This kind of thing destroys grep-ability. If you're going to refer to work_mem, you gotta spell it the same way we do everywhere else. + * Helper function to construct the TupleDesc needed by verify_heapam. Instead of saying it's the TupleDesc somebody needs, how about saying that it's the TupleDesc that we'll use to report problems that we find while scanning the heap, or something like that? + * Given a TransactionId, attempt to interpret it as a valid + * FullTransactionId, neither in the future nor overlong in + * the past. Stores the inferred FullTransactionId in *fxid. It really doesn't, because there's no such thing as 'fxid' referenced anywhere here. You should really make the effort to proofread your patches before posting, and adjust comments and so on as you go. Otherwise reviewing takes longer, and if you keep introducing new stuff like this as you fix other stuff, you can fail to ever produce a committable patch. + * Determine whether tuples are visible for verification. Similar to + * HeapTupleSatisfiesVacuum, but with critical differences. Not accurate, because it also reports problems, which is not mentioned anywhere in the function header comment that purports to be a detailed description of what the function does. + else if (TransactionIdIsCurrentTransactionId(raw_xmin)) + return true; /* insert or delete in progress */ + else if (TransactionIdIsInProgress(raw_xmin)) + return true; /* HEAPTUPLE_INSERT_IN_PROGRESS */ + else if (!TransactionIdDidCommit(raw_xmin)) + { + return false; /* HEAPTUPLE_DEAD */ + } One of these cases is not punctuated like the others. + pstrdup("heap tuple with XMAX_IS_MULTI is neither LOCKED_ONLY nor has a valid xmax")); 1. I don't think that's very grammatical. 2. Why abbreviate HEAP_XMAX_IS_MULTI to XMAX_IS_MULTI and HEAP_XMAX_IS_LOCKED_ONLY to LOCKED_ONLY? I don't even think you should be referencing C constant names here at all, and if you are I don't think you should abbreviate, and if you do abbreviate I don't think you should omit different numbers of words depending on which constant it is. I wonder what the intended division of responsibility is here, exactly. It seems like you've ended up with some sanity checks in check_tuple() before tuple_is_visible() is called, and others in tuple_is_visible() proper. As far as I can see the comments don't really discuss the logic behind the split, but there's clearly a close relationship between the two sets of checks, even to the point where you have "heap tuple with XMAX_IS_MULTI is neither LOCKED_ONLY nor has a valid xmax" in tuple_is_visible() and "tuple xmax marked incompatibly as keys updated and locked only" in check_tuple(). Now, those are not the same check, but they seem like closely related things, so it's not ideal that they happen in different functions with differently-formatted messages to report problems and no explanation of why it's different. I think it might make sense here to see whether you could either move more stuff out of tuple_is_visible(), so that it really just checks whether the tuple is visible, or move more stuff into it, so that it has the job not only of checking whether we should continue with checks on the tuple contents but also complaining about any other visibility problems. Or if neither of those make sense then there should be a stronger attempt to rationalize in the comments what checks are going where and for what reason, and also a stronger attempt to rationalize the message wording. + curchunk = DatumGetInt32(fastgetattr(toasttup, 2, + ctx->toast_rel->rd_att, &isnull)); Should we be worrying about the possibility of fastgetattr crapping out if the TOAST tuple is corrupted? + if (ctx->tuphdr->t_hoff + ctx->offset > ctx->lp_len) + { + confess(ctx, + psprintf("tuple attribute should start at offset %u, but tuple length is only %u", + ctx->tuphdr->t_hoff + ctx->offset, ctx->lp_len)); + return false; + } + + /* Skip null values */ + if (infomask & HEAP_HASNULL && att_isnull(ctx->attnum, ctx->tuphdr->t_bits)) + return true; + + /* Skip non-varlena values, but update offset first */ + if (thisatt->attlen != -1) + { + ctx->offset = att_align_nominal(ctx->offset, thisatt->attalign); + ctx->offset = att_addlength_pointer(ctx->offset, thisatt->attlen, + tp + ctx->offset); + return true; + } This looks like it's not going to complain about a fixed-length attribute that overruns the tuple length. There's code further down that handles that case for a varlena attribute, but there's nothing comparable for the fixed-length case. + confess(ctx, + psprintf("%s toast at offset %u is unexpected", + va_tag == VARTAG_INDIRECT ? "indirect" : + va_tag == VARTAG_EXPANDED_RO ? "expanded" : + va_tag == VARTAG_EXPANDED_RW ? "expanded" : + "unexpected", + ctx->tuphdr->t_hoff + ctx->offset)); I suggest "unexpected TOAST tag %d", without trying to convert to a string. Such a conversion will likely fail in the case of genuine corruption, and isn't meaningful even if it works. Again, let's try to standardize terminology here: most of the messages in this function are now of the form "tuple attribute %d has some problem" or "attribute %d has some problem", but some have neither. Since we're separately returning attnum I don't see why it should be in the message, and if we weren't separately returning attnum then it ought to be in the message the same way all the time, rather than sometimes writing "attribute" and other times "tuple attribute". + /* Check relminmxid against mxid, if any */ + xmax = HeapTupleHeaderGetRawXmax(ctx->tuphdr); + if (infomask & HEAP_XMAX_IS_MULTI && + MultiXactIdPrecedes(xmax, ctx->relminmxid)) + { + confess(ctx, + psprintf("tuple xmax %u precedes relminmxid %u", + xmax, ctx->relminmxid)); + fatal = true; + } There are checks that an XID is neither too old nor too new, and presumably something similar could be done for MultiXactIds, but here you only check one end of the range. Seems like you should check both. + /* Check xmin against relfrozenxid */ + xmin = HeapTupleHeaderGetXmin(ctx->tuphdr); + if (TransactionIdIsNormal(ctx->relfrozenxid) && + TransactionIdIsNormal(xmin)) + { + if (TransactionIdPrecedes(xmin, ctx->relfrozenxid)) + { + confess(ctx, + psprintf("tuple xmin %u precedes relfrozenxid %u", + xmin, ctx->relfrozenxid)); + fatal = true; + } + else if (!xid_valid_in_rel(xmin, ctx)) + { + confess(ctx, + psprintf("tuple xmin %u follows last assigned xid %u", + xmin, ctx->next_valid_xid)); + fatal = true; + } + } Here you do check both ends of the range, but the comment claims otherwise. Again, please proof-read for this kind of stuff. + /* Check xmax against relfrozenxid */ Ditto here. + psprintf("tuple's header size is %u bytes which is less than the %u byte minimum valid header size", I suggest: tuple data begins at byte %u, but the tuple header must be at least %u bytes + psprintf("tuple's %u byte header size exceeds the %u byte length of the entire tuple", I suggest: tuple data begins at byte %u, but the entire tuple length is only %u bytes + psprintf("tuple's user data offset %u not maximally aligned to %u", I suggest: tuple data begins at byte %u, but that is not maximally aligned Or: tuple data begins at byte %u, which is not a multiple of %u That makes the messages look much more similar to each other grammatically and is more consistent about calling things by the same names. + psprintf("tuple with null values has user data offset %u rather than the expected offset %u", + psprintf("tuple without null values has user data offset %u rather than the expected offset %u", I suggest merging these: tuple data offset %u, but expected offset %u (%u attributes, %s) where %s is either "has nulls" or "no nulls" In fact, aren't several of the above checks redundant with this one? Like, why check for a value less than SizeofHeapTupleHeader or that's not properly aligned first? Just check this straightaway and call it good. + * If we get this far, the tuple is visible to us, so it must not be + * incompatible with our relDesc. The natts field could be legitimately + * shorter than rel's natts, but it cannot be longer than rel's natts. This is yet another case where you didn't update the comments. tuple_is_visible() now checks whether the tuple is visible to anyone, not whether it's visible to us, but the comment doesn't agree. In some sense I think this comment is redundant with the previous one anyway, because that one already talks about the tuple being visible. Maybe just write: The tuple is visible, so it must be compatible with the current version of the relation descriptor. It might have fewer columns than are present in the relation descriptor, but it cannot have more. + psprintf("tuple has %u attributes in relation with only %u attributes", + ctx->natts, + RelationGetDescr(ctx->rel)->natts)); I suggest: tuple has %u attributes, but relation has only %u attributes + /* + * Iterate over the attributes looking for broken toast values. This + * roughly follows the logic of heap_deform_tuple, except that it doesn't + * bother building up isnull[] and values[] arrays, since nobody wants + * them, and it unrolls anything that might trip over an Assert when + * processing corrupt data. + */ + ctx->offset = 0; + for (ctx->attnum = 0; ctx->attnum < ctx->natts; ctx->attnum++) + { + if (!check_tuple_attribute(ctx)) + break; + } I think this comment is too wordy. This text belongs in the header comment of check_tuple_attribute(), not at the place where it gets called. Otherwise, as you update what check_tuple_attribute() does, you have to remember to come find this comment and fix it to match, and you might forget to do that. In fact... looks like that already happened, because check_tuple_attribute() now checks more than broken TOAST attributes. Seems like you could just simplify this down to something like "Now check each attribute." Also, you could lose the extra braces. - bt_index_check | relname | relpages + bt_index_check | relname | relpages Don't include unrelated changes in the patch. I'm not really sure that the list of fields you're displaying for each reported problem really makes sense. I think the theory here should be that we want to report the information that the user needs to localize the problem but not everything that they could find out from inspecting the page, and not things that are too specific to particular classes of errors. So I would vote for keeping blkno, offnum, and attnum, but I would lose lp_flags, lp_len, and chunk. lp_off feels like it's a more arguable case: technically, it's a locator for the problem, because it gives you the byte offset within the page, but normally we reference tuples by TID, i.e. (blkno, offset), not byte offset. On balance I'd be inclined to omit it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: