Re: Testbed for predtest.c ... and some arguable bugs therein - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Testbed for predtest.c ... and some arguable bugs therein |
Date | |
Msg-id | 2629.1520551499@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Testbed for predtest.c ... and some arguable bugs therein (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Testbed for predtest.c ... and some arguable bugs therein
|
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Mar 8, 2018 at 12:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A bit of hacking later, I have the attached. The set of test cases it >> includes at the moment were mostly developed with an eye to getting to >> full code coverage of predtest.c, but we could add more later. What's >> really interesting is that it proves that the "weak refutation" logic, >> i.e. predicate_refuted_by() with clause_is_check = true, is not >> self-consistent. > Oops. After further navel-gazing, I've concluded that my initial opinion of the "correct" semantics for weak refutation was wrong. When I revise the test logic to follow my revised understanding, it no longer claims that any proofs are mistaken, although there are still things that could be proven and aren't being. Let me summarize my thoughts here for the record. For predicate_implied_by, we have two plausible definitions of implication: * "strong" form: A implies B if truth of A implies truth of B. We need this to prove that a row satisfying one WHERE clause or index predicate must satisfy another one. * "weak" form: A implies B if non-falsity of A implies non-falsity of B. We need this to prove that a row satisfying one CHECK constraint must satisfy another one. We could conceivably use a third proof rule that tries to prove non-falsity of B from truth of A, but AFAICS there are no places in the current code that need exactly that behavior, and the "strong" rule seems like an adequate substitute unless such a need becomes mainstream. (There aren't that many cases where we could prove something per this rule but not per the strong rule, anyway.) The fourth possible proof rule is to prove truth of B from non-falsity of A, but the number of cases where you can make such a proof is so small that this option isn't actually interesting in practice. (For instance, you can't even conclude "X implies X" with such a rule.) On the predicate_refuted_by side, I was confused about how to choose among some similar options: * R1: A refutes B if truth of A implies falsity of B. * R2: A refutes B if truth of A implies non-truth of B. * R3: A refutes B if non-falsity of A implies non-truth of B. * R4: A refutes B if non-falsity of A implies falsity of B. We can discard R4, again because there are too few cases where we could prove anything per that rule, eg NOT X wouldn't refute X or vice versa. R1 is the traditional behavior of predicate_refuted_by, and as its comment says, is needed because we want to disprove a CHECK constraint (ie prove it would be violated) given a WHERE clause. R2 is of use for disproving one WHERE clause given another one. It turns out that of the two extant calls to predicate_refuted_by, one of them would actually like to have that semantics, because it's looking for mutually contradictory WHERE clauses. We've been getting by with R1 for that, but R2 would let us optimize some more cases. R3 could be of use for disproving a WHERE clause given a CHECK constraint, and is the one I'd supposed should be the alternate behavior of predicate_refuted_by. However, we have no actual use cases for that; relation_excluded_by_constraints goes the other way. Moreover, although R3 seems like it'd be the more tractable rule on grounds of symmetry, it's looking to me like R2 is about equivalent for proof purposes. For instance see the code near predtest.c:700, * If A is a strong NOT-clause, A R=> B if B equals A's arg * * We cannot make the stronger conclusion that B is refuted if B * implies A's arg; that would only prove that B is not-TRUE, not * that it's not NULL either. Hence use equal() rather than * predicate_implied_by_recurse(). We could do the latter if we * ever had a need for the weak form of refutation. The previous patch to add weak implication rules missed the opportunity for improvement here. If we are using R1 or R2, then we have the assumption "NOT A is true", allowing us to conclude A is false. Then, if we can prove B implies A under weak implication (not-false B implies not-false A), we have proven B must be false, satisfying the proof rule for R1. But also, if we can prove B implies A under strong implication (true B implies true A), we have proven B is not true, satisfying the proof rule for R2. So in either case there's potential to recurse to predicate_implied_by rather than only being able to handle the tautology "NOT A refutes A". Now, the same logic would go through if we were using R3. We'd be starting with non-falsity of "NOT A", only allowing us to conclude non-truth of A, but a proof of strong B => A would still let us conclude B is not true, satisfying R3. Alternatively, consider if we were starting with A and trying to refute NOT B. Intuitively you'd figure that if you can prove A => B, that oughta be enough to refute NOT B. That certainly works for R1 with strong implication: truth of A implies truth of B implies falsity of NOT B. And it works for R2 with either type of implication: truth of A implies (at least) non-falsity of B which implies non-truth of NOT B. For R3, you'd need weak implication: non-falsity of A implies non-falsity of B implies non-truth of NOT B. Now generally, strong implication is easier to prove than weak implication (cf. existing handling of clause_is_check in predtest.c), so R2 is marginally better in this case. So my initial feeling that we ought to define predicate_refuted_by's alternate behavior as R3 seems wrong: we're better off with R2, which matches present needs better and is roughly equally tractable as far as the actual proof logic goes. I propose to change test_predtest so that what it checks for as "w_r_holds" is R2 not R3, and to improve the documentation in predtest.c about this, and maybe to fix some of the cases where we can do better by adding mutual recursion between the "implied" and "refuted" cases. I feel a little guilty about taking time out of the commitfest to deal with this, but I'd rather get it done while this reasoning is still swapped in. regards, tom lane
pgsql-hackers by date: