Home > mailing lists

Bogus use of canonicalize_qual - Mailing list pgsql-hackers

From	Tom Lane
Subject	Bogus use of canonicalize_qual
Date	March 10, 2018 01:37:49
Msg-id	24475.1520635069@sss.pgh.pa.us Whole thread Raw
Responses	Re: Bogus use of canonicalize_qual
List	pgsql-hackers

Tree view

Whilst fooling about with predtest.c, I noticed a rather embarrassing
error.  Consider the following, rather silly, CHECK constraint:

regression=# create table pp (f1 int);
CREATE TABLE
regression=# create table cc (check (f1 = 1 or f1 = null)) inherits(pp);
CREATE TABLE

Because "f1 = null" reduces to constant NULL, this check constraint is
actually a no-op, because it can never evaluate to FALSE:

regression=# insert into cc values(1);
INSERT 0 1
regression=# insert into cc values(2);
INSERT 0 1
regression=# select * from pp;            
 f1 
----
  1
  2
(2 rows)

But:

regression=# select * from pp where f1 = 2;
 f1 
----
(0 rows)

Huh?  The reason is that the planner is deciding that it can exclude
cc from the plan:

regression=# explain select * from pp where f1 = 2;
                       QUERY PLAN                       
--------------------------------------------------------
 Append  (cost=0.00..0.01 rows=1 width=4)
   ->  Seq Scan on pp  (cost=0.00..0.00 rows=1 width=4)
         Filter: (f1 = 2)
(3 rows)

and that ultimately traces to the part of canonicalize_qual() that throws
away constant-NULL subexpressions of AND/OR clauses.  It's clearly
documented in canonicalize_qual() that it should only be applied to actual
WHERE clauses, where that's a valid optimization.  But there is lots of
code that didn't get that memo and is calling it on CHECK constraints,
allowing the NULL to be thrown away when it should not be.  The particular
culprit here, I think, is get_relation_constraints(), but there's a lot of
similar code elsewhere.

So, what to do?  We have a few choices, none ideal:

1. Just remove that optimization from canonicalize_qual().  This would
result in some inefficiency in badly-written queries, but it might be
acceptable.

2. Run around and remove all the bogus canonicalize_qual() calls.  The
main demerit here is that this'd mean CHECK constraints also don't get the
other effect of canonicalize_qual(), which is:

 * The following code attempts to apply the inverse OR distributive law:
 *        ((A AND B) OR (A AND C))  =>  (A AND (B OR C))
 * That is, locate OR clauses in which every subclause contains an
 * identical term, and pull out the duplicated terms.

This'd possibly make it harder to match WHERE clauses, which do get that
processing, to CHECK clauses which wouldn't.  I think that possibly
predtest.c is smart enough to make proofs even in the face of that, but
I'm not sure.  Another concern is whether external code might still
contain incorrect canonicalize_qual() calls, or whether we might not
accidentally put some back in future.

3. Change canonicalize_qual() to take an additional parameter indicating
whether it's working on a true qual or not.  This might be the best fix
for HEAD, but it doesn't seem very back-patchable.

4. Split canonicalize_qual() into two functions, one for the inverse OR
business and one for NULL removal.  This would result in an additional
tree traversal (and reconstruction) for every WHERE clause, slowing
planning somewhat.

5. Remove NULL-simplification from canonicalize_qual(), but put it
back somewhere later in the planner where we're traversing qual trees
anyway.  I think that it might work to charge the RestrictInfo-building
code with this, though I'm not sure about it.  It seems kind of high
risk for a back-patchable change in any case.

Thoughts?

            regards, tom lane

pgsql-hackers by date:

From: Peter Eisentraut
Date: 10 March 2018, 01:23:48
Subject: Re: PATCH: Unlogged tables re-initialization tests

From: Yura Sokolov
Date: 10 March 2018, 05:11:27
Subject: Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)

Bogus use of canonicalize_qual - Mailing list pgsql-hackers

Previous

Next