"WHERE col NOT IN" yields falsely empty result. - Mailing list pgsql-bugs
From | pgsql-bugs@postgresql.org |
---|---|
Subject | "WHERE col NOT IN" yields falsely empty result. |
Date | |
Msg-id | 200106131235.f5DCZlj28167@hub.org Whole thread Raw |
Responses |
Re: "WHERE col NOT IN" yields falsely empty result.
|
List | pgsql-bugs |
Dr M J Carter (Martin.J.Carter@nottingham.ac.uk) reports a bug with a severity of 2 The lower the number the more severe it is. Short Description "WHERE col NOT IN" yields falsely empty result. Long Description Consider SELECT * FROM bar WHERE col2 NOT IN (SELECT col1 FROM foo) where tables foo and bar overlap but neither is a subset of the other. Since the difference set bar - foo is nonempty, the above should yield one or more rows. However, the presence of a null in foo.col1 (tests 3 and 8 below) yields zero rows, even where the difference set has rows with no null entries (see test 2). Am I missing something obvious? eg in one of the FAQs? Two possible user-level workarounds (using EXCEPT with IN (tests 5 and 9), and using a correlated join (tests 6 and 10)) are shown below, which work as expected if the null is in the difference between the tables. A null in the intersection, as in tests 9 and 10, causes its row to appear in the output; on reflection this is be expected, and can be fixed (tests 11 and 12) once the possibility is recognised. I can't (yet) see how to show null-entry rows in the difference set without also showing nulls from the intersection, but I'm an SQL newbie so this time I *know* I'm missing something. (Disable blather mode.) The tables need not have the same column types; that just makes testing a touch easier, by changing the columns being examined(as in test 4). Apologies for the extensiveness of my tests, of which you say "don't waste your time" in your Web page: I had to convince myself that I wasn't just being stupid. Binaries used are from the stock Debian distributions, not recompiled by me. The older set is on my home box (Debian 2.2, PostgreSQL-6.5.3, built with gcc 2.95.2), and the newer one at work (Deb 2.x (testing): psql yields "PostgreSQL 7.1 on i686-pc-linux-gnu, compiledby GCC 2.95.4"; libc6 is v2.2.3-5; processor AMD K6/2 .... anything else?). Sample Code Feed the following through "psql foo -f foo.sql | less": ---- snip ---- DROP TABLE foo; CREATE TABLE "foo" ( "ipnum" inet, "ipname" text); DROP TABLE bar; CREATE TABLE "bar" ( "ipnum" inet, "ipname" text); COPY "foo" FROM stdin; 192.168.187.10 theory 192.168.187.11 junior 192.168.40.43 granby 192.168.40.48 gotham 192.168.21.16 marian 192.168.20.6 marian 192.168.1.1 jips-gw \. COPY "bar" FROM stdin; 192.168.187.10 theory 192.168.187.11 junior 192.168.40.43 granby 192.168.40.48 gotham 192.168.21.16 marian 192.168.20.6 marian 192.168.3.8 real-gw \. SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo); -- 1: Shows intersection (6 rows). SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo); -- 2: Shows expected single row from bar. INSERT INTO foo (ipnum) VALUES ('192.168.104.42'); -- Add row to foo with null ipname value. SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo); -- 3: Yields empty set. Is this a bug? it's certainly unexpected; -- what I'd expected was the same one-line reply asfor test 2. SELECT * FROM bar WHERE ipnum NOT IN (SELECT ipnum FROM foo); -- 4: Yields expected single row again, by examining other column. SELECT * FROM bar EXCEPT SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo); -- 5: Yields expected row from bar. SELECT * FROM bar WHERE NOT EXISTS ( SELECT * FROM foo WHERE foo.ipname = bar.ipname); -- 6: Inspired by answer 4.23 in general PostgreSQL FAQ: yields -- expected row, as well as being somewhat faster forlarge tables. INSERT INTO bar (ipnum) VALUES ('192.168.104.42'); -- Move row to intersection. SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo); -- 7: Shows non-null rows of intersection (6 rows). SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo); -- 8: Yields empty set. SELECT * FROM bar EXCEPT SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo); -- 9: Yields expected row from bar, plus empty-field row from the -- intersection. SELECT * FROM bar WHERE NOT EXISTS ( SELECT * FROM foo WHERE foo.ipname = bar.ipname); -- 10: Ditto. SELECT * FROM bar WHERE ipname IS NOT NULL EXCEPT SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo); -- 11: Yields expected row from bar only. SELECT * FROM bar WHERE ipname IS NOT NULL AND NOT EXISTS ( SELECT * FROM foo WHERE foo.ipname = bar.ipname); -- 12: Ditto. No file was uploaded with this report
pgsql-bugs by date: