Thread: BUG #6535: LEFT JOIN on large table is altering data
The following bug has been logged on the website: Bug reference: 6535 Logged by: Aren Cambre Email address: aren@arencambre.com PostgreSQL version: 9.1.3 Operating system: Windows 7 x64 Description:=20=20=20=20=20=20=20=20 (Note that this issue is described fully at http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-n= ew-data. A few knowledgeable people have weighed in, and there is no solution identified.) In a nutshell, I have a query LEFT JOINing table A with table B and inserting the data into table A'. A' is a functional equivalent of table A. Table B is actually built from a using a WITH at the top. I'm using it to help add a rank() to the table. Table A has just over 2 million rows. There is a column named citation_id that has no null values. Not only does citation_id have a NOT NULL constraint, I have verified with a query that it has no null values. The LEFT JOIN is pulling table A's citation_id. However, when the joined table is inserted into table A', Postgres throws "null value in column "citation_id" violates not-null constraint" errors. Note that I am running Postgres 32 bit for PostGIS compatibility.
aren@arencambre.com writes: > (Note that this issue is described fully at > http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data. > A few knowledgeable people have weighed in, and there is no solution > identified.) This might be a bug, but you've not provided sufficient information for someone else to reproduce the problem. What would be good is a SQL script that reproduces the error from a standing start (empty database). regards, tom lane
Thanks. While creating that very file, I discovered that 1 row had blanks in every field despite a column having a *NOT NULL* constraint and another column being a* *serial. Removing that column appears to fixed the problem. Something about that column made Postgres unhappy, though. If I ran these queries: SELECT COUNT(*) FROM consistent.master WHERE citation_id IS NOT NULL UNION SELECT COUNT(*) FROM consistent.master UNION SELECT COUNT(*) FROM consistent.master WHERE citation_id IS NULL I got this result: 2085344 2085343 0 Not clear how adding a WHERE clause, whose only practical effect is to reduce the number of rows returned, could cause *more* rows to be returned. That seems buggy to me. Aren On Fri, Mar 16, 2012 at 12:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > aren@arencambre.com writes: > > (Note that this issue is described fully at > > > http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data > . > > A few knowledgeable people have weighed in, and there is no solution > > identified.) > > This might be a bug, but you've not provided sufficient information for > someone else to reproduce the problem. What would be good is a SQL > script that reproduces the error from a standing start (empty database). > > regards, tom lane >
Aren Cambre <aren@arencambre.com> wrote: > SELECT COUNT(*) > FROM consistent.master > WHERE citation_id IS NOT NULL > UNION > SELECT COUNT(*) > FROM consistent.master > UNION > SELECT COUNT(*) > FROM consistent.master > WHERE citation_id IS NULL > > I got this result: > > 2085344 > 2085343 > 0 > > Not clear how adding a WHERE clause, whose only practical effect > is to reduce the number of rows returned, could cause *more* rows > to be returned. That seems buggy to me. Never assume that the rows will be returned in any particular order from a query unless you specify ORDER BY. Assuming, as you seem to be doing, that rows from the left side of a UNION will be output before rows from the right side is not safe. You have no way of knowing which row in a result set like that came from which of the UNIONed SELECTs. In this case your assumption is almost certainly wrong. -Kevin
Kevin, You're absolutely correct; there's no telling what order the planner will use. I did in fact run the queries separately and got the same result that I intimated in the UNIONed queries. I can no longer reproduce, however, because I have since altered the table. So if there is a bug, I may not be much help in nailing it down. The only thing I can help with is this appears to have happened at row # 583847 of just over 2 million rows, per a straight CSV dump of the table. Aren On Sun, Mar 18, 2012 at 3:22 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov > wrote: > Aren Cambre <aren@arencambre.com> wrote: > > > SELECT COUNT(*) > > FROM consistent.master > > WHERE citation_id IS NOT NULL > > UNION > > SELECT COUNT(*) > > FROM consistent.master > > UNION > > SELECT COUNT(*) > > FROM consistent.master > > WHERE citation_id IS NULL > > > > I got this result: > > > > 2085344 > > 2085343 > > 0 > > > > Not clear how adding a WHERE clause, whose only practical effect > > is to reduce the number of rows returned, could cause *more* rows > > to be returned. That seems buggy to me. > > Never assume that the rows will be returned in any particular order > from a query unless you specify ORDER BY. Assuming, as you seem to > be doing, that rows from the left side of a UNION will be output > before rows from the right side is not safe. You have no way of > knowing which row in a result set like that came from which of the > UNIONed SELECTs. In this case your assumption is almost certainly > wrong. > > -Kevin >
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Aren Cambre <aren@arencambre.com> wrote: > >> SELECT COUNT(*) >> FROM consistent.master >> WHERE citation_id IS NOT NULL >> UNION >> SELECT COUNT(*) >> FROM consistent.master >> UNION >> SELECT COUNT(*) >> FROM consistent.master >> WHERE citation_id IS NULL >> >> I got this result: >> >> 2085344 >> 2085343 >> 0 >> >> Not clear how adding a WHERE clause, whose only practical effect >> is to reduce the number of rows returned, could cause *more* rows >> to be returned. > Never assume that the rows will be returned in any particular > order from a query unless you specify ORDER BY. Hmm. That doesn't explain why the numbers don't add up, though. Is that a copy/paste from an actual query run, or was there some hand-editing there? In particular, you might easily get that result if that last line was really: WHERE citation_id = '' instead of the IS NULL test. In the ANSI standard and in PostgreSQL there is a big difference between an empty string and NULL, although there is at least one product I know of which breaks from standard compliance by treating them as equivalent. -Kevin
Kevin, Thanks. That was a direct copy/paste. It happened that the *UNION*ed queries spat out those results in the same order that the queries appeared. Just want to again emphasize that my database state has changed, so I am not sure this is remains a good case for finding a bug. Aren On Sun, Mar 18, 2012 at 3:33 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov > wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > > Aren Cambre <aren@arencambre.com> wrote: > > > >> SELECT COUNT(*) > >> FROM consistent.master > >> WHERE citation_id IS NOT NULL > >> UNION > >> SELECT COUNT(*) > >> FROM consistent.master > >> UNION > >> SELECT COUNT(*) > >> FROM consistent.master > >> WHERE citation_id IS NULL > >> > >> I got this result: > >> > >> 2085344 > >> 2085343 > >> 0 > >> > >> Not clear how adding a WHERE clause, whose only practical effect > >> is to reduce the number of rows returned, could cause *more* rows > >> to be returned. > > > Never assume that the rows will be returned in any particular > > order from a query unless you specify ORDER BY. > > Hmm. That doesn't explain why the numbers don't add up, though. Is > that a copy/paste from an actual query run, or was there some > hand-editing there? In particular, you might easily get that result > if that last line was really: > > WHERE citation_id = '' > > instead of the IS NULL test. In the ANSI standard and in PostgreSQL > there is a big difference between an empty string and NULL, although > there is at least one product I know of which breaks from standard > compliance by treating them as equivalent. > > -Kevin >
Aren Cambre <aren@arencambre.com> writes: > Just want to again emphasize that my database state has changed, so I am > not sure this is remains a good case for finding a bug. Your description of an apparently-all-nulls row sounds a great deal like data corruption. I suspect there may be more wrong than just the one row, and that that's what's leading to the inconsistent results. It'd be worthwhile to run hardware diagnostics (memtest, disk tests), make sure you are on the latest minor release of Postgres, and perhaps do a database dump and reload. (If the latter seems impractical, at least consider reindexing everything.) regards, tom lane
Thanks. I am running this on 1 year old hardware on a wholly-encrypted C drive, encrypted with TrueCrypt. I suspect that I would have been alerted already with that or SMART if I had hardware issues? I am on 9.1.3, although I think the data was inserted back when I was running 9.0.x. I upgraded to 9.1 almost as soon as it came out, doing the requisite dumps and reloads. I have also reindexed a couple of times as diagnostic steps. So far, looks like only that one row has data corruption. Aren On Sun, Mar 18, 2012 at 9:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Aren Cambre <aren@arencambre.com> writes: > > Just want to again emphasize that my database state has changed, so I am > > not sure this is remains a good case for finding a bug. > > Your description of an apparently-all-nulls row sounds a great deal like > data corruption. I suspect there may be more wrong than just the one > row, and that that's what's leading to the inconsistent results. It'd > be worthwhile to run hardware diagnostics (memtest, disk tests), make > sure you are on the latest minor release of Postgres, and perhaps do a > database dump and reload. (If the latter seems impractical, at least > consider reindexing everything.) > > regards, tom lane >
Excerpts from Aren Cambre's message of dom mar 18 23:39:16 -0300 2012: > Thanks. I am running this on 1 year old hardware on a wholly-encrypted C > drive, encrypted with TrueCrypt. I suspect that I would have been alerted > already with that or SMART if I had hardware issues? That sounds excessively optimistic -- not necessarily a good characteristic on a database administrator. --=20 =C3=81lvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> > > Thanks. I am running this on 1 year old hardware on a wholly-encrypted C > > drive, encrypted with TrueCrypt. I suspect that I would have been alerted > > already with that or SMART if I had hardware issues? > > That sounds excessively optimistic -- not necessarily a good > characteristic on a database administrator. > Which is why I could never be a DBA. :-) I am the only user. This is for my doctoral research project. Aren