Re: Remove lossy-operator RECHECK flag? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Remove lossy-operator RECHECK flag? |
Date | |
Msg-id | 10447.1208194075@sss.pgh.pa.us |
In response to | Re: Remove lossy-operator RECHECK flag? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Remove lossy-operator RECHECK flag? |
List | pgsql-hackers |
I've committed the runtime-recheck changes. Oleg had mentioned that GIST text search could be improved by using runtime rechecking, but I'll leave any refinements of that sort up to you.

One thing I was wondering about is that GIN and GIST are set up to preinitialize the recheck flag to TRUE; this means that if someone uses an old consistent() function that doesn't know it should set the flag, a recheck will be forced. But it seems to me that there's an argument for preinitializing to FALSE instead. There are four possibilities for what will happen with an un-updated consistent() function:

1. If we set the flag TRUE, and that's correct, everything is fine.

2. If we set the flag TRUE, and that's wrong (ie, the query is really exact) then a useless recheck occurs when we arrive at the heap. Nothing visibly goes wrong, but the query is slower than it should be.

3. If we set the flag FALSE, and that's correct, everything is fine.

4. If we set the flag FALSE, and that's wrong (ie, the query is really inexact), then rows that don't match the query may get returned.

By the argument that it's better to break things obviously than to break them subtly, risking case 4 seems more attractive than risking case 2.

This also ties into my previous question about what 8.4 pg_dump should do when seeing amopreqcheck = TRUE while dumping from an old server. I'm now convinced that the committed behavior (print RECHECK anyway) is the best choice, for a couple of reasons:

* It avoids silent breakage if the dump is reloaded into an old server.

* You'll have to deal with the issue anyhow if you made your dump with the older version's pg_dump.

What this means is that, if we make the preinitialization value FALSE, then an existing GIST/GIN opclass that doesn't use RECHECK will load just fine into 8.4 and everything will work as expected, even without touching the C code. An opclass that does use RECHECK will fail to load from the dump, and if you're stubborn and edit the dump instead of getting a newer version of the module, you'll start getting wrong query answers. This means that all the pain is concentrated on the RECHECK-using case. And you can hardly maintain that you weren't warned about compatibility problems, if the dump didn't load ...

On the other hand, if we make the preinitialization value TRUE, there's some pain for users whether they used RECHECK or not, and there won't be any obvious notification of the problem when they didn't.

So I'm thinking it might be better to switch to the other preinitialization setting. Comments?

regards, tom lane
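
For readers who want to see the four cases concretely, here is a minimal standalone sketch in C. It is not the real GIST/GIN support-function interface: the type consistent_fn, the two old_*_consistent functions, and row_is_returned are all hypothetical names made up for illustration, and the "index" is modeled as a plain integer key. The only point is to show what happens when the caller preinitializes a recheck flag and an un-updated function never writes to it.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an opclass consistent() function.
 * It returns whether the index entry may match the query, and an updated
 * function would also report through *recheck whether the heap row must
 * be re-tested.  This is NOT the actual PostgreSQL signature. */
typedef bool (*consistent_fn)(int index_key, int query, bool *recheck);

/* Un-updated function for an exact operator: never touches *recheck. */
bool old_exact_consistent(int index_key, int query, bool *recheck)
{
    (void) recheck;             /* predates the flag, so it is ignored */
    return index_key == query;
}

/* Un-updated function for a lossy operator: the index only knows which
 * bucket of ten a value falls in, so a match here is merely "maybe". */
bool old_lossy_consistent(int index_key, int query, bool *recheck)
{
    (void) recheck;             /* predates the flag, so it is ignored */
    return index_key / 10 == query / 10;
}

/* Caller side: preinitialize the flag, call the support function, and
 * recheck the heap value only if the flag is still (or was) set.
 * *rechecks counts how many heap rechecks were performed. */
bool row_is_returned(consistent_fn fn, bool preinit,
                     int heap_value, int query, int *rechecks)
{
    bool recheck = preinit;     /* the setting under discussion */

    if (!fn(heap_value, query, &recheck))
        return false;           /* index says "no match" */

    if (recheck)
    {
        (*rechecks)++;          /* visit the heap and test the real qual */
        return heap_value == query;
    }
    return true;                /* index answer trusted as exact */
}
```

With preinit TRUE, the exact function still returns the right rows but pays one recheck per candidate (case 2). With preinit FALSE, the lossy function lets heap value 42 through for query 47, which is the visibly wrong answer of case 4.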