Thread: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: danielnewman@umich.edu

The following bug has been logged on the website:

Bug reference:      14210
Logged by:          Daniel Newman
Email address:      danielnewman@umich.edu
PostgreSQL version: 9.5.3
Operating system:   OS X 10.11.5
Description:

I have a table with one column and a hash index:

          Table "public.hash_issue_table"
      Column       |       Type        | Modifiers
-------------------+-------------------+-----------
 hash_issue_column | character varying | not null
Indexes:
    "hash_issue_index" hash (hash_issue_column)

When I run the query `select * from hash_issue_table where hash_issue_column
like '2184';` I get 701 results. When I run the query `select * from
hash_issue_table where hash_issue_column = '2184';`, I get 0 results.
However, if I drop the hash index and rerun the second query, I get the
same results.

I have created a sql dump file containing the data needed to replicate the
issue, but it is too large to copy/paste here. Is there somewhere I can
upload or send that file?
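A minimal sketch of the reported setup, for readers who do not want to fetch
the dump: the table and index definitions come from the report above, but the
generate_series data is only invented filler standing in for the real dump
(the table just needs to be large enough that the index build exceeds
shared_buffers, per the analysis later in this thread).

    -- Schema from the report; the data here is placeholder filler.
    CREATE TABLE hash_issue_table (
        hash_issue_column character varying NOT NULL
    );
    INSERT INTO hash_issue_table
        SELECT (i % 5000)::text
        FROM generate_series(1, 2000000) AS i;
    CREATE INDEX hash_issue_index
        ON hash_issue_table USING hash (hash_issue_column);

    -- These two predicates are equivalent (no wildcard in the LIKE pattern),
    -- so the counts should match; on affected builds the second returns 0.
    SELECT count(*) FROM hash_issue_table WHERE hash_issue_column LIKE '2184';
    SELECT count(*) FROM hash_issue_table WHERE hash_issue_column = '2184';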
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

danielnewman@umich.edu writes:
> When I run the query `select * from hash_issue_table where
> hash_issue_column like '2184';` I get 701 results. When I run the query
> `select * from hash_issue_table where hash_issue_column = '2184';`, I get
> 0 results. However, if I drop the hash index and rerun the second query,
> I get the same results.

Sounds like a corrupted hash index to me, which is not terribly surprising
given hash indexes' lack of WAL support. Can you reproduce this after
rebuilding the hash index?

			regards, tom lane
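Rebuilding the index, as suggested, looks like this (a sketch; either form
discards and rebuilds the index's contents):

    -- Rebuild in place...
    REINDEX INDEX hash_issue_index;

    -- ...or drop and recreate it explicitly.
    DROP INDEX hash_issue_index;
    CREATE INDEX hash_issue_index
        ON hash_issue_table USING hash (hash_issue_column);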
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Daniel Newman

Yes. If I delete the index and recreate it, the bug is replicated.

I have uploaded a pg_dump .sql file to
https://www.dropbox.com/s/4dxuo2uthj721od/hash_issue_db.sql?dl=0 that you
can use to recreate the issue. After creating the database, when I run:

>>> select * from hash_issue_table where hash_issue_column = '1';

I get no results. When I run:

>>> drop index hash_issue_index;
>>> select * from hash_issue_table where hash_issue_column = '1';

I get 531 rows of results. When I run:

>>> create index hash_issue_index on hash_issue_table using hash(hash_issue_column);
>>> select * from hash_issue_table where hash_issue_column = '1';

I get 0 results again. I have repeated this successfully multiple times,
and on a coworker's machine as well.

Interestingly, I modified the pg_dump file a bit. At the end, it says:

> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
> (hash_issue_column);
>
> DROP INDEX hash_issue_index;
>
> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
> (hash_issue_column);

This is because the issue was not replicating (for some reason) when it
built the index the first time.

On Thu, Jun 23, 2016 at 1:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> danielnewman@umich.edu writes:
>> When I run the query `select * from hash_issue_table where
>> hash_issue_column like '2184';` I get 701 results. When I run the query
>> `select * from hash_issue_table where hash_issue_column = '2184';`, I
>> get 0 results. However, if I drop the hash index and rerun the second
>> query, I get the same results.
>
> Sounds like a corrupted hash index to me, which is not terribly
> surprising given hash indexes' lack of WAL support. Can you reproduce
> this after rebuilding the hash index?
>
>                         regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Daniel Newman <dtnewman@gmail.com> writes:
> Yes. If I delete the index and recreate it, the bug is replicated.

So the answer is that this got broken by commit 9f03ca915196dfc8, which
appears to have imagined that _hash_form_tuple() is just an alias for
index_form_tuple(). But it ain't. As a result, construction of hash
indexes larger than shared_buffers is broken in 9.5 and up: what gets
entered into the index is garbage, because we are taking raw data as if
it were already hashed. (In fact, in this example, we seem to be taking
*pointers* to raw data as the hash values.)

> Interestingly, I modified the pg_dump file a bit. At the end, it says:
>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>> (hash_issue_column);
>> DROP INDEX hash_issue_index;
>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>> (hash_issue_column);
> This is because the issue was not replicating (for some reason) when it
> built the index the first time.

I think what's happening there is that the first CREATE INDEX
underestimates the size of the table and decides it doesn't need to use
the _h_spool() code path. The other path isn't broken.

We can either revert the aforesaid commit so far as it affects hash, or
do something to break _hash_form_tuple's encapsulation of the
hash-value-for-data substitution. I don't immediately see a non-messy
way to do the latter.

			regards, tom lane
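To make the "raw data versus hash code" distinction concrete: a correctly
built entry in this index is keyed on the 32-bit hash of the varchar value,
the same value the SQL-callable hashtext() function computes, rather than on
the string itself. A sketch (treat the specific output as illustrative):

    -- What a healthy index entry for '2184' is keyed on:
    SELECT hashtext('2184');
    -- The broken spool path instead treated the raw datum (here, effectively
    -- a pointer to it) as though it were already a hash code, so equality
    -- probes, which hash the search value first, never found a match.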
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

I wrote:
> So the answer is that this got broken by commit 9f03ca915196dfc8,
> which appears to have imagined that _hash_form_tuple() is just an
> alias for index_form_tuple(). But it ain't. As a result, construction
> of hash indexes larger than shared_buffers is broken in 9.5 and up:

BTW, an easy way to prove that the _h_spool code path is broken is to do
this:

diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 49a6c81..162cb63 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -125,7 +125,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
 	 * NOTE: this test will need adjustment if a bucket is ever different from
 	 * one page.
 	 */
-	if (num_buckets >= (uint32) NBuffers)
+	if (1)
 		buildstate.spool = _h_spoolinit(heap, index, num_buckets);
 	else
 		buildstate.spool = NULL;

and then run the regression tests.

I'm thinking it would be a good idea to provide some less-painful way to
force that code path to be taken for testing purposes.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Daniel Newman

Wow, thank you for such a quick response! And thanks for pointing out the
commit... I'm definitely interested in having a look at the code. In the
meantime, I can solve the specific problem I'm having by switching to a
btree index :)

Thank you for this and all the rest of your great work.

Best,
Daniel Newman

On Thu, Jun 23, 2016 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Daniel Newman <dtnewman@gmail.com> writes:
>> Yes. If I delete the index and recreate it, the bug is replicated.
>
> So the answer is that this got broken by commit 9f03ca915196dfc8, which
> appears to have imagined that _hash_form_tuple() is just an alias for
> index_form_tuple(). But it ain't. As a result, construction of hash
> indexes larger than shared_buffers is broken in 9.5 and up: what gets
> entered into the index is garbage, because we are taking raw data as if
> it were already hashed. (In fact, in this example, we seem to be taking
> *pointers* to raw data as the hash values.)
>
>> Interestingly, I modified the pg_dump file a bit. At the end, it says:
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>>> DROP INDEX hash_issue_index;
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>> This is because the issue was not replicating (for some reason) when it
>> built the index the first time.
>
> I think what's happening there is that the first CREATE INDEX
> underestimates the size of the table and decides it doesn't need to
> use the _h_spool() code path. The other path isn't broken.
>
> We can either revert the aforesaid commit so far as it affects hash,
> or do something to break _hash_form_tuple's encapsulation of the
> hash-value-for-data substitution. I don't immediately see a non-messy
> way to do the latter.
>
>                         regards, tom lane
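For anyone hitting the same corruption before a fix ships, the workaround
amounts to replacing the hash index with a btree (a sketch; btree is also the
default access method, so the USING clause could simply be omitted):

    DROP INDEX hash_issue_index;
    CREATE INDEX hash_issue_index
        ON hash_issue_table USING btree (hash_issue_column);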
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Robert Haas

On Thu, Jun 23, 2016 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Daniel Newman <dtnewman@gmail.com> writes:
>> Yes. If I delete the index and recreate it, the bug is replicated.
>
> So the answer is that this got broken by commit 9f03ca915196dfc8,
> which appears to have imagined that _hash_form_tuple() is just an
> alias for index_form_tuple(). But it ain't. As a result, construction
> of hash indexes larger than shared_buffers is broken in 9.5 and up:
> what gets entered into the index is garbage, because we are taking
> raw data as if it were already hashed. (In fact, in this example,
> we seem to be taking *pointers* to raw data as the hash values.)

Oops.

>> Interestingly, I modified the pg_dump file a bit. At the end, it says:
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>>> DROP INDEX hash_issue_index;
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>> This is because the issue was not replicating (for some reason) when it
>> built the index the first time.
>
> I think what's happening there is that the first CREATE INDEX
> underestimates the size of the table and decides it doesn't need to
> use the _h_spool() code path. The other path isn't broken.
>
> We can either revert the aforesaid commit so far as it affects hash,
> or do something to break _hash_form_tuple's encapsulation of the
> hash-value-for-data substitution. I don't immediately see a non-messy
> way to do the latter.

Would it work to have something like _hash_form_tuple() except that
instead of returning an index tuple it would just return the Datum and
isnull flags and let the caller decide what to do with them?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jun 23, 2016 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We can either revert the aforesaid commit so far as it affects hash,
>> or do something to break _hash_form_tuple's encapsulation of the
>> hash-value-for-data substitution. I don't immediately see a non-messy
>> way to do the latter.

> Would it work to have something like _hash_form_tuple() except that
> instead of returning an index tuple it would just return the Datum and
> isnull flags and let the caller decide what to do with them?

Yeah. Let's define it as a function that transforms input values/isnull
arrays into output values/isnull arrays. It seems all right for the
callers to know the latter are always of length 1, so they can be local
variables in the callers. I'll go make it so.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Thu, Jun 23, 2016 at 4:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So the answer is that this got broken by commit 9f03ca915196dfc8,
> which appears to have imagined that _hash_form_tuple() is just an
> alias for index_form_tuple(). But it ain't.

Can we add test coverage that will detect when comparetup_index_hash()
gives wrong answers, please? It's been broken at least twice that I'm
aware of. I'll write a patch, if that helps.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Thu, Jun 23, 2016 at 4:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So the answer is that this got broken by commit 9f03ca915196dfc8,
>> which appears to have imagined that _hash_form_tuple() is just an
>> alias for index_form_tuple(). But it ain't.

> Can we add test coverage that will detect when comparetup_index_hash()
> gives wrong answers, please? It's been broken at least twice that I'm
> aware of. I'll write a patch, if that helps.

I agree that we need to do something, but I'm not very clear on what the
something is. I considered inventing a debug #define that would reduce
the _h_spool threshold to the minimum workable amount (2, looks like),
but I'm not sure how much that will help. What did you have in mind?

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Fri, Jun 24, 2016 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree that we need to do something, but I'm not very clear on what
> the something is. I considered inventing a debug #define that would
> reduce the _h_spool threshold to the minimum workable amount (2, looks
> like), but I'm not sure how much that will help. What did you have
> in mind?

I think we should do that. Why do you say that you're not sure how much
it will help?

It may have escaped your attention, so I'll point out that amcheck
regression tests are my solution to testing external sorting, an area
which is similarly sorely lacking.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Fri, Jun 24, 2016 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I agree that we need to do something, but I'm not very clear on what
>> the something is. I considered inventing a debug #define that would
>> reduce the _h_spool threshold to the minimum workable amount (2, looks
>> like), but I'm not sure how much that will help. What did you have
>> in mind?

> I think we should do that. Why do you say that you're not sure how
> much it will help?

Well, it won't run automatically --- unless someone spins up a buildfarm
machine that adds that symbol to CPPFLAGS, and even then we'd only have
coverage on one platform.

However, the next step up would be to invent an index storage parameter
or some such to force the decision, letting the standard regression tests
include a reasonably-cheap case that would exercise _h_spool. But that
seems like rather a large amount of work compared to the likely benefits.

			regards, tom lane
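No such storage parameter was ever implemented; purely to illustrate the
shape of the idea, it would have let a test force the spool path on a tiny
index along these lines (force_spool is an invented, hypothetical name):

    -- Hypothetical only: this reloption does not exist in any release.
    CREATE INDEX hash_issue_index ON hash_issue_table
        USING hash (hash_issue_column) WITH (force_spool = on);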
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Fri, Jun 24, 2016 at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think we should do that. Why do you say that you're not sure how
>> much it will help?
>
> Well, it won't run automatically --- unless someone spins up a buildfarm
> machine that adds that symbol to CPPFLAGS, and even then we'd only have
> coverage on one platform.

I think that this is an argument against further proliferation of
#define's for this kind of thing, not an argument against adding a way to
force tuplesort based hash index builds.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Fri, Jun 24, 2016 at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, it won't run automatically --- unless someone spins up a buildfarm
>> machine that adds that symbol to CPPFLAGS, and even then we'd only have
>> coverage on one platform.

> I think that this is an argument against further proliferation of
> #define's for this kind of thing, not an argument against adding a way
> to force tuplesort based hash index builds.

So ... what's your point? This statement doesn't seem to lead to a
conclusion in favor of either of the alternatives I mentioned.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Sat, Jun 25, 2016 at 8:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that this is an argument against further proliferation of
>> #define's for this kind of thing, not an argument against adding a way
>> to force tuplesort based hash index builds.
>
> So ... what's your point? This statement doesn't seem to lead to a
> conclusion in favor of either of the alternatives I mentioned.

I'm in favor of just adding a debug option. We should introduce a more
general idea of a #define that enables a variety of debugging options,
because it's silly to have to worry about buildfarm coverage each and
every time this crops up. I'm not entirely sure what level that should be
scoped at. The status quo is impractical.

Incidentally, did you see to it that there is buildfarm coverage for
RAW_EXPRESSION_COVERAGE_TEST?

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Sat, Jun 25, 2016 at 8:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So ... what's your point? This statement doesn't seem to lead to a
>> conclusion in favor of either of the alternatives I mentioned.

> I'm in favor of just adding a debug option. We should introduce a more
> general idea of a #define that enables a variety of debugging options,
> because it's silly to have to worry about buildfarm coverage each and
> every time this crops up. I'm not entirely sure what level that should
> be scoped at. The status quo is impractical.

I don't exactly see how that leads to any improvement --- you still end
up having to manually configure some buildfarm critters for coverage, no?

> Incidentally, did you see to it that there is buildfarm coverage for
> RAW_EXPRESSION_COVERAGE_TEST?

Yes, see dromedary.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Sat, Jun 25, 2016 at 12:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm in favor of just adding a debug option. We should introduce a more
>> general idea of a #define that enables a variety of debugging options,
>> because it's silly to have to worry about buildfarm coverage each and
>> every time this crops up. I'm not entirely sure what level that should
>> be scoped at. The status quo is impractical.
>
> I don't exactly see how that leads to any improvement --- you still end
> up having to manually configure some buildfarm critters for coverage, no?

Yes, but you only have to do it once. I don't see the problem with
creating a #define that represents a grab-bag of debug options that should
run on one or two buildfarm animals. That doesn't preclude also allowing
things like RAW_EXPRESSION_COVERAGE_TEST to be enabled more selectively.

In short, I want to make it easier to add custom debugging options that
could trip a regression test failure.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Sat, Jun 25, 2016 at 12:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't exactly see how that leads to any improvement --- you still end
>> up having to manually configure some buildfarm critters for coverage, no?

> Yes, but you only have to do it once. I don't see the problem with
> creating a #define that represents a grab-bag of debug options that
> should run on one or two buildfarm animals. That doesn't preclude also
> allowing things like RAW_EXPRESSION_COVERAGE_TEST to be enabled more
> selectively.

Meh. I do not really think that dromedary's test options
(-DCOPY_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST) represent a model
to follow for this problem. In both of those cases, the value of having
the option turned on is that it will exercise code that hasn't even been
written yet. So those options have to be enabled across the entire test
suite in order to be of much use. Furthermore, there's no particular
reason to think that they'd expose platform-dependent behavior, so there's
no real downside to having just one or two buildfarm critters using them.

But exercising a non-default code path in hash index build is something we
only need in one place in the tests, and at least in my judgment there is
a possibility for platform-specific behavior. So I think some way of
turning it on dynamically, and then adding that as a standard test case,
would be a considerable improvement. I just don't quite want to go to the
effort of inventing a GUC or reloption. Is there some intermediate answer?

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

I wrote:
> But exercising a non-default code path in hash index build is something
> we only need in one place in the tests, and at least in my judgment
> there is a possibility for platform-specific behavior. So I think some
> way of turning it on dynamically, and then adding that as a standard
> test case, would be a considerable improvement. I just don't quite want
> to go to the effort of inventing a GUC or reloption. Is there some
> intermediate answer?

A small-footprint answer that just occurred to me is to redefine the
threshold as being, not NBuffers, but maintenance_work_mem. Then a
reasonably-sized test case could be made by reducing that to its minimum
allowed value. This seems like it'd be fairly unsurprising for users
since there's lots of precedent for maintenance_work_mem affecting the
behavior of CREATE INDEX.

If maintenance_work_mem exceeds shared_buffers then you'd get a certain
amount of thrashing between shared buffers and kernel disk cache, but it
wouldn't be really awful unless maintenance_work_mem exceeded available
RAM, which would be a certifiably Bad Idea anyway. I guess we could
forestall the buffer-thrashing scenario and preserve testability by
setting the sort threshold to min(NBuffers, maintenance_work_mem).

			regards, tom lane
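Under that rule, a regression test can force the sort-based (tuplesort)
build path on even a small table just by lowering the setting around the
CREATE INDEX; a sketch, assuming the index will be larger than the 1MB
minimum value:

    SET maintenance_work_mem = '1MB';   -- minimum allowed value
    CREATE INDEX hash_issue_index
        ON hash_issue_table USING hash (hash_issue_column);
    RESET maintenance_work_mem;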
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Mon, Jun 27, 2016 at 7:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Meh. I do not really think that dromedary's test options
> (-DCOPY_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST) represent a
> model to follow for this problem. In both of those cases, the value of
> having the option turned on is that it will exercise code that hasn't
> even been written yet. So those options have to be enabled across the
> entire test suite in order to be of much use. Furthermore, there's no
> particular reason to think that they'd expose platform-dependent
> behavior, so there's no real downside to having just one or two
> buildfarm critters using them.

You think that the hash index tuplesort code should be possible to
exercise in the regression tests dynamically, without any new #define
whatsoever. Fair enough, but I still think that enabling debug #define
options should be possible at a coarser granularity, even if that ends up
covering a set of cases that are not very crisply defined. Maybe that's a
discussion for the next time something like RAW_EXPRESSION_COVERAGE_TEST
is proposed, though.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Mon, Jun 27, 2016 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A small-footprint answer that just occurred to me is to redefine the
> threshold as being, not NBuffers, but maintenance_work_mem. Then a
> reasonably-sized test case could be made by reducing that to its minimum
> allowed value. This seems like it'd be fairly unsurprising for users
> since there's lots of precedent for maintenance_work_mem affecting the
> behavior of CREATE INDEX.

I like this idea. Should I write a patch?

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Mon, Jun 27, 2016 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A small-footprint answer that just occurred to me is to redefine the
>> threshold as being, not NBuffers, but maintenance_work_mem. Then a
>> reasonably-sized test case could be made by reducing that to its
>> minimum allowed value. This seems like it'd be fairly unsurprising for
>> users since there's lots of precedent for maintenance_work_mem
>> affecting the behavior of CREATE INDEX.

> I like this idea. Should I write a patch?

Please.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> I like this idea. Should I write a patch?

BTW, while you're at it: it strikes me that the threshold should be either
min(NBuffers, maintenance_work_mem) or
min(NLocBuffer, maintenance_work_mem), depending on whether we're talking
about a regular or temp table/index. That is, there's a pre-existing bug
here that when NLocBuffer is a good deal less than NBuffers (which is the
typical case nowadays) you'll get a lot of thrashing between local buffers
and kernel cache, if the index isn't quite big enough to trigger the
sorting code. This might not manifest as actual I/O, but it's not the
intended behavior either.

			regards, tom lane
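For context, the temp-table case is the one where NLocBuffer applies: a
temporary table's hash index is built in the session's local buffer pool,
whose size comes from temp_buffers rather than shared_buffers. A sketch of
that scenario (the table and index names are illustrative):

    -- temp_buffers must be set before the session first touches a temp table.
    SET temp_buffers = '8MB';
    CREATE TEMP TABLE hash_issue_temp AS
        SELECT * FROM hash_issue_table;
    -- This build uses local buffers; with the old NBuffers-based threshold
    -- it could thrash them long before the sort path kicked in.
    CREATE INDEX hash_issue_temp_idx
        ON hash_issue_temp USING hash (hash_issue_column);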
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Mon, Jun 27, 2016 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I like this idea. Should I write a patch?
>
> BTW, while you're at it: it strikes me that the threshold should be
> either min(NBuffers, maintenance_work_mem) or
> min(NLocBuffer, maintenance_work_mem), depending on whether we're
> talking about a regular or temp table/index. That is, there's a
> pre-existing bug here that when NLocBuffer is a good deal less than
> NBuffers (which is the typical case nowadays) you'll get a lot of
> thrashing between local buffers and kernel cache, if the index isn't
> quite big enough to trigger the sorting code. This might not manifest
> as actual I/O, but it's not the intended behavior either.

Understood. It's on my TODO list for the week, although I'll prioritize
isolating and fixing that UPSERT "attempted to delete invisible tuple"
bug.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Mon, Jun 27, 2016 at 1:31 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Understood. It's on my TODO list for the week, although I'll
> prioritize isolating and fixing that UPSERT "attempted to delete
> invisible tuple" bug.

I would like to start work on this immediately, but feel it's premature
until my other tuplesort testing patch gets some review [1].

[1] https://www.postgresql.org/message-id/CAM3SWZSgxehDkDMq1FdiW2A0Dxc79wH0hz1x-TnGy=1BXEL+nw@mail.gmail.com

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Mon, Jun 27, 2016 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I like this idea. Should I write a patch?
>
> BTW, while you're at it: it strikes me that the threshold should be
> either min(NBuffers, maintenance_work_mem) or
> min(NLocBuffer, maintenance_work_mem), depending on whether we're
> talking about a regular or temp table/index. That is, there's a
> pre-existing bug here that when NLocBuffer is a good deal less than
> NBuffers (which is the typical case nowadays) you'll get a lot of
> thrashing between local buffers and kernel cache, if the index isn't
> quite big enough to trigger the sorting code. This might not manifest
> as actual I/O, but it's not the intended behavior either.

Finally found time for this. Attached patch tests hash tuplesorts along
the lines we discussed. It adds one new tuplesort operation, which does
not spill to disk. It also asserts that hash values retrieved through the
tuplesort interface are in fact in sorted order. I wanted to have
something reliably trip when comparetup_index_hash() gives wrong answers,
even when a corrupt final index is perhaps not a consequence of the
underlying bug.

Due to how the hash build code works, maintenance_work_mem will need to
be 2 blocks smaller than the final index size in order to force a sort
(see code comments). I think that this does not matter, since none of
this is documented currently.

Having reviewed the coverage report for tuplesort.c [1], I think that the
only notable testing omission once this latest patch is committed will be
that there is no coverage for "backwards get tuple" cases (this is more
noticeable when looking at coverage statistics for logtape.c, actually).
This seems less important to me, and would be awkward to test in any
case, given my constraints.

[1] http://coverage.postgresql.org/src/backend/utils/sort/tuplesort.c.gcov.html

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> Finally found time for this. Attached patch tests hash tuplesorts
> along the lines we discussed. It adds one new tuplesort operation,
> which does not spill to disk. It also asserts that hash values
> retrieved through the tuplesort interface are in fact in sorted order.
> I wanted to have something reliably trip when comparetup_index_hash()
> gives wrong answers, even when a corrupt final index is perhaps not a
> consequence of the underlying bug.

Pushed with some adjustments, mostly paranoia concerning integer
overflows in the calculation around maintenance_work_mem.

I did not like the way you'd set up the test case though: tenk1 is *very*
heavily used in the regression tests, and having an extra index on it
that might capture query plans for unrelated test cases doesn't seem like
a good idea. Especially not when you drop the index in the middle of a
large group of parallel tests, so that any capturing that did happen
would be timing-sensitive. So I just made the create_index test create
and immediately drop the index. We have other tests that are supposed to
exercise searches of hash indexes, anyway.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Sat, Jul 16, 2016 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So I just made the create_index test create and immediately drop the
> index. We have other tests that are supposed to exercise searches of
> hash indexes, anyway.

I suppose that the assertion made the new hash index searches redundant.

Thanks

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Sat, Jul 16, 2016 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So I just made the create_index test create and immediately drop the
>> index. We have other tests that are supposed to exercise searches of
>> hash indexes, anyway.

> I suppose that the assertion made the new hash index searches redundant.

[ thinks about it a bit more... ] Actually, maybe it doesn't. The thing
that made 9f03ca915196dfc8 really nasty is that it built an index that
was actually corrupt, with incorrect hash codes stored for entries. I
think your assertion would probably have caught that, but only
accidentally, because the problem had little to do with whether the sort
satisfied its charter. Maybe we should stick at least one simple test
query in there before we drop the index again.

			regards, tom lane
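Putting the pieces of this subthread together, the create_index regression
test ends up roughly in this shape (a sketch, not the exact committed text;
the index name, column, and probe value are illustrative):

    SET maintenance_work_mem = '1MB';       -- force the tuplesort build path
    CREATE INDEX hash_tuplesort_idx ON tenk1
        USING hash (stringu1) WITH (fillfactor = 10);
    -- One simple probe so that a corrupt index shows up as a wrong answer:
    SELECT count(*) FROM tenk1 WHERE stringu1 = 'TVAAAA';
    DROP INDEX hash_tuplesort_idx;
    RESET maintenance_work_mem;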
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Sat, Jul 16, 2016 at 1:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I suppose that the assertion made the new hash index searches redundant.
>
> [ thinks about it a bit more... ] Actually, maybe it doesn't. The thing
> that made 9f03ca915196dfc8 really nasty is that it built an index that
> was actually corrupt, with incorrect hash codes stored for entries. I
> think your assertion would probably have caught that, but only
> accidentally, because the problem had little to do with whether the sort
> satisfied its charter. Maybe we should stick at least one simple test
> query in there before we drop the index again.

That's what I was thinking, but wasn't sure offhand which way it would
happen. Couldn't hurt much to just add one more test.

It is pretty awkward how the hash index searches are in hash_index.sql,
which must coordinate with create_index.sql, but I didn't want to go
against the grain and just put it all in one place. Maybe I should have.

--
Peter Geoghegan
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Tom Lane

Peter Geoghegan <pg@heroku.com> writes:
> On Sat, Jul 16, 2016 at 1:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... Maybe we should stick at least one simple test query in there
>> before we drop the index again.

> That's what I was thinking, but wasn't sure offhand which way it would
> happen. Couldn't hurt much to just add one more test.

Right. Done.

> It is pretty awkward how the hash index searches are in
> hash_index.sql, which must coordinate with create_index.sql, but I
> didn't want to go against the grain and just put it all in one place.
> Maybe I should have.

Well, that structure is intended to work with indexes that are built and
then left in place throughout the tests. For a short-lived index, the
best thing is to create it, test it, drop it, and really you can do that
about anywhere. But again, doing that on tenk1 is problematic because of
its popularity in all sorts of query-planning tests; if you're going to
add a short-lived index you need to do that in a test that's not meant to
run concurrently with a lot of other stuff.

Another approach we could have taken is to construct a table just for
this purpose. But I think the way it is now is fine.

In the longer term, if hash indexes ever become less of a second-class
citizen, maybe we'd actually want to include a hash index on tenk1
throughout the tests. But I don't want to go there today.

			regards, tom lane
Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
From: Peter Geoghegan

On Sat, Jul 16, 2016 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It is pretty awkward how the hash index searches are in
>> hash_index.sql, which must coordinate with create_index.sql, but I
>> didn't want to go against the grain and just put it all in one place.
>> Maybe I should have.
>
> Well, that structure is intended to work with indexes that are built and
> then left in place throughout the tests. For a short-lived index, the
> best thing is to create it, test it, drop it, and really you can do that
> about anywhere. But again, doing that on tenk1 is problematic because
> of its popularity in all sorts of query-planning tests; if you're going
> to add a short-lived index you need to do that in a test that's not
> meant to run concurrently with a lot of other stuff.

I did think that the "fillfactor = 10" bloat would have prevented problems
with plan stability, but I accept that that might not be good enough.

> Another approach we could have taken is to construct a table just for
> this purpose.

I actually started with that.

Anyway, test coverage of tuplesort.c seems in much better shape now. The
strategy of not always moving the tests towards the code, but rather
moving the code towards the tests (improving code testability) seems to
have been more effective than I'd anticipated.

My unpublished parallel CREATE INDEX patch forces a single-worker,
single-leader (deterministic division of labor) "serial parallel" sort
for every CREATE INDEX statement when force_parallel_mode is set to
'regress'. That's a nice thing to have for testing, because the
determinism allows one to line things up just right, and reliably tickle
certain classes of bug (this made a certain amount of TDD possible). This
configuration is only useful for testing, and is not that different from
forcing an external sort, the notable difference being that the worker
process is still reliably subject to the general limitations of parallel
sort worker processes.

--
Peter Geoghegan