From: David Rowley
Subject: Re: WIP: Covering + unique indexes.
Msg-id: CAKJS1f__-i3eOc8pD6WFODDFvPrMBhe4o3KE0kokFWy+AQrStQ@mail.gmail.com
In response to: Re: WIP: Covering + unique indexes. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses: Re: WIP: Covering + unique indexes.
List: pgsql-hackers
On 27 January 2016 at 03:35, Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> including_columns_3.0 is the latest version of patch.
> And changes regarding the previous version are attached in a separate patch.
> Just to ease the review and debug.

Hi, I've made another pass over the patch. There are still a couple of
things that I think need to be looked at.

Do we need the "b (included)" here? The key is (a) = (1). Having
irrelevant details might be confusing.

    postgres=# create table a (a int not null, b int not null);
    CREATE TABLE
    postgres=# create unique index on a (a) including(b);
    CREATE INDEX
    postgres=# insert into a values(1,1);
    INSERT 0 1
    postgres=# insert into a values(1,1);
    ERROR:  duplicate key value violates unique constraint "a_a_b_idx"
    DETAIL:  Key (a, b (included))=(1, 1) already exists.

Extra tabs:

    /* Truncate nonkey attributes when inserting on nonleaf pages. */
    if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts &&
        !P_ISLEAF(lpageop))
    {
        itup = index_reform_tuple(rel, itup,
                                  rel->rd_index->indnatts,
                                  rel->rd_index->indnkeyatts);
    }

In index_reform_tuple() I find it a bit scary that you change the
TupleDesc's number of attributes, then set it back again once you're
finished reforming the shortened tuple. Maybe it would be better to
modify index_form_tuple() to accept a new argument with a number of
attributes; then you can just Assert that this number is never higher
than the number of attributes in the TupleDesc.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap. If we do need to keep this function, I think a better name
might be index_trim_tuple(), and I don't think you need to pass the
original length. It might make sense to Assert() that the trim length
is smaller than the tuple size.

What statement will cause this?

    numberOfKeyAttributes = list_length(stmt->indexParams);
    if (numberOfKeyAttributes <= 0)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
                 errmsg("must specify at least one key column")));

I seem to just get errors from the parser when trying.

Much of this goes over 80 chars:

    /*
     * We append any INCLUDING columns onto the indexParams list so that
     * we have one list with all columns. Later we can determine which of these
     * are key columns, and which are just part of the INCLUDING list by check the list
     * position. A list item in a position less than ii_NumIndexKeyAttrs is part of
     * the key columns, and anything equal to and over is part of the
     * INCLUDING columns.
     */
    stmt->indexParams = list_concat(stmt->indexParams,
                                    stmt->indexIncludingParams);

In gistrescan() there is some code:

    for (attno = 1; attno <= natts; attno++)
    {
        TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
                           scan->indexRelation->rd_opcintype[attno - 1],
                           -1, 0);
    }

Going by RelationInitIndexAccessInfo(), rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns. Perhaps this is not a big problem since
GiST does not support INCLUDING columns, but it still seems wrong.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there aren't
lots more places that have been forgotten.
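[Editor's note: a minimal sketch of the kind of fix being suggested for
gistrescan(), assuming the patch exposes the key-column count as
rd_index->indnkeyatts; hypothetical and untested, not part of the patch:]

    /* Bound the loop by the number of key columns, matching the
     * allocation of rd_opcintype[] in RelationInitIndexAccessInfo(). */
    int         nkeyatts = scan->indexRelation->rd_index->indnkeyatts;

    for (attno = 1; attno <= nkeyatts; attno++)
    {
        TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
                           scan->indexRelation->rd_opcintype[attno - 1],
                           -1, 0);
    }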
Apart from this one, how confident are you that you've found all the
places? I'm getting towards being happy with the code that I see has
been changed, but I'm hesitant to mark this "Ready for committer" as
I'm not all that comfortable that all the code which needs to be
updated has been updated. I'm not quite sure of a good way to find all
these places. I'm wondering if hacking the code so that each btree
index created with more than one column puts all but the first column
into the INCLUDING columns, and then running the regression tests to
see if there are any crashes, would help. I'm really not sure how else
to increase the confidence levels on this. Do you have any ideas?

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
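[Editor's note: for illustration, a rough sketch of the debugging hack
described above as it might look in DefineIndex(); the field name
indexIncludingParams is assumed from the patch, and the whole snippet is
hypothetical and untested:]

    /* Debugging hack only: for any multi-column btree index with no
     * explicit INCLUDING list, demote all but the first column to
     * INCLUDING columns, then run the regression tests to smoke out
     * code paths still using natts where nkeyatts is required. */
    if (strcmp(stmt->accessMethod, "btree") == 0 &&
        list_length(stmt->indexParams) > 1 &&
        stmt->indexIncludingParams == NIL)
    {
        stmt->indexIncludingParams = list_copy_tail(stmt->indexParams, 1);
        stmt->indexParams = list_truncate(stmt->indexParams, 1);
    }

Note that unique and primary key indexes demoted this way would start
raising spurious duplicate-key failures, so regression differences would
need to be read with that in mind; the interesting signals are crashes
and assertion failures.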