Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | a976fff4-29a9-c799-5c2d-19a371031325@postgrespro.ru Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (Andres Freund <andres@anarazel.de>) |
Responses |
Re: WIP: Covering + unique indexes.
|
List | pgsql-hackers |
30.03.2017 22:11, Andres Freund
First of all, I want to thank you and Robert for reviewing this patch.
Your expertise in postgres subsystems is really necessary for features like this.
I just wonder, why don't you share your thoughts and doubts till the "last call".
So the name looks good to me.
Any suggestions?
But I haven't considered the possibility of index_form_tuple() failure.
Fixed in this version of the patch. Now it creates a copy of tupledesc to pass it to index_form_tuple.
I think that this name reflects the essence of the function clear enough and don't feel like renaming it again.
Maybe it can be solved in some other way, but for now it is a tested and working implementation.
Hm,
columnList contains entries of columnElem type and index_including_params works with index_elem.
Is there a way they can be combined?
> It's the only point in insertion process, where we perform truncation
Other comments about code format, spelling and comments are fixed in attached patches.
Thank you again for reviewing.
Any objection from reviewers to push both patches?
First of all, I want to thank you and Robert for reviewing this patch.
Your expertise in postgres subsystems is really necessary for features like this.
I just wonder, why don't you share your thoughts and doubts till the "last call".
The feature is "index with included columns", it uses keyword "INCLUDE".diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c index f2eda67..59029b9 100644 --- a/contrib/bloom/blutils.c +++ b/contrib/bloom/blutils.c @@ -120,6 +120,7 @@ blhandler(PG_FUNCTION_ARGS) amroutine->amclusterable = false; amroutine->ampredlocks = false; amroutine->amcanparallel = false; + amroutine->amcaninclude = false;That name doesn't strike me as very descriptive.
So the name looks good to me.
Any suggestions?
Definitely. But do you have any specific proposals?+ <term><literal>INCLUDE</literal></term> + <listitem> + <para> + An optional <literal>INCLUDE</> clause allows a list of columns to be + specified which will be included in the non-key portion of the index. + Columns which are part of this clause cannot also exist in the + key columns portion of the index, and vice versa. The + <literal>INCLUDE</> columns exist solely to allow more queries to benefit + from <firstterm>index-only scans</> by including certain columns in the + index, the value of which would otherwise have to be obtained by reading + the table's heap. Having these columns in the <literal>INCLUDE</> clause + in some cases allows <productname>PostgreSQL</> to skip the heap read + completely. This also allows <literal>UNIQUE</> indexes to be defined on + one set of columns, which can include another set of columns in the + <literal>INCLUDE</> clause, on which the uniqueness is not enforced. + It's the same with other constraints (PRIMARY KEY and EXCLUDE). This can + also can be used for non-unique indexes as any columns which are not required + for the searching or ordering of records can be used in the + <literal>INCLUDE</> clause, which can slightly reduce the size of the index. + Currently, only the B-tree access method supports this feature. + Expressions as included columns are not supported since they cannot be used + in index-only scans. + </para> + </listitem> + </varlistentry>This could use some polishing.
Initial reasoning was something like this:+/* + * Reform index tuple. Truncate nonkey (INCLUDE) attributes. + */ +IndexTuple +index_truncate_tuple(Relation idxrel, IndexTuple olditup) +{ + TupleDesc itupdesc = RelationGetDescr(idxrel); + Datum values[INDEX_MAX_KEYS]; + bool isnull[INDEX_MAX_KEYS]; + IndexTuple newitup; + int indnatts = IndexRelationGetNumberOfAttributes(idxrel); + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel); + + Assert(indnatts <= INDEX_MAX_KEYS); + Assert(indnkeyatts > 0); + Assert(indnkeyatts < indnatts); + + index_deform_tuple(olditup, itupdesc, values, isnull); + + /* form new tuple that will contain only key attributes */ + itupdesc->natts = indnkeyatts; + newitup = index_form_tuple(itupdesc, values, isnull); + newitup->t_tid = olditup->t_tid; + + itupdesc->natts = indnatts;Uh, isn't this a *seriously* bad idea? If index_form_tuple errors out, this'll corrupt the tuple descriptor.
> Maybe it would be better to modify index_form_tuple() to accept a new > argument with a number of attributes, then you can just Assert that > this number is never higher than the number of attributes in the > TupleDesc. Good point. I agree that this function is a bit strange. I have to set tupdesc->nattrs to support compatibility with index_form_tuple(). I didn't want to add neither a new field to tupledesc nor a new parameter to index_form_tuple(), because they are used widely.
But I haven't considered the possibility of index_form_tuple() failure.
Fixed in this version of the patch. Now it creates a copy of tupledesc to pass it to index_form_tuple.
We'd discussed with other reviewers, they suggested index_truncate_tuple() instead of index_reform_tuple().Maybe also rename the function to index_build_key_tuple()?
I think that this name reflects the essence of the function clear enough and don't feel like renaming it again.
Since we don't use included columns for system indexes, there is no difference. I've just tried to minimize code changes here.@@ -368,7 +370,7 @@ systable_beginscan(Relation heapRelation, { int j; - for (j = 0; j < irel->rd_index->indnatts; j++) + for (j = 0; j < IndexRelationGetNumberOfAttributes(irel); j++){ if (key[i].sk_attno == irel->rd_index->indkey.values[j]) { @@ -376,7 +378,7 @@ systable_beginscan(Relation heapRelation, break; } } - if (j == irel->rd_index->indnatts) + if (j == IndexRelationGetNumberOfAttributes(irel)) elog(ERROR, "column is not in index"); }Not that it matters overly much, but why are we doing this for all attributes, rather than just key attributes?
I've replaced all the references to relnatts with macro, primarily to ensure that I won't miss anything that should use only key attributes.--- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -600,7 +600,7 @@ boot_openrel(char *relname) relname, (int) ATTRIBUTE_FIXED_PART_SIZE); boot_reldesc = heap_openrv(makeRangeVar(NULL, relname, -1), NoLock); - numattr = boot_reldesc->rd_rel->relnatts; + numattr = RelationGetNumberOfAttributes(boot_reldesc); for (i = 0; i < numattr; i++) { if (attrtypes[i] == NULL)That seems a bit unrelated.
Initially, I did to provide pg_get_constraintdef_worker() with info about included columns.@@ -2086,7 +2086,8 @@ StoreRelCheck(Relation rel, char *ccname, Node *expr, is_validated, RelationGetRelid(rel), /* relation */ attNos, /* attrs in the constraint */ - keycount, /* # attrs in the constraint */ + keycount, /* # key attrs in the constraint */ + keycount, /* # total attrs in the constraint */ InvalidOid, /* not a domain constraint */ InvalidOid, /* no associated index */ InvalidOid, /* Foreign key fields */It doesn't quite seem right to me to store this both in pg_index and pg_constraint.
Maybe it can be solved in some other way, but for now it is a tested and working implementation.
+opt_c_include: INCLUDE '(' columnList ')' { $$ = $3; } + | /* EMPTY */ { $$ = NIL; } + ;+opt_include: INCLUDE '(' index_including_params ')' { $$ = $3; } + | /* EMPTY */ { $$ = NIL; } + ; + +index_including_params: index_elem { $$ = list_make1($1); } + | index_including_params ',' index_elem { $$ = lappend($1, $3); } + ; +Why do we have multiple different definitions of this?
Hm,
columnList contains entries of columnElem type and index_including_params works with index_elem.
Is there a way they can be combined?
To be exact, it claimed that regarding insertion of new values, not index build.+ keytup = index_truncate_tuple(wstate->index, oitup);The code in _bt_split previously claimed that it's the only place doing truncation...
> It's the only point in insertion process, where we perform truncation
Other comments about code format, spelling and comments are fixed in attached patches.
Thank you again for reviewing.
-- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: