Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | CAM3SWZQLLYajNy-5K702RTWMeRkUEEBtYJbb+ifYRQOCHNx6=A@mail.gmail.com Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: WIP: Covering + unique indexes.
|
List | pgsql-hackers |
On Tue, Apr 5, 2016 at 1:31 PM, Peter Geoghegan <pg@heroku.com> wrote: > My new understanding: The extra "included" columns are stored in the > index, but do not affect its sort order at all. They are no more part > of the key than, say, the heap TID that the key points to. They are > just "payload". Noticed a few issues following another pass: * tuplesort.c should handle the CLUSTER case in the same way as the btree case. No? * Why have a RelationGetNumberOfAttributes(indexRel) call in tuplesort_begin_index_btree() at all now? * This critical section is unnecessary, because this happens during index builds: + if (indnkeyatts != indnatts && P_ISLEAF(opageop)) + { + /* + * It's essential to truncate High key here. + * The purpose is not just to save more space on this particular page, + * but to keep whole b-tree structure consistent. Subsequent insertions + * assume that hikey is already truncated, and so they should not + * worry about it, when copying the high key into the parent page + * as a downlink. + * NOTE It is not crutial for reliability in present, + * but maybe it will be that in the future. + * NOTE this code will be changed by the "btree compression" patch, + * which is in progress now. + */ + keytup = index_reform_tuple(wstate->index, oitup, + indnatts, indnkeyatts); + + /* delete "wrong" high key, insert keytup as P_HIKEY. */ + START_CRIT_SECTION(); + PageIndexTupleDelete(opage, P_HIKEY); + + if (!_bt_pgaddtup(opage, IndexTupleSize(keytup), keytup, P_HIKEY)) + elog(ERROR, "failed to rewrite compressed item in index \"%s\"", + RelationGetRelationName(wstate->index)); + END_CRIT_SECTION(); + } Note that START_CRIT_SECTION() promotes any ERROR to PANIC, which isn't useful here, because we have no buffer lock held, and nothing must be WAL-logged. * Think you forgot to update spghandler(). (You did not add a test for just that one AM, either) * I wonder why this restriction needs to exist: + else + elog(ERROR, "Expressions are not supported in included columns."); What does not supporting it buy us? Was it just that the pg_index representation is more complicated, and you wanted to put it off? An error like this should use ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED ..., btw. * I would like to see index_reform_tuple() assert that the new, truncated index tuple is definitely <= the original (I worry about the 1/3 page restriction issue). Maybe you should also change the name of index_reform_tuple(), per David. * There is some stray whitespace within RelationGetIndexAttrBitmap(). I think you should have updated it with code, though. I don't think it's necessary for HOT updates to work, but I think it could be necessary so that we don't need to get a row lock that blocks non-conflict foreign key locking (see heap_update() callers). I think you need to be careful for non-key columns within the loop in RelationGetIndexAttrBitmap(), basically, because it seems to still go through all columns. UPSERT also must call this code, FWIW. * I think that a similar omission is also made for the replica identity stuff in RelationGetIndexAttrBitmap(). Some thought is needed on how this patch interacts with logical decoding, I guess. * Valgrind shows an error with an aggregate statement I tried: 2016-04-05 17:01:31.129 PDT 12310 LOG: statement: explain analyze select count(*) from ab where b > 5 group by a, b; ==12310== Invalid read of size 4 ==12310== at 0x656615: match_clause_to_indexcol (indxpath.c:2226) ==12310== by 0x656615: match_clause_to_index (indxpath.c:2144) ==12310== by 0x656DBC: match_clauses_to_index (indxpath.c:2115) ==12310== by 0x658054: match_restriction_clauses_to_index (indxpath.c:2026) ==12310== by 0x658054: create_index_paths (indxpath.c:269) ==12310== by 0x64D1DB: set_plain_rel_pathlist (allpaths.c:649) ==12310== by 0x64D1DB: set_rel_pathlist (allpaths.c:427) ==12310== by 0x64D93B: set_base_rel_pathlists (allpaths.c:299) ==12310== by 0x64D93B: make_one_rel (allpaths.c:170) ==12310== by 0x66876C: query_planner (planmain.c:246) ==12310== by 0x669FBA: grouping_planner (planner.c:1666) ==12310== by 0x66D0C9: subquery_planner (planner.c:751) ==12310== by 0x66D3DA: standard_planner (planner.c:300) ==12310== by 0x66D714: planner (planner.c:170) ==12310== by 0x6FD692: pg_plan_query (postgres.c:798) ==12310== by 0x59082D: ExplainOneQuery (explain.c:350) ==12310== Address 0xbff290c is 2,508 bytes inside a block of size 8,192 alloc'd ==12310== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==12310== by 0x81B7FA: AllocSetAlloc (aset.c:853) ==12310== by 0x81D257: palloc (mcxt.c:907) ==12310== by 0x4B6F65: RelationGetIndexScan (genam.c:94) ==12310== by 0x4C135D: btbeginscan (nbtree.c:431) ==12310== by 0x4B7A5C: index_beginscan_internal (indexam.c:279) ==12310== by 0x4B7C5A: index_beginscan (indexam.c:222) ==12310== by 0x4B73D1: systable_beginscan (genam.c:379) ==12310== by 0x7E8CF9: ScanPgRelation (relcache.c:341) ==12310== by 0x7EB3C4: RelationBuildDesc (relcache.c:951) ==12310== by 0x7ECD35: RelationIdGetRelation (relcache.c:1800) ==12310== by 0x4A4D37: relation_open (heapam.c:1118) ==12310== { <insert_a_suppression_name_here> Memcheck:Addr4 fun:match_clause_to_indexcol fun:match_clause_to_index fun:match_clauses_to_index fun:match_restriction_clauses_to_index fun:create_index_paths fun:set_plain_rel_pathlist fun:set_rel_pathlist fun:set_base_rel_pathlists fun:make_one_rel fun:query_planner fun:grouping_planner fun:subquery_planner fun:standard_planner fun:planner fun:pg_plan_query fun:ExplainOneQuery } Separately, I tried "make installcheck-tests TESTS=index_including" from Postgres + Valgrind, with Valgrind's --track-origins option enabled (as it was above). I recommend installing Valgrind, and making sure that the patch shows no errors. I didn't actually find a Valgrind issue from just using your regression tests (nor did I find an issue from separately running the regression tests with CLOBBER_CACHE_ALWAYS, FWIW). -- Peter Geoghegan
pgsql-hackers by date: