Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Thom Brown |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | CAA-aLv7mWdu5WCrUH0sw79abDsq4tdZEpkvV8SfgNyLp92+z6Q@mail.gmail.com Whole thread Raw |
In response to | WIP: Covering + unique indexes. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>) |
Responses |
Re: WIP: Covering + unique indexes.
|
List | pgsql-hackers |
On 8 October 2015 at 16:18, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote: > > Hi hackers, > > I'm working on a patch that allows to combine covering and unique > functionality for btree indexes. > > Previous discussion was here: > 1) Proposal thread > 2) Message with proposal clarification > > In a nutshell, the feature allows to create index with "key" columns and > "included" columns. > "key" columns can be used as scan keys. Unique constraint relates only to > "key" columns. > "included" columns may be used as scan keys if they have suitable opclass. > Both "key" and "included" columns can be returned from index by > IndexOnlyScan. > > Btree is the default index and it's used everywhere. So it requires properly > testing. Volunteers are welcome) > > Use case: > - We have a table (c1, c2, c3, c4); > - We need to have an unique index on (c1, c2). > - We would like to have a covering index on all columns to avoid reading of > heap pages. > > Old way: > CREATE UNIQUE INDEX olduniqueidx ON oldt USING btree (c1, c2); > CREATE INDEX oldcoveringidx ON oldt USING btree (c1, c2, c3, c4); > > What's wrong? > Two indexes contain repeated data. Overhead to data manipulation operations > and database size. > > New way: > CREATE UNIQUE INDEX newidx ON newt USING btree (c1, c2) INCLUDING (c3, c4); > > The patch is attached. > In 'test.sql' you can find a test with detailed comments on each step, and > comparison of old and new indexes. > > New feature has following syntax: > CREATE UNIQUE INDEX newidx ON newt USING btree (c1, c2) INCLUDING (c3, c4); > Keyword INCLUDING defines the "included" columns of index. These columns > aren't concern to unique constraint. > Also, them are not stored in index inner pages. It allows to decrease index > size. > > Results: > 1) Additional covering index is not required anymore. > 2) New index can use IndexOnlyScan on queries, where old index can't. > > For example, > explain analyze select c1, c2 from newt where c1<10000 and c3<20; > > *more examples in 'test.sql' > > Future work: > To do opclasses for "included" columns optional. > > CREATE TABLE tbl (c1 int, c4 box); > CREATE UNIQUE INDEX idx ON tbl USING btree (c1) INCLUDING (c4); > > If we don't need c4 as an index scankey, we don't need any btree opclass on > it. > But we still want to have it in covering index for queries like > > SELECT c4 FROM tbl WHERE c1=1000; > SELECT * FROM tbl WHERE c1=1000; The definition output needs a space after "INCLUDING": # SELECT pg_get_indexdef('people_first_name_last_name_email_idx'::regclass::oid); pg_get_indexdef --------------------------------------------------------------------------------------------------------------------------CREATE UNIQUEINDEX people_first_name_last_name_email_idx ON people USING btree (first_name, last_name) INCLUDING(email) (1 row) There is also no collation output: # CREATE UNIQUE INDEX test_idx ON people (first_name COLLATE "en_GB", last_name) INCLUDING (email COLLATE "pl_PL"); CREATE INDEX # SELECT pg_get_indexdef('test_idx'::regclass::oid); pg_get_indexdef -------------------------------------------------------------------------------------------------------------CREATE UNIQUEINDEX test_idx ON people USING btree (first_name COLLATE "en_GB", last_name) INCLUDING(email) (1 row) As for functioning, it works as described: # EXPLAIN SELECT email FROM people WHERE (first_name,last_name) = ('Paul','Freeman'); QUERY PLAN ----------------------------------------------------------------------------------------------------------Index Only Scanusing people_first_name_last_name_email_idx on people(cost=0.28..1.40 rows=1 width=21) Index Cond: ((first_name = 'Paul'::text)AND (last_name = 'Freeman'::text)) (2 rows) Typo: "included columns must not intersects with key columns" should be: "included columns must not intersect with key columns" One thing I've noticed you can do with your patch, which you haven't mentioned, is have a non-unique covering index: # CREATE INDEX covering_idx ON people (first_name) INCLUDING (last_name); CREATE INDEX # EXPLAIN SELECT first_name, last_name FROM people WHERE first_name = 'Paul'; QUERY PLAN ---------------------------------------------------------------------------------Index Only Scan using covering_idx on people (cost=0.28..1.44 rows=4 width=13) Index Cond: (first_name = 'Paul'::text) (2 rows) But this appears to behave as if it were a regular multi-column index, in that it will use the index for ordering rather than sort after fetching from the index. So is this really stored the same as a multi-column index? The index sizes aren't identical, so something is different. Thom
pgsql-hackers by date: