Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Date | |
Msg-id | CAH2-WzmQGYDDoAETGhpGtJQRv_uFHMjvQZ6JdLV-sxGoCgLBNg@mail.gmail.com |
In response to | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
List | pgsql-hackers |
On Thu, Feb 6, 2020 at 6:18 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is v32, which is even closer to being committable.

Attached is v33, which adds the last piece we need: opclass infrastructure that tells nbtree whether or not deduplication can be applied safely. This is based on work by Anastasia that was shared with me privately.

I may not end up committing 0001-* as a separate patch, but it makes sense to post it that way to make review easier -- this is supposed to be infrastructure that isn't just useful for the deduplication patch. 0001-* adds a new C function, _bt_allequalimage(), which only actually gets called within code added by 0002-* (i.e. the patch that adds the deduplication feature). At this point, my main concern is that I might not have the API exactly right in a world where these new support functions are used by more than just the nbtree deduplication feature. I would like to get detailed review of the new opclass infrastructure stuff, and have asked for it directly, but I don't think that committing the patch needs to block on that.

I've now written a fair amount of documentation for both the feature and the underlying opclass infrastructure. It probably needs a bit more copy-editing, but I think that it's generally in fairly good shape. It might be a good idea for those who would like to review the opclass stuff to start with some of my btree.sgml changes, and work backwards -- the shape of the API itself is the important thing within the 0001-* patch.

New opclass proc
================

In general, supporting deduplication is the rule for B-Tree opclasses, rather than the exception. Most can use the generic btequalimagedatum() routine as their support function 4, which unconditionally indicates that deduplication is safe. There is a new test that tries to catch opclasses that omitted to do this.
Here are the opr_sanity.out changes added by the first patch:

-- Almost all Btree opclasses can use the generic btequalimagedatum function
-- as their equalimage proc (support function 4). Look for opclasses that
-- don't do so; newly added Btree opclasses will usually be able to support
-- deduplication with little trouble.
SELECT amproc::regproc AS proc, opf.opfname AS opfamily_name,
       opc.opcname AS opclass_name, opc.opcintype::regtype AS opcintype
FROM pg_am am
JOIN pg_opclass opc ON opc.opcmethod = am.oid
JOIN pg_opfamily opf ON opc.opcfamily = opf.oid
LEFT JOIN pg_amproc ON amprocfamily = opf.oid AND
    amproclefttype = opcintype AND
    amprocnum = 4
WHERE am.amname = 'btree' AND
    amproc IS DISTINCT FROM 'btequalimagedatum'::regproc
ORDER BY amproc::regproc::text, opfamily_name, opclass_name;
       proc        |  opfamily_name   |   opclass_name   |    opcintype
-------------------+------------------+------------------+------------------
 bpchar_equalimage | bpchar_ops       | bpchar_ops       | character
 btnameequalimage  | text_ops         | name_ops         | name
 bttextequalimage  | text_ops         | text_ops         | text
 bttextequalimage  | text_ops         | varchar_ops      | text
                   | array_ops        | array_ops        | anyarray
                   | enum_ops         | enum_ops         | anyenum
                   | float_ops        | float4_ops       | real
                   | float_ops        | float8_ops       | double precision
                   | jsonb_ops        | jsonb_ops        | jsonb
                   | money_ops        | money_ops        | money
                   | numeric_ops      | numeric_ops      | numeric
                   | range_ops        | range_ops        | anyrange
                   | record_image_ops | record_image_ops | record
                   | record_ops       | record_ops       | record
                   | tsquery_ops      | tsquery_ops      | tsquery
                   | tsvector_ops     | tsvector_ops     | tsvector
(16 rows)

Those types/opclasses that you see here with a "proc" that is NULL cannot use deduplication under any circumstances -- they have no pg_amproc entry for B-Tree support function 4. The other four rows at the start (those with a non-NULL "proc") are for collatable types, where using deduplication is conditioned on not using a nondeterministic collation.
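
To make the rule above concrete, here is a rough standalone model of the per-column decision. It is only a sketch, not code from either patch: KeyColumn, equalimage_fn, generic_equalimage(), collatable_equalimage() and all_equalimage() are hypothetical names, and passing the collation's determinism as an explicit flag is an assumption made purely for illustration. It also assumes that deduplication is treated as safe only when every key column's opclass says so.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of an "equalimage" support function: reports whether
 * equal datums are guaranteed to be interchangeable, given whether the
 * column's collation is deterministic.  (Illustration only.)
 */
typedef bool (*equalimage_fn) (bool collation_is_deterministic);

/* Models a generic proc that unconditionally reports "safe" */
static bool
generic_equalimage(bool collation_is_deterministic)
{
    (void) collation_is_deterministic;  /* unused: the answer never varies */
    return true;
}

/* Models a collatable type: safe only with a deterministic collation */
static bool
collatable_equalimage(bool collation_is_deterministic)
{
    return collation_is_deterministic;
}

/* One key column of a hypothetical index */
typedef struct KeyColumn
{
    equalimage_fn proc;         /* NULL models "no support function 4" */
    bool        collation_is_deterministic;
} KeyColumn;

/* Assumed rule: deduplication is safe only if every key column says so */
static bool
all_equalimage(const KeyColumn *cols, size_t ncols)
{
    for (size_t i = 0; i < ncols; i++)
    {
        /* No equalimage proc at all: never safe (e.g. the NULL rows above) */
        if (cols[i].proc == NULL)
            return false;
        /* Proc exists but reports "unsafe" for this column's collation */
        if (!cols[i].proc(cols[i].collation_is_deterministic))
            return false;
    }
    return true;
}
```

Under this model, an index on (int4, text with a deterministic collation) comes out as safe, while a single column with a NULL proc (as for numeric_ops in the output above), or a text column with a nondeterministic collation, makes the whole index unsafe.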
The details are in the sgml docs for the second patch, where I go into the issue with numeric display scale, why nondeterministic collations disable the use of deduplication, etc.

Note that these "equalimage" procs don't take any arguments, which is a first for an index AM support function. Even still, we can take a collation at CREATE INDEX time using the standard PG_GET_COLLATION() mechanism. I suppose that it's a little bit odd to have no arguments but still call PG_GET_COLLATION() in certain support functions. Still, it works just fine, at least as far as the needs of deduplication are concerned.

Since using deduplication is supposed to pretty much be the norm from now on, it seemed like it might make sense to add a NOTICE about it during CREATE INDEX -- a notice letting the user know that it isn't being used due to a lack of opclass support:

regression=# create table foo(bar numeric);
CREATE TABLE
regression=# create index on foo(bar);
NOTICE: index "foo_bar_idx" cannot use deduplication
CREATE INDEX

Note that this NOTICE isn't seen with an INCLUDE index, since that's expected to not support deduplication.

I have a feeling that not everybody will like this, which is why I'm pointing it out.

Thoughts?
--
Peter Geoghegan