Re: Pluggable toaster - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Pluggable toaster |
Date | |
Msg-id | CALDaNm2cQW5bP-y2BgdmbHqMkuATchxhvBfpbA62ZkZsKi-Rxg@mail.gmail.com Whole thread Raw |
In response to | Re: Pluggable toaster (Nikita Malakhov <hukutoc@gmail.com>) |
Responses |
Re: Pluggable toaster
|
List | pgsql-hackers |
On Tue, 27 Dec 2022 at 02:32, Nikita Malakhov <hukutoc@gmail.com> wrote: > > Hi hackers! > > Pluggable TOAST API with catalog control table PG_TOASTREL - pre-patch. > > Pluggable TOAST - TOAST API rework - introduce PG_TOASTREL catalog > relation containing TOAST dependencies. NOTE: here is a pre-patch, not > a final version, just to introduce another approach to a Pluggable TOAST > idea, it needs some cleanup, tests rework and some improvements, so > the main > goal of this message is to introduce this different approach. This is the > last patch and it is installed on top of older TOAST API patches, so here > are 3 patches attached: > > 0001_toaster_interface_v24.patch.gz > This patch introduces new custom TOAST pointer, Pluggable TOAST API and > Toaster support functions - cache, lookup, and new attribute 'atttoaster' > in PG_ATTRIBUTE table which stores Toaster OID; > > 0002_toaster_default_v24.patch.gz > Here the default TOAST mechanics is routed via TOAST API, but still using > varatt_external TOAST Pointer - so this step does not change overall TOAST > mechanics unless you plug in some custom Toaster; > > 0003_pg_toastrel_table_v24.patch.gz > Here Pluggable TOAST is reworked not to modify PG_ATTRIBUTE, instead this > patch introduces new catalog table PG_TOASTREL with its support functions. > > Motivation: PG_ATTRIBUTE is already the largest catalog table. We try > to avoid modification of existing catalog tables, and previous solution > had several problems: > 1) New field in PG_ATTRIBUTE; > 2) No opportunity to save all Toaster assignment history; > 3) No opportunity to have multi-TOAST tables assigned to a relation or > an attribute; > 4) Toaster cannot be dropped - to drop Toaster we need to scan all tables > with TOASTable columns. > > Instead of extending PG_ATTRIBUTE with ATTTOASTER attribute, we decided > to store all Table-Toaster relations in a new catalog table PG_TOASTREL. > This cancels the necessity to modify catalog table PG_ATTRIBUTE, allows to store > full history of Toasters assignments, and allows to drop unused Toasters > from system. > > Toasters are assigned to a table column. ALTER TABLE ... SET TOASTER command > creates a new row in PG_TOASTREL. To distinguish sequential assignments, > PG_TOASTREL has special attribute - 'version'. With each new assignment > its 'version' attribute is increased, and the row with the biggest 'version' > is the current Toaster for a column. > > This approach allows to provide different behavior, even for a single table > we can have one TOAST table for the whole relation (as it is in current TOAST > mechanics), or we can have separate TOAST relation(s) for each TOASTable > column - this requires a slight modification if current approach. The latter > also allows simple invariant of column-oriented storage. > > Also, this approach makes PG_ATTRIBUTE attribute RELTOASTRELID obsolete - > current mechanics allows only 1 TOAST table for relation, which limits > greatly TOAST capabilities - because all TOASTed columns are stored in this > table, which in its turn limits overall base relation capacity. > > In future, this approach allows us to have a kind of near infinite TOAST > storage, with ability to store large values (larger than 512 Mbytes), > auto-creation of TOAST table only when the first value is actually TOASTed, > and much more. > > The approach, along with the TOAST API itself, introduces the catalog table > PG_TOASTREL with a set of support functions. > > PG_TOASTREL definition: > > postgres@postgres=# \d+ pg_toastrel; > Table "pg_catalog.pg_toastrel" > Column | Type | Collation | Nullable | Default | Storage | Toaster | Compression | Stats target | Description > -------------+----------+-----------+----------+---------+---------+---------+-------------+--------------+------------- > oid | oid | | not null | | plain | | | | > toasteroid | oid | | not null | | plain | | | | > relid | oid | | not null | | plain | | | | > toastentid | oid | | not null | | plain | | | | > attnum | smallint | | not null | | plain | | | | > version | smallint | | not null | | plain | | | | > relname | name | | not null | | plain | | | | > toastentname | name | | not null | | plain | | | | > flag | "char" | | not null | | plain | | | | > toastoptions | "char" | | not null | | plain | | | | > Indexes: > "pg_toastrel_oid_index" PRIMARY KEY, btree (oid) > "pg_toastrel_name_index" UNIQUE CONSTRAINT, btree (toasteroid, relid, version, attnum) > "pg_toastrel_rel_index" btree (relid, attnum) > "pg_toastrel_tsr_index" btree (toasteroid) > Access method: heap > (This is not a final definition) > > Where: > oid - PG_TOASTREL record ID > toasteroid - Toaster OID from PG_TOASTER > relid - base relation OID > toastentid - TOAST entity OID (not necessary to be a table) > attnum - TOASTable attribute index in base relation > version - Toaster assignment version - sequence of assignments > relname - base relation name (optional) > toastentname - TOAST entity name (optional) > flag - special field to mark rows, currently only the value 'x' is used > to mark unused rows > > PG_TOASTREL unique key consists of: > toasteroid, relid, attnum, version > > All currently assigned Toasters are additionally stored in cache for > fast access. When new row is being TOASTed - Toaster, relation Oid, > TOAST relation Oid, column index are added into Toastrel Cache for fast > access. > > Create table, change Toaster, change column type were changed to > add new rows in PG_TOASTREL, to use this table and cache instead > of altering pg_attribute with new column. For table creation from > scratch when no TOAST tables were created is used special condition > with version=0. > > DROP TABLE drops rows in PG_TOASTREL for this table. This allows to - > DROP TOASTER command added. When no rows with the according Toaster are > present in PG_TOASTREL - it is considered unused and thus could be safely > dropped from the system. > > Default toaster 'deftoaster' (reference TOAST mechanics) cannot be dropped. > > Working branch: > https://github.com/postgrespro/postgres/tree/toastapi_with_ctl > > Would be glad to get any proposals and objections. The patch does not apply on top of HEAD as in [1], please post a rebased patch: patching file src/backend/utils/cache/syscache.c === Applying patches on top of PostgreSQL commit ID 33ab0a2a527e3af5beee3a98fc07201e555d6e45 === === applying patch ./0001_toaster_interface_v24.patch patching file contrib/test_decoding/expected/ddl.out Hunk #2 FAILED at 874. 1 out of 2 hunks FAILED -- saving rejects to file src/backend/utils/cache/syscache.c.rej [1] - http://cfbot.cputube.org/patch_41_3490.log Regards, Vignesh
pgsql-hackers by date: