From bade818d2a77dd4f5cf93cfaba05f6a11899732c Mon Sep 17 00:00:00 2001
From: Kommi <haribabuk@fast.au.fujitsu.com>
Date: Mon, 18 Feb 2019 12:41:34 +1100
Subject: [PATCH 10/10] Table access method API explanation

Document the table access method APIs and their details.
---
 doc/src/sgml/am.sgml | 548 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 544 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/am.sgml b/doc/src/sgml/am.sgml
index 579187ed1b..d440ebeb58 100644
--- a/doc/src/sgml/am.sgml
+++ b/doc/src/sgml/am.sgml
@@ -18,12 +18,552 @@
   <para>
    All Tables in <productname>PostgreSQL</productname> are the primary
    data store. Each table is stored as its own physical <firstterm>relation</firstterm>
-   and so is described by an entry in the <structname>pg_class</structname>
-   catalog. The contents of an table are entirely under the control of its
-   access method. (All the access methods furthermore use the standard page
-   layout described in <xref linkend="storage-page-layout"/>.)
+   and is described by an entry in the <structname>pg_class</structname>
+   catalog. A table's content is entirely controlled by its access method, although
+   all access methods use the same standard page layout described in <xref linkend="storage-page-layout"/>.
   </para>
+  
+  <sect2 id="table-access-methods-api">
+   <title>Table access method API</title>
+   
+   <para>
+    Each table access method is described by a row in the
+    <link linkend="catalog-pg-am"><structname>pg_am</structname></link> system
+    catalog. The <structname>pg_am</structname> entry specifies a <firstterm>type</firstterm>
+    of the access method and a <firstterm>handler function</firstterm> for the
+    access method. These entries can be created and deleted using the <xref linkend="sql-create-access-method"/>
+    and <xref linkend="sql-drop-access-method"/> SQL commands.
+   </para>
+  
+   <para>
+    A table access method handler function must be declared to accept a
+    single argument of type <type>internal</type> and to return the
+    pseudo-type <type>table_am_handler</type>.  The argument is a dummy value that
+    simply serves to prevent handler functions from being called directly from
+    SQL commands.  The result of the function must be a palloc'd struct of
+    type <structname>TableAmRoutine</structname>, which contains everything
+    that the core code needs to know to make use of the table access method.
+    The <structname>TableAmRoutine</structname> struct, also called the access
+    method's <firstterm>API struct</firstterm>, includes fields specifying assorted
+    fixed properties of the access method, such as whether it can support
+    bitmap scans.  More importantly, it contains pointers to support
+    functions for the access method, which do all of the real work to access
+    tables.  These support functions are plain C functions and are not
+    visible or callable at the SQL level.  The support functions are declared
+    in the <structname>TableAmRoutine</structname> structure. For more details,
+    see the file <filename>src/include/access/tableam.h</filename>.
+   </para>
+   
+   <para>
+    Developers of a new table access method can refer to the existing <literal>heap</literal>
+    implementation in <filename>src/backend/access/heap/heapam_handler.c</filename> for an
+    example of how these functions are implemented.
+   </para>
+   
+   <para>
+    The API functions are grouped by purpose and described below.
+   </para>
+  
+   <sect3 id="slot-implementation-function">
+    <title>Slot implementation functions</title>
+     
+   <para>
+<programlisting>
+const TupleTableSlotOps *(*slot_callbacks) (Relation rel);
+</programlisting>
+  
+    This function returns the slot implementation that is specific to the AM.
+    The following predefined slot implementations are available:
+    <literal>TTSOpsVirtual</literal>, <literal>TTSOpsHeapTuple</literal>,
+    <literal>TTSOpsMinimalTuple</literal> and <literal>TTSOpsBufferHeapTuple</literal>.
+    An AM implementation can use any one of them. For more details on these slot
+    implementations, see <filename>src/include/executor/tuptable.h</filename>.
+   </para>
+   </sect3>
+   
+   <sect3 id="table-scan-functions">
+    <title>Table scan functions</title>
+     
+    <para>
+     The following functions are used to scan a table.
+    </para>
+   
+    <para>
+<programlisting>
+TableScanDesc (*scan_begin) (Relation rel,
+                             Snapshot snapshot,
+                             int nkeys, struct ScanKeyData *key,
+                             ParallelTableScanDesc parallel_scan,
+                             bool allow_strat,
+                             bool allow_sync,
+                             bool allow_pagemode,
+                             bool is_bitmapscan,
+                             bool is_samplescan,
+                             bool temp_snap);
+</programlisting>
+  
+     This function starts a scan of the relation <literal>rel</literal> using the specified options
+     and returns a <structname>TableScanDesc</structname>. <literal>parallel_scan</literal> can be used
+     by the AM if it supports parallel scans.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*scan_end) (TableScanDesc scan);
+</programlisting>
+  
+     This function ends a scan that was started by <literal>scan_begin</literal>.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*scan_rescan) (TableScanDesc scan, struct ScanKeyData *key, bool set_params,
+                            bool allow_strat, bool allow_sync, bool allow_pagemode);
+</programlisting>
+  
+     This function restarts a scan that was previously started by
+     <literal>scan_begin</literal>, using the provided options and releasing
+     any resources (such as buffer pins) that are held by the scan.
+    </para>
+   
+    <para>
+<programlisting>
+TupleTableSlot *(*scan_getnextslot) (TableScanDesc scan,
+                                     ScanDirection direction, TupleTableSlot *slot);
+</programlisting>
+  
+     This function returns the next qualifying tuple from the scan started by
+     <literal>scan_begin</literal>.
+    </para>
+    
+   </sect3>
+  
+   <sect3 id="parallel-table-scan-function">
+    <title>Parallel table scan functions</title>
+   
+    <para>
+     The following functions are used to perform a parallel scan.
+    </para>  
+   
+    <para>
+<programlisting>
+Size        (*parallelscan_estimate) (Relation rel);
+</programlisting>
+  
+     This function returns the total size that the AM requires to perform
+     a parallel table scan. The minimum size required is that of
+     <structname>ParallelBlockTableScanDescData</structname>.
+    </para>
+    
+    <para>
+<programlisting>
+Size        (*parallelscan_initialize) (Relation rel, ParallelTableScanDesc parallel_scan);
+</programlisting>
+  
+     This function initializes the <literal>parallel_scan</literal> structure
+     as required for the AM to perform a parallel scan, and returns
+     the total size that the AM requires to perform the parallel table scan.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*parallelscan_reinitialize) (Relation rel, ParallelTableScanDesc parallel_scan);
+</programlisting>
+  
+     This function reinitializes the parallel scan structure pointed to by <literal>parallel_scan</literal>.
+    </para>
+    
+   </sect3>
+ 
+   <sect3 id="index-scan-functions">
+    <title>Index scan functions</title>
+     
+    <para>
+<programlisting>
+struct IndexFetchTableData *(*begin_index_fetch) (Relation rel);
+</programlisting>
+  
+     This function returns an allocated and initialized <structname>IndexFetchTableData</structname>
+     structure that is used to fetch tuples from the table during an index scan.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*reset_index_fetch) (struct IndexFetchTableData *data);
+</programlisting>
+  
+     This function releases the AM-specific resources that are held by the <structname>IndexFetchTableData</structname>
+     of an index scan.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*end_index_fetch) (struct IndexFetchTableData *data);
+</programlisting>
+  
+     This function releases the AM-specific resources held by the <structname>IndexFetchTableData</structname>
+     of a given index scan and frees the memory of the <structname>IndexFetchTableData</structname> itself.
+    </para>
+    
+    <para>
+<programlisting>
+TransactionId (*compute_xid_horizon_for_tuples) (Relation rel,
+                                                 ItemPointerData *items,
+                                                 int nitems);
+</programlisting>
+  
+     This function returns the newest xid among the tuples identified by <literal>items</literal>. It is used
+     to compute which snapshots to conflict with when replaying WAL records
+     for page-level index vacuums.
+    </para>
+    
+   </sect3>
 
+   <sect3 id="manipulation-of-physical-tuples-functions">
+    <title>Functions for manipulating physical tuples</title>
+     
+    <para>
+<programlisting>
+void        (*tuple_insert) (Relation rel, TupleTableSlot *slot, CommandId cid,
+                             int options, struct BulkInsertStateData *bistate);
+</programlisting>
+  
+     This function inserts the tuple contained in the provided slot into the relation
+     and stores the tuple's unique identifier (its <literal>ItemPointerData</literal>)
+     in the slot. It uses the <literal>BulkInsertStateData</literal>, if one is provided.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*tuple_insert_speculative) (Relation rel,
+                                         TupleTableSlot *slot,
+                                         CommandId cid,
+                                         int options,
+                                         struct BulkInsertStateData *bistate,
+                                         uint32 specToken);
+</programlisting>
+  
+     This function is similar to <literal>tuple_insert</literal>, but it inserts the tuple
+     with the additional information that is necessary for speculative insertion. The insertion is
+     confirmed later, depending on whether the corresponding index insertion succeeds.
+    </para>
+    
+    <para>
+<programlisting>
+void        (*tuple_complete_speculative) (Relation rel,
+                                           TupleTableSlot *slot,
+                                           uint32 specToken,
+                                           bool succeeded);
+</programlisting>
+  
+     This function completes the speculative insertion of a tuple started by <literal>tuple_insert_speculative</literal>.
+     It is invoked after the index insertion has finished; <literal>succeeded</literal> indicates whether that insertion succeeded.
+    </para>
+   
+    <para>
+<programlisting>
+HTSU_Result (*tuple_delete) (Relation rel,
+                             ItemPointer tid,
+                             CommandId cid,
+                             Snapshot snapshot,
+                             Snapshot crosscheck,
+                             bool wait,
+                             HeapUpdateFailureData *hufd,
+                             bool changingPart);
+</programlisting>
+  
+     This function deletes the tuple of the relation identified by the ItemPointer <literal>tid</literal> and returns the
+     result of the operation. In case of failure, it fills in <literal>hufd</literal>.
+    </para>
+   
+    <para>
+<programlisting>
+HTSU_Result (*tuple_update) (Relation rel,
+                             ItemPointer otid,
+                             TupleTableSlot *slot,
+                             CommandId cid,
+                             Snapshot snapshot,
+                             Snapshot crosscheck,
+                             bool wait,
+                             HeapUpdateFailureData *hufd,
+                             LockTupleMode *lockmode,
+                             bool *update_indexes);
+</programlisting>
+  
+     This function replaces the tuple identified by the ItemPointer <literal>otid</literal> with the new tuple in <literal>slot</literal> and returns
+     the result of the operation. It also sets <literal>update_indexes</literal> to indicate whether the indexes need to be updated.
+     In case of failure, it fills in <literal>hufd</literal>.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*multi_insert) (Relation rel, TupleTableSlot **slots, int nslots,
+                             CommandId cid, int options, struct BulkInsertStateData *bistate);
+</programlisting>
+  
+     This function inserts multiple tuples into the relation, which allows faster data insertion.
+     It uses the <literal>BulkInsertStateData</literal>, if one is provided.
+    </para>
+   
+    <para>
+<programlisting>
+HTSU_Result (*tuple_lock) (Relation rel,
+                           ItemPointer tid,
+                           Snapshot snapshot,
+                           TupleTableSlot *slot,
+                           CommandId cid,
+                           LockTupleMode mode,
+                           LockWaitPolicy wait_policy,
+                           uint8 flags,
+                           HeapUpdateFailureData *hufd);
+</programlisting>
+  
+     This function locks the newest version of the tuple identified by the ItemPointer <literal>tid</literal>
+     and returns the result of the operation. In case of failure, it fills in <literal>hufd</literal>.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*finish_bulk_insert) (Relation rel, int options);
+</programlisting>
+  
+     This function performs the operations necessary to complete insertions made
+     via <literal>tuple_insert</literal> and <literal>multi_insert</literal> with a
+     BulkInsertState specified. For example, it may be used to flush the relation when
+     inserting while skipping WAL, or it may be a no-op.
+    </para>
+   
+   </sect3>
+  
+   <sect3 id="non-modifying-tuple-functions">
+    <title>Non-modifying tuple functions</title>
+     
+    <para>
+<programlisting>
+bool        (*tuple_fetch_row_version) (Relation rel,
+                                        ItemPointer tid,
+                                        Snapshot snapshot,
+                                        TupleTableSlot *slot,
+                                        Relation stats_relation);
+</programlisting>
+  
+     This function fetches the latest version of the tuple identified by the ItemPointer <literal>tid</literal>
+     and stores it in the slot. For example, in the heap AM, update chains are created
+     whenever a tuple is updated, so the function should fetch the latest tuple version.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*tuple_get_latest_tid) (Relation rel,
+                                     Snapshot snapshot,
+                                     ItemPointer tid);
+</programlisting>
+  
+     This function updates <literal>tid</literal> to the TID of the latest version of the tuple that it
+     identifies. For example, in the heap AM, update chains are created whenever
+     a tuple is updated; this function finds the latest ItemPointer in such a chain.
+    </para>
+   
+    <para>
+<programlisting>
+bool        (*tuple_fetch_follow) (struct IndexFetchTableData *scan,
+                                   ItemPointer tid,
+                                   Snapshot snapshot,
+                                   TupleTableSlot *slot,
+                                   bool *call_again, bool *all_dead);
+</programlisting>
+  
+     This function fetches the tuple identified by the ItemPointer <literal>tid</literal>, using the
+     IndexFetchTableData, stores it in the specified slot, and sets the <literal>call_again</literal> and <literal>all_dead</literal> flags.
+     It is called during index scans.
+    </para>
+   
+    <para>
+<programlisting>
+bool        (*tuple_satisfies_snapshot) (Relation rel,
+                                         TupleTableSlot *slot,
+                                         Snapshot snapshot);
+</programlisting>
+  
+     This function checks the visibility of the tuple in the slot against the provided snapshot and returns
+     <literal>true</literal> if the tuple is visible, otherwise <literal>false</literal>.
+    </para>
+    
+   </sect3>
+   
+   <sect3 id="ddl-related-functions">
+    <title>DDL-related functions</title>
+     
+    <para>
+<programlisting>
+void        (*relation_set_new_filenode) (Relation rel,
+                                          char persistence,
+                                          TransactionId *freezeXid,
+                                          MultiXactId *minmulti);
+</programlisting>
+  
+     This function creates the storage that is necessary to store the tuples of the relation
+     and also sets the minimum XID that tuples inserted into the relation can have. For example, the heap AM
+     should create the relfilenode that is necessary to store the heap tuples.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*relation_nontransactional_truncate) (Relation rel);
+</programlisting>
+  
+     This function truncates the specified relation; the operation is non-transactional and cannot be rolled back.
+    </para>
+  
+    <para>
+<programlisting>
+void        (*relation_copy_data) (Relation rel, RelFileNode newrnode);
+</programlisting>
+  
+     This function copies the relation's data from its existing filenode to the new filenode
+     specified by <literal>newrnode</literal> and removes the existing filenode.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*relation_vacuum) (Relation onerel, int options,
+                                struct VacuumParams *params, BufferAccessStrategy bstrategy);
+</programlisting>
+  
+     This function vacuums the relation according to the specified params.
+     It gathers all the dead tuples of the relation and removes them, including
+     their entries in the indexes.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*scan_analyze_next_block) (TableScanDesc scan, BlockNumber blockno,
+                                        BufferAccessStrategy bstrategy);
+</programlisting>
+  
+     This function prepares the relation block <literal>blockno</literal> for tuple analysis. The results of the
+     analysis are used by the planner to optimize query plans on this relation.
+    </para>
+   
+    <para>
+<programlisting>
+bool        (*scan_analyze_next_tuple) (TableScanDesc scan, TransactionId OldestXmin,
+                                        double *liverows, double *deadrows, TupleTableSlot *slot);
+</programlisting>
+  
+     This function gets the next visible tuple from the block being scanned, based on the snapshot,
+     and also updates the number of live and dead tuples encountered.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*relation_copy_for_cluster) (Relation NewHeap, Relation OldHeap, Relation OldIndex,
+                                          bool use_sort,
+                                          TransactionId OldestXmin, TransactionId FreezeXid, MultiXactId MultiXactCutoff,
+                                          double *num_tuples, double *tups_vacuumed, double *tups_recently_dead);
+</programlisting>
+  
+     This function makes a copy of the contents of a relation, optionally sorted either by using the specified index or by sorting
+     explicitly. It also removes dead tuples.
+    </para>
+   
+    <para>
+<programlisting>
+double      (*index_build_range_scan) (Relation heap_rel,
+                                       Relation index_rel,
+                                       IndexInfo *index_nfo,
+                                       bool allow_sync,
+                                       bool anyvisible,
+                                       BlockNumber start_blockno,
+                                       BlockNumber end_blockno,
+                                       IndexBuildCallback callback,
+                                       void *callback_state,
+                                       TableScanDesc scan);
+</programlisting>
+  
+     This function scans the specified blocks of the given relation and inserts the tuples into the specified index
+     using the provided callback function.
+    </para>
+   
+    <para>
+<programlisting>
+void        (*index_validate_scan) (Relation heap_rel,
+                                    Relation index_rel,
+                                    IndexInfo *index_info,
+                                    Snapshot snapshot,
+                                    struct ValidateIndexState *state);
+</programlisting>
+  
+     This function scans the table according to the given snapshot and inserts tuples
+     satisfying the snapshot into the specified index, provided their TIDs are
+     not already present in the <structname>ValidateIndexState</structname> struct;
+     this function is used in the last phase of a concurrent index build.
+    </para>
+   
+   </sect3>
+   
+   <sect3 id="planner-functions">
+    <title>Planner functions</title>
+     
+    <para>
+<programlisting>
+void        (*relation_estimate_size) (Relation rel, int32 *attr_widths,
+                                       BlockNumber *pages, double *tuples, double *allvisfrac);
+</programlisting>
+  
+     This function estimates the size of the relation, returning the number of
+     pages, the number of tuples, and the fraction of all-visible pages of the relation.
+    </para>
+    
+   </sect3>
+   
+   <sect3 id="executor-functions">
+    <title>Executor functions</title>
+     
+    <para>
+<programlisting>
+bool        (*scan_bitmap_pagescan) (TableScanDesc scan,
+                                     TBMIterateResult *tbmres);
+</programlisting>
+  
+     This function scans the relation block specified in the scan descriptor and collects the
+     tuples requested by <literal>tbmres</literal> that are visible.
+    </para>
+  
+    <para>
+<programlisting>
+bool        (*scan_bitmap_pagescan_next) (TableScanDesc scan,
+                                          TupleTableSlot *slot);
+</programlisting>
+  
+     This function gets the next tuple from the set of tuples of the page specified in the scan descriptor
+     and stores it in the provided slot; it returns false if there are no more tuples.
+    </para>
+   
+    <para>
+<programlisting>
+bool        (*scan_sample_next_block) (TableScanDesc scan,
+                                       struct SampleScanState *scanstate);
+</programlisting>
+  
+     This function selects the next block of the relation, either using the given sampling method or sequentially, and
+     stores its information in the scan descriptor.
+    </para>
+   
+    <para>
+<programlisting>
+bool        (*scan_sample_next_tuple) (TableScanDesc scan,
+                                       struct SampleScanState *scanstate,
+                                       TupleTableSlot *slot);
+</programlisting>
+  
+     This function gets the next tuple to sample from the current sampling block, based on
+     the sampling method; otherwise, it gets the next visible tuple of the block that was
+     chosen by <literal>scan_sample_next_block</literal>.
+    </para>
+    
+  </sect3>  
+  </sect2>
  </sect1> 
  
  <sect1 id="index-access-methods">
-- 
2.20.1.windows.1