Re: Doc updates for index cost estimator change - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Doc updates for index cost estimator change |
Date | |
Msg-id | 200006080234.WAA06592@candle.pha.pa.us Whole thread Raw |
In response to | Doc updates for index cost estimator change (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Doc updates for index cost estimator change
|
List | pgsql-hackers |
Has this been dealt with? > I have written the attached bit of doco about the new index cost > estimator procedure definition, but I am not sure where to put it. > There isn't (AFAICT) any existing documentation about how to make > a new kind of index, which would be the proper place for it. > May I impose on you to find/make a place for this and mark it up > properly? > > Also, doc/src/graphics/catalogs.ag needs to be updated, but I have > no idea how. (The amopselect and amopnpages fields of pg_amop > are gone; pg_am has a new field amcostestimate.) > > regards, tom lane > > > Index cost estimation functions > ------------------------------- > > Every index access method must provide a cost estimation function for > use by the planner/optimizer. The procedure OID of this function is > given in the 'amcostestimate' field of the access method's pg_am entry. > (Note: prior to Postgres 7.0, a different scheme was used for registering > index-specific cost estimation functions.) > > The amcostestimate function is given a list of WHERE clauses that have > been determined to be usable with the index. It must return estimates > of the cost of accessing the index and the selectivity of the WHERE > clauses (that is, the fraction of main-table tuples that will be > retrieved during the index scan). For simple cases, nearly all the > work of the cost estimator can be done by calling standard routines > in the optimizer --- the point of having an amcostestimate function is > to allow index access methods to provide index-type-specific knowledge, > in case it is possible to improve on the standard estimates. > > Each amcostestimate function must have the signature > > void > amcostestimate (Query *root, > RelOptInfo *rel, > IndexOptInfo *index, > List *indexQuals, > Cost *indexAccessCost, > Selectivity *indexSelectivity); > > The first four parameters are inputs: > > root The query being processed > rel The relation the index is on > index The index itself > indexQuals List of index qual clauses (implicitly ANDed); > NIL list indicates no qualifiers are available > > The last two parameters are pass-by-reference outputs: > > *indexAccessCost Set to cost of index processing > *indexSelectivity Set to index selectivity > > Note that cost estimate functions must be written in C, not in SQL or > any available procedural language, because they must access internal > data structures of the planner/optimizer. > > The indexAccessCost should be computed in the units used by > src/backend/optimizer/path/costsize.c: a disk block fetch has cost 1.0, > and the cost of processing one index tuple should usually be taken as > cpu_index_page_weight (which is a user-adjustable optimizer parameter). > The access cost should include all disk and CPU costs associated with > scanning the index itself, but NOT the cost of retrieving or processing > the main-table tuples that are identified by the index. > > The indexSelectivity should be set to the estimated fraction of the main > table tuples that will be retrieved during the index scan. In the case > of a lossy index, this will typically be higher than the fraction of > tuples that actually pass the given qual conditions. > > A typical cost estimator will proceed as follows: > > 1. Estimate and return the fraction of main-table tuples that will be visited > based on the given qual conditions. In the absence of any index-type-specific > knowledge, use the standard optimizer function clauselist_selec(): > > *indexSelectivity = clauselist_selec(root, indexQuals); > > 2. Estimate the number of index tuples that will be visited during the > scan. For many index types this is the same as indexSelectivity times > the number of tuples in the index, but it might be more. (Note that the > index's size in pages and tuples is available from the IndexOptInfo struct.) > > 3. Estimate the number of index pages that will be retrieved during the scan. > This might be just indexSelectivity times the index's size in pages. > > 4. Compute the index access cost as > > *indexAccessCost = numIndexPages + cpu_index_page_weight * numIndexTuples; > > Examples of cost estimator functions can be found in > src/backend/utils/adt/selfuncs.c. > > By convention, the pg_proc entry for an amcostestimate function should show > > prorettype = 0 > pronargs = 6 > proargtypes = 0 0 0 0 0 0 > > We use zero ("opaque") for all the arguments since none of them have types > that are known in pg_type. > > ************ > -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: