Re: Indexes and statistics - Mailing list pgsql-sql
From | David Witham |
---|---|
Subject | Re: Indexes and statistics |
Date | |
Msg-id | CFA248776934FD43847E740E43C346D199DC18@ozimelb03.ozicom.com Whole thread Raw |
In response to | Indexes and statistics ("David Witham" <davidw@unidial.com.au>) |
Responses |
Re: Indexes and statistics
|
List | pgsql-sql |
Tom, I'm running PostgreSQL 7.3.2 on Red Hat Linux 7.3 with 512Mb RAM. The table definition is: Table "public.cdr" Column | Type | Modifiers -----------------------+-----------------------+-----------carrier_id | integer | not nullfile_id | integer | not nullservice_num | character varying(10) | not nullday | date | not nulltime | integer | not nulldestination | character varying(20) | not nullduration | integer | not nullcharge_wholesale |numeric(8,2) | not nullcharge_band_id | integer | charge_retail | numeric(8,2) | not nullrate_plan_id | integer | not nullitem_code | integer | notnullcust_id | integer | not nullbill_id | integer | prefix | character varying(12) | charge_wholesale_calc | numeric(8,2) | Indexes: cdr_ix1 btree ("day"), cdr_ix2 btree (service_num), cdr_ix3 btree (cust_id), cdr_ix4 btree(bill_id), cdr_ix5 btree (carrier_id), cdr_ix6 btree (file_id) Does this make it a "wide" table? The data arrives ordered by service_num, day, time. This customer has one primary service_num that most of the calls aremade from. Therefore each day a clump of CDRs will be loaded for that customer, interspersed with CDRs from all the othercustomers. Therefore the distribution of records for a service_num is clumpy but evenly distributed throughout the table.For a customer with a single primary number, this result applies to the customer as a whole. For a customer with manyservice_num's the result is a little more doubtful depending on whether their service_num's arrive sequentially or not.This would not necessarily be the case. I hope this makes sense. Does it help any? Thanks, David -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Wednesday, 18 February 2004 16:10 To: David Witham Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] Indexes and statistics "David Witham" <davidw@unidial.com.au> writes: > One of the customers is quite large (8.3% of the records): Hmm. Unless your rows are quite wide, a random sampling of 8.3% of the table would be expected to visit every page of the table, probably several times. So the planner's cost estimates do not seem out of line to me; an indexscan *should* be slow. The first question to ask is why the deviation from reality. Are the rows for that customer ID likely to be physically concentrated into a limited number of physical pages? Do you have so much RAM that the whole table got swapped in, eliminating the extra I/O that the planner is expecting? regards, tom lane