Re: INDEX suggestion needed - Mailing list pgsql-general
From | Manfred Koizar |
---|---|
Subject | Re: INDEX suggestion needed |
Date | |
Msg-id | pgrhvus3d1nr8pp8311lopqd5uk4ffuir5@4ax.com Whole thread Raw |
In response to | Re: INDEX suggestion needed (Thomas Beutin <tyrone@laokoon.IN-Berlin.DE>) |
Responses |
Re: INDEX suggestion needed
|
List | pgsql-general |
On Thu, 12 Dec 2002 20:13:24 +0100, Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> wrote: >tb=# VACUUM VERBOSE ANALYZE stat_pages; >NOTICE: Pages 7809: Changed 0, Empty 0; Tup 343554: Vac 0, Keep 0, UnUsed 0. >Aggregate (cost=3.19..3.19 rows=1 width=34) (actual time=52.89..52.89 rows=1 loops=1) > -> Index Scan using tb5 on stat_pages (cost=0.00..3.18 rows=1 width=34) (actual time=52.74..52.74 rows=0 loops=1) >Total runtime: 53.11 msec >tb=# EXPLAIN ANALYZE SELECT COUNT(DISTINCT a_id) FROM stat_pages WHERE m_id = 35::smallint AND (visit >= '2002-06-01' ANDvisit <= '2002-12-11'); This selects (almost) all rows. An index cannot help. >i got the following index/sequence scans by date ranges: >(visit >= '2002-06-01' AND visit <= '2002-06-30') index scan 0 rows satisfy this condition, index scan is ok >(visit >= '2002-06-01' AND visit <= '2002-07-31') index scan Ca. 10000 rows, 3% of the whole table, index scan ok >(visit >= '2002-06-01' AND visit <= '2002-08-31') sequence scan 32%, sequence scan is expected to be faster, unless tuples are almost perfectly ordered by visit or most of the table (~ 8000 pages) fits into the cache. How much physical memory is installed? What are your shared_buffers and effective_cache_size settings? You might want to experiment with SET enable_seqscan = off; >(visit >= '2002-07-01' AND visit <= '2002-07-31') index scan Same as 06-01 to 07-31. >(visit >= '2002-08-01' AND visit <= '2002-08-31') sequence scan Ca. 29% >(visit >= '2002-09-01' AND visit <= '2002-09-30') sequence scan More than 50% >(visit >= '2002-10-01' AND visit <= '2002-10-31') index scan, but long (>5sec) Ca. 3%; interesting that it takes so long; I'd like to see EXPLAIN ANALYZE output for enable_seqscan on and off. >And: The date range in my table is from 2002-07-10 11:36:53+02 up to >2002-10-29 23:31:47+01. Yes, this is approximately reflected by the histogram bounds. > attname | null_frac | avg_wi | n_distinct | correlation >---------+-----------+--------+------------+------------- > visit | 0 | 8 | -0.543682 | -0.972118 The negative correlation looks strange. How did you insert your data? > m_id | 0 | 2 | 1 | 1 ^^^ Only one distinct value in m_id? This explains why your m_id index is never used. > attname | most_common_vals >---------+-------------------------- > visit | "2002-08-21 10:29:10+02", ... > m_id | 35 > attname | histogram_bounds >---------+-------------------------- > visit | "2002-07-25 16:37:12+02" > "2002-08-15 12:36:18+02" > "2002-08-23 12:36:15+02" > "2002-08-29 17:30:54+02" > "2002-09-05 12:54:31+02" > "2002-09-10 18:03:54+02" > "2002-09-16 15:44:56+02" > "2002-09-20 14:34:40+02" > "2002-09-24 13:59:29+02" > "2002-09-29 09:09:31+02" > "2002-10-29 23:25:13+01" > m_id | > >??? Is this output ok? Almost. I forgot to ask for most_common_freqs (cut'n'paste error). But I don't expect any value of visit to occur much more than twice, so these values should be irrelevant to our estimations. Servus Manfred
pgsql-general by date: