Thread: Re: [HACKERS] indexing words slow
> On the other hand, if I do a count(*) on '^ric', his takes consequently > around 1:30 mins, no matter how often I run it. This returns 7866. > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after > running it several times. Since postgres will still read the data row (even for this count(*)) I would guess, that this is a data distribution problem. Maybe you could cluster your data ? Maybe the '^rol' rows stick pretty much together, whilst the '^ric' rows are evenly distributed on all datapages. Of course there would be room for improvement if postgresql would not read the data pages, which are not needed for any query that only selects columns that are part of the index. Andreas
> > > On the other hand, if I do a count(*) on '^ric', his takes consequently > > around 1:30 mins, no matter how often I run it. This returns 7866. > > > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after > > running it several times. > > Since postgres will still read the data row (even for this count(*)) I would guess, > that this is a data distribution problem. Maybe you could cluster your data ? > Maybe the '^rol' rows stick pretty much together, whilst the '^ric' rows > are evenly distributed on all datapages. Of course there would be room > for improvement if postgresql would not read the data pages, which are not needed > for any query that only selects columns that are part of the index. > Brilliant. I had forgotten that the data pages are accessed as well as the index pages. The CLUSTER command works in 6.3, so perhaps that will change some times. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
On Wed, 11 Mar 1998, Bruce Momjian wrote: > > > > > On the other hand, if I do a count(*) on '^ric', his takes consequently > > > around 1:30 mins, no matter how often I run it. This returns 7866. > > > > > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after > > > running it several times. > > > > Since postgres will still read the data row (even for this count(*)) I would guess, > > that this is a data distribution problem. Maybe you could cluster your data ? > > Maybe the '^rol' rows stick pretty much together, whilst the '^ric' rows > > are evenly distributed on all datapages. Of course there would be room > > for improvement if postgresql would not read the data pages, which are not needed > > for any query that only selects columns that are part of the index. > > > > Brilliant. I had forgotten that the data pages are accessed as well as > the index pages. The CLUSTER command works in 6.3, so perhaps that will > change some times. Running a 'cluster' right now.... Maarten _____________________________________________________________________________ | TU Delft, The Netherlands, Faculty of Information Technology and Systems | | Department of Electrical Engineering | | Computer Architecture and Digital Technique section | | M.Boekhold@et.tudelft.nl | -----------------------------------------------------------------------------