Re: Tsearch2 performance on big database - Mailing list pgsql-performance
From | Oleg Bartunov |
---|---|
Subject | Re: Tsearch2 performance on big database |
Date | |
Msg-id | Pine.GSO.4.62.0503241343580.5508@ra.sai.msu.su Whole thread Raw |
In response to | Re: Tsearch2 performance on big database (Rick Jansen <rick@rockingstone.nl>) |
Responses |
Re: Tsearch2 performance on big database
|
List | pgsql-performance |
On Thu, 24 Mar 2005, Rick Jansen wrote: > Oleg Bartunov wrote: >> from my notes >> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes >> >> It's usefull to see words statistics, for example, to check how good >> your dictionaries work or how did you configure pg_ts_cfgmap. Also, you >> may notice probable stop words relevant for your collection. Tsearch >> provides stat() function: >> >> ....................... >> >> Don't hesitate to read it and if you find some bugs or know better wording >> I'd be glad to improve my notes. >> > > Thanks, but that stat() query takes way too long.. I let it run for like > 4 hours and still nothing. The database I am testing tsearch2 on is also > the production database (mysql) server so I have to be careful not to > use too many resources :o stat() is indeed a bigdog, it was designed for developers needs, so we recommend to save results in table. > > Anyway, here's my pg_ts_cfgmap now (well the relevant bits): > > default_english | lhword | {en_ispell,en_stem} > default_english | lpart_hword | {en_ispell,en_stem} > default_english | lword | {en_ispell,en_stem} > > Is it normal that queries for single words (or perhaps they are words > that are common) take a really long time? Like this: > 'hispanic' isn't common, I see you get only 674 rows and 'buckingham & palace' returns 185 rows. Did you run 'vacuum analyze' ? I see a big discrepancy between estimated rows (8041) and actual rows. > ilab=# explain analyze select count(*) from books where description_fti @@ > to_tsquery('default', 'hispanic'); > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------ > Aggregate (cost=20369.81..20369.81 rows=1 width=0) (actual > time=261512.031..261512.031 rows=1 loops=1) > -> Index Scan using idxfti_idx on books (cost=0.00..20349.70 rows=8041 > width=0) (actual time=45777.760..261509.288 rows=674 loops=1) > Index Cond: (description_fti @@ '\'hispan\''::tsquery) > Total runtime: 261518.529 ms > (4 rows) > > ilab=# explain analyze select titel from books where description_fti @@ > to_tsquery('default', 'buckingham & palace'); > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------ > Index Scan using idxfti_idx on books (cost=0.00..20349.70 rows=8041 > width=57) (actual time=18992.045..48863.385 rows=185 loops=1) > Index Cond: (description_fti @@ '\'buckingham\' & \'palac\''::tsquery) > Total runtime: 48863.874 ms > (3 rows) > > > I dont know what happened, these queries were a lot faster 2 days ago..what > the feck is going on?! > > Rick > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
pgsql-performance by date: