Re: Understanding tsearch2 performance - Mailing list pgsql-performance
From | Oleg Bartunov |
---|---|
Subject | Re: Understanding tsearch2 performance |
Date | |
Msg-id | Pine.LNX.4.64.1007141629000.32129@sn.sai.msu.ru Whole thread Raw |
In response to | Understanding tsearch2 performance (Ivan Voras <ivoras@freebsd.org>) |
Responses |
Re: Understanding tsearch2 performance
|
List | pgsql-performance |
Something is not good with statistics, 91 est. vs 8449 actually returned. Returning 8449 rows could be quite long. Oleg On Wed, 14 Jul 2010, Ivan Voras wrote: > Here's a query and its EXPLAIN ANALYZE output: > > cms=> select count(*) from forum; > count > ------- > 90675 > (1 row) > > cms=> explain analyze select id,title from forum where _fts_ @@ > 'fer'::tsquery; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on forum (cost=29.21..361.21 rows=91 width=35) > (actual time=2.946..63.646 rows=8449 loops=1) > Recheck Cond: (_fts_ @@ '''fer'''::tsquery) > -> Bitmap Index Scan on forum_fts (cost=0.00..29.19 rows=91 > width=0) (actual time=2.119..2.119 rows=8449 loops=1) > Index Cond: (_fts_ @@ '''fer'''::tsquery) > Total runtime: 113.641 ms > (5 rows) > > The problem is - tsearch2 seems too slow. I have nothing to compare it > to but 113 ms for searching through this small table of 90,000 records > seems too slow. The forum_fts index is of GIN type and the table > certainly fits into RAM. > > When I issue a dumb query without an index, I get a comparable order of > magnitude performance: > > cms=> explain analyze select id,title from forum where content ilike > '%fer%'; > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------ > Seq Scan on forum (cost=0.00..7307.44 rows=3395 width=35) (actual > time=0.030..798.375 rows=10896 loops=1) > Filter: (content ~~* '%fer%'::text) > Total runtime: 864.384 ms > (3 rows) > > cms=> explain analyze select id,title from forum where content like '%fer%'; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------- > Seq Scan on forum (cost=0.00..7307.44 rows=3395 width=35) (actual > time=0.024..146.959 rows=7596 loops=1) > Filter: (content ~~ '%fer%'::text) > Total runtime: 191.732 ms > (3 rows) > > Some peculiarities of the setup which might or might not influence this > performance: > > 1) I'm using ICU-patched postgresql because I cannot use my UTF-8 locale > otherwise - this is why the difference between the dumb queries is large > (but I don't see how this can influence tsearch2 since it pre-builds the > tsvector data with lowercase lexemes) > > 2) My tsearch2 lexer is somewhat slow (but I don't see how it can > influence these read-only queries on a pre-built, lexed and indexed data) > > Any ideas? > > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
pgsql-performance by date: