Home > mailing lists

Re: trgm regex index peculiarity - Mailing list pgsql-hackers

From	Erik Rijkers
Subject	Re: trgm regex index peculiarity
Date	March 28, 2014 22:28:13
Msg-id	32d687d2a55963c74281327fffcf7abb.squirrel@webmail.xs4all.nl Whole thread Raw
In response to	Re: trgm regex index peculiarity (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: trgm regex index peculiarity
List	pgsql-hackers

Tree view

On Fri, March 28, 2014 09:31, Heikki Linnakangas wrote:
> I went back and tried Erik's original test
> (http://www.postgresql.org/message-id/dafad644f268ce1503e1b8b682aae38a.squirrel@webmail.xs4all.nl).
> With a fresh checkout from master, the difference between the slow and
> fast queries is much less dramatic than Erik reported. The reason is
> that Alexander's GIN "fast scan" patch is very effective on that query.
> Erik reported that the '^abcd' query took 140ms, vs 5ms for 'abcd'. On
> my laptop, the numbers with a fresh checkout are about 2.5 ms vs. 1 ms,
> and with fast scan disabled (by modifying the source code), 40ms vs 1ms.
>
> So thanks to the fast scan patch, I don't think this patch is worth
> pursuing anymore. Unless there are some other test case where this patch
> helps, but the fast scan patch doesn't.
>

for the same 2 statements of my original test:
explain (analyze,buffers) select txt from azjunk6 where txt ~ '^abcd'; -- slow (140 ms)
explain (analyze,buffers) select txt from azjunk6 where txt ~  'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms)

You mention (from HEAD, I suppose?) a difference of 2.5 ms vs. 1 ms.

FWIW, for me the difference (from HEAD) remains quite a bit larger:

for n in `seq 1 10`; do ./trgm_peculiarity.sh ; done | grep runtime
Total runtime: 16.167 msTotal runtime: 2.188 msTotal runtime: 16.902 msTotal runtime: 2.203 msTotal runtime: 17.486
msTotalruntime: 2.201 msTotal runtime: 17.663 msTotal runtime: 2.441 msTotal runtime: 13.555 msTotal runtime: 2.204
msTotalruntime: 16.862 msTotal runtime: 2.225 msTotal runtime: 13.207 msTotal runtime: 2.550 msTotal runtime: 16.768
msTotalruntime: 2.172 msTotal runtime: 19.259 msTotal runtime: 2.180 msTotal runtime: 12.934 msTotal runtime: 2.198 ms
 

That's a lot better than the original 140ms vs 5ms  but your laptop's  2.5 ms vs. 1 ms  is perhaps not representative.

(for the full plans see below)


Erik Rijkers




----------------------------------------------------------------------------------------------------------------------------------Bitmap
HeapScan on azjunk6  (cost=56.77..432.93 rows=100 width=81) (actual time=15.898..15.925 rows=2 loops=1)  Recheck Cond:
(txt~ '^abcd'::text)  Rows Removed by Index Recheck: 11  Heap Blocks: exact=13  Buffers: shared hit=105  ->  Bitmap
IndexScan on azjunk6_trgm_re_idx  (cost=0.00..56.75 rows=100 width=0) (actual time=15.834..15.834 rows=13
 
loops=1)        Index Cond: (txt ~ '^abcd'::text)        Buffers: shared hit=92Planning time: 3.304 msTotal runtime:
16.179ms
 
(10 rows)

Time: 21.103 ms
explain (analyze,buffers) select txt from azjunk6 where txt ~  'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms)
                                                   QUERY PLAN
 

---------------------------------------------------------------------------------------------------------------------------------Bitmap
HeapScan on azjunk6  (cost=28.75..405.40 rows=1 width=81) (actual time=1.681..2.164 rows=2 loops=1)  Recheck Cond: (txt
~'abcd'::text)  Rows Removed by Index Recheck: 11  Filter: (substr(txt, 1, 4) = 'abcd'::text)  Rows Removed by Filter:
101 Heap Blocks: exact=113  Buffers: shared hit=120  ->  Bitmap Index Scan on azjunk6_trgm_re_idx  (cost=0.00..28.75
rows=100width=0) (actual time=1.171..1.171 rows=114
 
loops=1)        Index Cond: (txt ~ 'abcd'::text)        Buffers: shared hit=7Planning time: 0.516 msTotal runtime:
2.183ms
 
(12 rows)

pgsql-hackers by date:

From: Tom Lane
Date: 28 March 2014, 21:34:00
Subject: Re: Composite Datums containing toasted fields are a bad idea(?)

From: Christoph Berg
Date: 29 March 2014, 09:05:05
Subject: Re: Securing "make check" (CVE-2014-0067)

Re: trgm regex index peculiarity - Mailing list pgsql-hackers

Previous

Next