Re: trgm regex index peculiarity - Mailing list pgsql-hackers
From | Erik Rijkers |
---|---|
Subject | Re: trgm regex index peculiarity |
Date | |
Msg-id | 32d687d2a55963c74281327fffcf7abb.squirrel@webmail.xs4all.nl Whole thread Raw |
In response to | Re: trgm regex index peculiarity (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: trgm regex index peculiarity
|
List | pgsql-hackers |
On Fri, March 28, 2014 09:31, Heikki Linnakangas wrote: > I went back and tried Erik's original test > (http://www.postgresql.org/message-id/dafad644f268ce1503e1b8b682aae38a.squirrel@webmail.xs4all.nl). > With a fresh checkout from master, the difference between the slow and > fast queries is much less dramatic than Erik reported. The reason is > that Alexander's GIN "fast scan" patch is very effective on that query. > Erik reported that the '^abcd' query took 140ms, vs 5ms for 'abcd'. On > my laptop, the numbers with a fresh checkout are about 2.5 ms vs. 1 ms, > and with fast scan disabled (by modifying the source code), 40ms vs 1ms. > > So thanks to the fast scan patch, I don't think this patch is worth > pursuing anymore. Unless there are some other test case where this patch > helps, but the fast scan patch doesn't. > for the same 2 statements of my original test: explain (analyze,buffers) select txt from azjunk6 where txt ~ '^abcd'; -- slow (140 ms) explain (analyze,buffers) select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms) You mention (from HEAD, I suppose?) a difference of 2.5 ms vs. 1 ms. FWIW, for me the difference (from HEAD) remains quite a bit larger: for n in `seq 1 10`; do ./trgm_peculiarity.sh ; done | grep runtime Total runtime: 16.167 msTotal runtime: 2.188 msTotal runtime: 16.902 msTotal runtime: 2.203 msTotal runtime: 17.486 msTotalruntime: 2.201 msTotal runtime: 17.663 msTotal runtime: 2.441 msTotal runtime: 13.555 msTotal runtime: 2.204 msTotalruntime: 16.862 msTotal runtime: 2.225 msTotal runtime: 13.207 msTotal runtime: 2.550 msTotal runtime: 16.768 msTotalruntime: 2.172 msTotal runtime: 19.259 msTotal runtime: 2.180 msTotal runtime: 12.934 msTotal runtime: 2.198 ms That's a lot better than the original 140ms vs 5ms but your laptop's 2.5 ms vs. 1 ms is perhaps not representative. (for the full plans see below) Erik Rijkers ----------------------------------------------------------------------------------------------------------------------------------Bitmap HeapScan on azjunk6 (cost=56.77..432.93 rows=100 width=81) (actual time=15.898..15.925 rows=2 loops=1) Recheck Cond: (txt~ '^abcd'::text) Rows Removed by Index Recheck: 11 Heap Blocks: exact=13 Buffers: shared hit=105 -> Bitmap IndexScan on azjunk6_trgm_re_idx (cost=0.00..56.75 rows=100 width=0) (actual time=15.834..15.834 rows=13 loops=1) Index Cond: (txt ~ '^abcd'::text) Buffers: shared hit=92Planning time: 3.304 msTotal runtime: 16.179ms (10 rows) Time: 21.103 ms explain (analyze,buffers) select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms) QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------Bitmap HeapScan on azjunk6 (cost=28.75..405.40 rows=1 width=81) (actual time=1.681..2.164 rows=2 loops=1) Recheck Cond: (txt ~'abcd'::text) Rows Removed by Index Recheck: 11 Filter: (substr(txt, 1, 4) = 'abcd'::text) Rows Removed by Filter: 101 Heap Blocks: exact=113 Buffers: shared hit=120 -> Bitmap Index Scan on azjunk6_trgm_re_idx (cost=0.00..28.75 rows=100width=0) (actual time=1.171..1.171 rows=114 loops=1) Index Cond: (txt ~ 'abcd'::text) Buffers: shared hit=7Planning time: 0.516 msTotal runtime: 2.183ms (12 rows)
pgsql-hackers by date: