Re: patternsel() and histogram_selectivity() and the hard cutoff of 100 - Mailing list pgsql-hackers
From | Gregory Stark |
---|---|
Subject | Re: patternsel() and histogram_selectivity() and the hard cutoff of 100 |
Date | |
Msg-id | 87zlufl5yj.fsf@oxford.xeocode.com Whole thread Raw |
In response to | patternsel() and histogram_selectivity() and the hard cutoff of 100 (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: patternsel() and histogram_selectivity() and the hard
cutoff of 100
Re: patternsel() and histogram_selectivity() and the hard cutoff of 100 |
List | pgsql-hackers |
"Gregory Stark" <stark@enterprisedb.com> writes: > So I had a thought about how to soften the controversial hard cutoff of 100 > for the use of the histogram selectivity. Instead of switching 100% one way or > the other between the two heuristics why not calculate both and combine them. > The larger the sample size from the histogram the more we can weight the > histogram calculation. The smaller the histogram size the more we weight the > heuristic. > > My first thought was to scale it linearly so we use 10% of the histogram > sample + 90% of the heuristic for default statistic sizes of 10 samples. That > degenerates to the status quo for 100 samples and up. Incidentally I hacked up a patch to do this: postgres=# create table xx as (select i||'x'||i from generate_series(1,10000) as i(i)); SELECT postgres=# alter table xx alter x set statistics 1; ANALYZE postgres=# analyze xx; ALTER TABLE postgres=# explain analyze select * from xx where x like '%x1%'; QUERY PLAN ----------------------------------------------------------------------------------------------------- Seq Scan on xx (cost=0.00..174.00 rows=2000 width=9) (actual time=0.095..11.814 rows=1112 loops=1) Filter: (x ~~ '%x1%'::text) Total runtime: 13.957 ms (3 rows) postgres=# alter table xx alter x set statistics 10; analyze xx; explain analyze select * from xx where x like '%x1%'; ... Seq Scan on xx (cost=0.00..174.00 rows=1920 width=9) (actual time=0.036..11.454 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 20; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1820 width=9) (actual time=0.036..11.446 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 50; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1520 width=9) (actual time=0.036..11.406 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 70; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1320 width=9) (actual time=0.036..10.725 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 90; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1220 width=9) (actual time=0.036..10.326 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 100; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1120 width=9) (actual time=0.037..11.411 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 200; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=1106 width=9) (actual time=0.037..11.328 rows=1112 loops=1) ... postgres=# alter table xx alter x set statistics 1; analyze xx; explain analyze select * from xx where x like '%x1%'; Seq Scan on xx (cost=0.00..174.00 rows=2000 width=9) (actual time=0.037..11.810 rows=1112 loops=1) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Attachment
pgsql-hackers by date: