benchmarking the query planner (was Re: Simple postgresql.conf wizard) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | benchmarking the query planner (was Re: Simple postgresql.conf wizard) |
Date | |
Msg-id | 603c8f070812061019r5aca3aajeaa5493b11c5feff@mail.gmail.com Whole thread Raw |
Responses |
Re: benchmarking the query planner (was Re: Simple postgresql.conf wizard)
|
List | pgsql-hackers |
Sorry for top posting but we are getting a bit far afield from the original topic. I followed up the tests I did last night: http://archives.postgresql.org/pgsql-hackers/2008-12/msg00369.php I benchmarked 100 iterations of EXPLAIN on the query Greg Stark put together as a synthetic benchmark for default_statistics_target with various values for "SET STATISTICS n". Testing was done on CVS HEAD on my laptop with no configure options other than --prefix. Then I did this, to disable compression on pg_statistic. alter table pg_statistic alter column stanumbers1 set storage external; alter table pg_statistic alter column stanumbers2 set storage external; alter table pg_statistic alter column stanumbers3 set storage external; alter table pg_statistic alter column stanumbers4 set storage external; alter table pg_statistic alter column stavalues1 set storage external; alter table pg_statistic alter column stavalues2 set storage external; alter table pg_statistic alter column stavalues3 set storage external; alter table pg_statistic alter column stavalues4 set storage external; (Note that you'll need to put allow_system_table_mods=true in your postgresql.conf file if you want this to work.) Then I reran the tests. The results were pretty dramatic. In the table below, the first column is value of "SET STATISTICS n" that was performed the table column prior to analyzing it. The second column is the time required to plan the query 100x AFTER disabling compression on pg_statistic, and the third column is the time required to plan the query 100x BEFORE disabling compression on pg_statistic. 10 0.829202 0.8249 20 1.059976 1.06957 30 1.168727 1.143803 40 1.287189 1.263252 50 1.370167 1.363951 60 1.486589 1.460464 70 1.603899 1.571107 80 1.69402 1.689651 90 1.79068 1.804454 100 1.930877 2.803941 150 2.446471 4.833002 200 2.95323 6.217708 250 3.436741 7.507919 300 3.983568 8.895015 350 4.497475 10.201713 400 5.072471 11.576961 450 5.615272 12.933128 500 6.286358 14.408157 550 6.895951 15.745378 600 7.400134 17.192916 650 8.038159 18.568616 700 8.606704 20.025952 750 9.154889 21.45775 800 9.80953 22.74635 850 10.363471 24.057379 900 11.022348 25.559911 950 11.69732 27.021034 1000 12.266699 28.711027 As you can see, for default_statistics_target > 90, this is a HUGE win. After doing this test, I rebuilt with --enable-profiling and profiled EXPLAIN 10x with SET STATISTICS 10, 70, 100, 200 with a vanilla configuration, and then 200 again with compression turned off as described above. The, ahem, ridiculously small limit on attachment size prevents me from attaching the full results, so please see the attached results which are truncated after the first section. 10x doesn't seem to be quite enough to get the exact picture of where the bottlenecks are, but the overall picture is clear enough: decompression introduces a huge overhead. Looking specifically at the 200-decompress output, the next biggest hit is AllocSetAlloc(), which, from the detailed results that I unfortunately can't include, is being called mostly by datumCopy() which is being called mostly by get_attstatsslot(). There are 4000 calls to get_attstatsslot() which result 701,500 calls to datumCopy(). I'm not too sure what any of this means in terms of optimizatiion, other than that changing the storage type of pg_statistic columns to external looks like a huge win. Perhaps someone more knowledgeable than I has some thoughts. ...Robert
Attachment
pgsql-hackers by date: