Re: LIMIT/SORT optimization - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: LIMIT/SORT optimization |
Date | |
Msg-id | 200704071751.l37HpAk05118@momjian.us Whole thread Raw |
In response to | Re: LIMIT/SORT optimization (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: LIMIT/SORT optimization
Re: LIMIT/SORT optimization |
List | pgsql-patches |
I did some performance testing of the patch, and the results were good. I did this: test=> CREATE TABLE test (x INTEGER); test=> INSERT INTO test SELECT * FROM generate_series(1, 1000000); test=> SET log_min_duration_statement = 0; test=> SELECT * FROM test ORDER BY x LIMIT 3; and the results where, before the patch, for three runs: LOG: duration: 1753.518 ms statement: select * from test order by x limit 3; LOG: duration: 1766.019 ms statement: select * from test order by x limit 3; LOG: duration: 1777.520 ms statement: select * from test order by x limit 3; and after the patch: LOG: duration: 449.649 ms statement: select * from test order by x limit 3; LOG: duration: 443.450 ms statement: select * from test order by x limit 3; LOG: duration: 443.086 ms statement: select * from test order by x limit 3; --------------------------------------------------------------------------- Gregory Stark wrote: > > Updated patch attached: > > 1) Removes #if 0 optimizations > > 2) Changes #if 0 to #if NOT_USED for code that's there for completeness and to > keep the code self-documenting purposes rather but isn't needed by anything > currently > > 3) Fixed cost model to represent bounded sorts > > [ Attachment, skipping... ] > > > "Gregory Stark" <stark@enterprisedb.com> writes: > > > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: > > > >> There's a few blocks of code surrounded with "#if 0 - #endif". Are those just > >> leftovers that should be removed, or are things that still need to finished and > >> enabled? > > > > Uhm, I don't remember, will go look, thanks. > > Ok, they were left over code from an optimization that I decided wasn't very > important to pursue. The code that was ifdef'd out detected when disk sorts > could abort a disk sort merge because it had already generated enough tuples > for to satisfy the limit. > > But I never wrote the code to actually abort the run and it looks a bit > tricky. In any case the disk sort use case is extremely narrow, you would need > something like "LIMIT 50000" or more to do it and it would have to be a an > input table huge enough to cause multiple rounds of merges. > > > I think I've figured out how to adjust the cost model. It turns out that it > doesn't usually matter whether the cost model is correct since any case where > the optimization kicks in is a case you're reading a small portion of the > input so it's a case where an index would be *much* better if available. So > the only times the optimization is used is when there's no index available. > Nonetheless it's nice to get the estimates right so that higher levels in the > plan get reasonable values. > > I think I figured out how to do the cost model. At least the results are > reasonable. I'm not sure if I've done it the "right" way though. > > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
pgsql-patches by date: