Thread: sequence scan on PK
Hi,

I understand that when a table contains only a few rows it is better to do a
sequence scan than an index scan. But does this also hold for a table with 99
records?

The table contains:

    id integer (primary key)
    name varchar(70)
    parent integer
    comment text
    owner integer
    inheritAccess integer
    defaultAccess integer
    sequence bigint
    contentsinheritaccessmove integer
    contentsinheritaccessadd integer

explain select * from tblFolders where id=90;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on tblfolders  (cost=0.00..3.24 rows=1 width=50)
   Filter: (id = 90)

(I have analyzed the table, but it still does a sequence scan.)

At how many rows does an index scan become better than a sequence scan? How
is this calculated in pg?

Regards
Jer
Jeroen van Iddekinge wrote:
> Hi,
>
> I understand that when a table contains only a few rows it is better to
> do a sequence scan than an index scan. But does this also hold for a
> table with 99 records?
> ...
> explain select * from tblFolders where id=90;
>                         QUERY PLAN
> -----------------------------------------------------------
>  Seq Scan on tblfolders  (cost=0.00..3.24 rows=1 width=50)
>    Filter: (id = 90)
>
> (I have analyzed the table, but it still does a sequence scan.)
>
> At how many rows does an index scan become better than a sequence scan?
> How is this calculated in pg?
>
> Regards
> Jer

It depends on how many pages need to be read. To do an index scan you need to
read the index pages, and then you read the page where the actual row
resides. The usual rule of thumb is that if you are selecting more than about
5% of the rows, a seqscan is faster than an index scan.

If I'm reading your query correctly, it is estimating that it needs to read
about 3 pages to get the row you are asking for. If you used an index, it
would probably have to read at least that many pages, and they would not be a
sequential read, so it should be slower.

If you want to prove it, try:

\timing
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
SET enable_seqscan TO OFF;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;

Run each multiple times to make sure everything is cached, and take the
fastest time. On your machine it might be true that the index scan is
slightly faster than the seqscan in this exact circumstance. But I have the
feeling the difference is trivial, and that if you had, say, 70 rows it would
favor the seqscan. Probably somewhere around 150-200 rows it will switch on
its own.

You could tweak several settings to get it to do an index scan earlier, but
these would probably break other queries.
You don't need to tune for 100 rows, more like 100k or 100M. John =:->
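For what it's worth, the 3.24 estimate in the plan above can be reproduced by
hand from the planner's cost parameters. A back-of-the-envelope sketch,
assuming default cost settings and that the 99-row table occupies 2 heap
pages (check pg_class on your own install; these numbers are assumptions):

```sql
-- Planner cost constants (defaults assumed here):
SHOW cpu_tuple_cost;     -- 0.01   per row processed
SHOW cpu_operator_cost;  -- 0.0025 per operator, e.g. the "id = 90" filter

-- How big the table really is, according to the last ANALYZE:
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tblfolders';

-- seqscan cost = pages * 1.0  +  rows * (cpu_tuple_cost + cpu_operator_cost)
--             = 2     * 1.0  +  99  * (0.01 + 0.0025)
--             = 2 + 1.2375  ≈ 3.24   (matches the EXPLAIN output)
```

The index scan has to pay for index page reads plus a heap page read, all
charged at the (more expensive) random-access rate, which is why the planner
prefers the seqscan at this size.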
>> You could tweak several settings to get it to do an index scan
>> earlier, but these would probably break other queries. You don't need to
>> tune for 100 rows, more like 100k or 100M.

Thanks for the response.
The index scan was a little bit faster for id=1 and faster for id=99.

Which settings should I change for this? cpu_index_tuple_cost,
cpu_operator_cost, cpu_tuple_cost?

Jer.
Hi,

> Thanks for the response.
> The index scan was a little bit faster for id=1 and faster for id=99.
>
> Which settings should I change for this? cpu_index_tuple_cost,
> cpu_operator_cost, cpu_tuple_cost?

You should lower random_page_cost to make the planner choose an index scan
over a sequential scan.

Best regards
--
Matteo Beccati
http://phpadsnew.com/
http://phppgads.com/
Jeroen van Iddekinge wrote:
>> You could tweak several settings to get it to do an index scan
>> earlier, but these would probably break other queries. You don't need to
>> tune for 100 rows, more like 100k or 100M.
>
> Thanks for the response.
> The index scan was a little bit faster for id=1 and faster for id=99.
>
> Which settings should I change for this? cpu_index_tuple_cost,
> cpu_operator_cost, cpu_tuple_cost?
>
> Jer.

Well, I would start with *don't*. You are only looking at one query, which is
pretty fast already and probably is not going to be the bottleneck. You are
optimizing the wrong thing.

That being said, because you have everything cached in RAM (since it is a
tiny table), you would probably set random_page_cost = 2. In theory it should
really never be lower than 2, though if you are trying to force an index scan
you can do it.

John
=:->
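A safe way to try this is to change the setting per-session before touching
postgresql.conf. A minimal sketch, assuming a psql session against the same
table (the value 2 follows John's suggestion and is illustrative only):

```sql
-- See the current value, then experiment in this session only.
SHOW random_page_cost;                      -- server-wide default

SET random_page_cost = 2;                   -- cheaper random I/O:
EXPLAIN SELECT * FROM tblFolders WHERE id = 90;  -- index scans win sooner

RESET random_page_cost;                     -- back to the configured value
```

SET only affects the current session, so other queries and other users keep
the server-wide behaviour while you compare plans.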
> You should lower random_page_cost to make the planner choose an index
> scan over a sequential scan.

Yes, it was a bit too high (18), so I lowered it. It sped up some pages by
about 5%.

Reg.
Jer
Jeroen van Iddekinge <iddekingej@lycos.com> writes:
>> You could tweak several settings to get it to do an index scan
>> earlier, but these would probably break other queries. You don't need to
>> tune for 100 rows, more like 100k or 100M.

> Which settings should I change for this?

I'd agree with John's response: if you change any settings based on just
this one test case, you're a fool. But usually random_page_cost is the
best knob to twiddle if you wish to encourage indexscans.

regards, tom lane
Hi,

> Yes, it was a bit too high (18), so I lowered it. It sped up some pages
> by about 5%.

18? The default is 4, if I remember correctly. I wonder if your db has ever
seen an index scan ;)

Best regards
--
Matteo Beccati
http://phpadsnew.com/
http://phppgads.com/
Matteo Beccati wrote:
> Hi,
>
>> Yes, it was a bit too high (18), so I lowered it. It sped up some pages
>> by about 5%.
>
> 18? The default is 4, if I remember correctly. I wonder if your db has
> ever seen an index scan ;)

I was experimenting with how much influence some settings have on the
performance of a web application. I think I forgot to change the setting
back, and got some strange query plans.

Thanks
Jer
Tom Lane wrote:
> Jeroen van Iddekinge <iddekingej@lycos.com> writes:
>>> You could tweak several settings to get it to do an index scan
>>> earlier, but these would probably break other queries. You don't need to
>>> tune for 100 rows, more like 100k or 100M.
>
>> Which settings should I change for this?
>
> I'd agree with John's response: if you change any settings based on just
> this one test case, you're a fool. But usually random_page_cost is the
> best knob to twiddle if you wish to encourage indexscans.

Perhaps just a small comment - before starting the tuning process, you want
to make sure the query planner has the right idea about the nature of the
data contained in your indexed column.

Sometimes, if you insert reasonably sized batches of records containing the
same value for that column (for example in a multicolumn key where you
usually retrieve by only one of the columns), the statistics collector (used
to) get out of sync with reality with regard to the cardinality of the data,
because the default sample is too small to provide an objective insight.

If you're intimate with your data, you probably want to increase the
statistics target on that column and/or do some other statistics-related
magic, and ANALYZE the table again. That alone can mean the difference
between a sequential and an index scan where appropriate. Most importantly,
you don't need to distort the database's understanding of your hardware to
achieve optimal plans (provided you have the cost values set properly, of
course), so you won't get bitten where you don't expect it. :)

Again, this might not pertain to a 100-row table, but it is a good thing [tm]
to know when optimizing. I personally would prefer to look at that aspect of
the optimizer's understanding of the data before anything else.

Hope this helps.
Regards,
--
Grega Bremec
gregab at p0f dot net
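To make the statistics-target suggestion concrete, a sketch using the table
from this thread (the column `parent` and the target value 200 are assumed
for illustration; pick the column whose cardinality the planner misjudges):

```sql
-- Have ANALYZE sample this column more thoroughly, then re-gather stats.
ALTER TABLE tblFolders ALTER COLUMN parent SET STATISTICS 200;
ANALYZE tblFolders;

-- Inspect what the planner now believes about the column:
SELECT n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'tblfolders' AND attname = 'parent';
```

If n_distinct and the most common values now match reality, the planner can
choose between seqscan and index scan on the merits of the data rather than
on a skewed sample, without any changes to the cost constants.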