Re: Query results caching? - Mailing list pgsql-general
From | Ben-Nes Yonatan |
---|---|
Subject | Re: Query results caching? |
Date | |
Msg-id | 430A515B.6080307@canaan.co.il Whole thread Raw |
In response to | Re: Query results caching? ("Jim C. Nasby" <jnasby@pervasive.com>) |
Responses |
Re: Query results caching?
|
List | pgsql-general |
> On Mon, Aug 22, 2005 at 10:13:49PM +0200, Ben-Nes Yonatan wrote: > >>I think that I was misunderstood, Ill make an example: >>Lets say that im making the following query for the first time on the >>"motorcycles" table which got an index on the "manufacturer" field: >> >>EXPLAIN ANALYZE SELECT manufacturer FROM motorcycles WHERE >>manufacturer='suzuki'; >>... Total runtime: 3139.587 ms >> >>Now im doing the same query again and i get a much faster result (cause >>of the "caching"): Total runtime: 332.53 ms >> >>After both of those queries I drop the index and query the table again >>with the exact same query as before and now I receive: Total runtime: >>216834.871 ms >> >>And for my last check I run the exact same query again (without creating >>the INDEX back again) and I get quite similar result to my third query: >>Total runtime: 209218.01 ms >> >> >>My problem is that (maybe I just dont understand something basic >>here...) the last 2 (also the second query but I dont care about that) >>queries were using the "cache" that was created after the first query >>(which had an INDEX) so none of them actually showed me what will happen >>if a client will do such a search (without an INDEX) for the first time. >> >>I want to delete that "caching" after I do the first 2 queries so my >>next queries will show me "real life results". > > Ok I tried to handle both of your replies cause I got them at 2 seperate emails. Dann Corbit wrote: > These results are all what I would expect. When you delete the index, > the query will be forced to do a table scan (to examine every single > record in the table one by one). If the table is non-trivial it is > unlikely that either the OS or the database will cache the whole thing > in memory. However, when you query a small record set, then it is > likely to be retained in RAM which is literally thousands of times > faster than disk. Didnt know that, good to know though doesnt assure me... What if I drop the INDEX but create a diffrent INDEX which also make the process alot faster then without an INDEX but slower/faster then the one before, will it wont use the former "caching"? > If a search is to be made on a frequent basis, you should create an > index. > The query results above show you why. Obvious :) > Think about this for a minute. The real life results you want are > very fast results. For that reason, you should try to model the > customer queries as nearly as possible. If you have a canned > application like order entry, then the real parameterized query set > will probably be quite small in real life. If you are creating a > server for ad-hoc queries then it will be far more difficult to model > in real life. > > What is the real purpose of the application that you are writing? > > Will users be using a pre-programmed front end, or will they be typing > in queries free-form for whatever their heart desires? Ok ill try to describe the system as short & precise as possible (its also passed midnight here :)). Each day I receive about 4 million rows of data (products) which I insert into table1 (after I delete all of the previous data it had), along it I receive for every row about another 15 keywords which I insert into table2 (where as in table1 I delete all of the previous data it had also), this process is a fact that I cant change. Now the users of the site can search for data from table1 by typing whichever (and up to 4) words as they want at a text field (search input string) and the server should display the correct results by querying table1 & join table2 for its keywords. I succeded to do it quite fast but when I tried to ORDER BY my results its times jumped up drastically (2-3 seconds for a query... and thats after the caching..). I can't allow a situation where a user will search with a keyword which wasnt 'cached' before and because of that he will wait 15 seconds for a result. Jim C. Nasby wrote: > Emptying the cache will not show real-life results. You are always going > to have some stuff cached, even if you get a query for something new. In > this case (since you'll obviously want those indexes there), after some > amount of time you will have most (if not all) of the non-leaf index > pages cached, since they take a fairly small amount of memory and are > frequently accessed. This makes index traversal *much* faster than your > initial case shows, even if you query on something different each time. > Testing with a completely empty cache just isn't that realistic. As far as I understand it at my situation where all of the data is deleted and inserted each day from the start (INDEX will get lost with it..) & the endless variety of possible keywords search's & the immense size of the tables, the following reason wont last.. or am I wrong here? Because of all of that I want to be able to see how much time a query takes when its the first time its being run..... or I'm wrong again and failing to understand something? Again everyone THANKS ALOT its really amazing the help that I receive from you! Ben-Nes Yonatan
pgsql-general by date: