Alternative to a temporary table - Mailing list pgsql-performance
From | Steven Rosenstein |
---|---|
Subject | Alternative to a temporary table |
Date | |
Msg-id | OF771CB730.CDE13F6F-ON8525708F.0056A621-8525708F.0056C7BC@us.ibm.com Whole thread Raw |
Responses |
Re: Alternative to a temporary table
|
List | pgsql-performance |
I have a PHP web-based application where a temporary list of servers and their characteristics (each represented by a unique numeric server_id) is extracted from a master server list based on a number of dynamic and user-selected criteria (can the user view the server, is it on-line, is it a member of a specific group, etc). Once the user selects the set of criteria (and servers), it rarely change during the rest of the browser session. The resulting temporary list of servers is then joined against other tables with different sets of information about each of the servers, based on the server_id. I currently create a temporary table to hold the selected server_id's and characteristics. I then join this temp table with other data tables to produce my reports. My reason for using the temporary table method is that the SQL for the initial server selection is dynamically created based on the user's selections, and is complex enough that it does not lend itself to being easily incorporated into any of the other data extraction queries (which may also contain dynamic filtering). Unfortunately, the PHP connection to the database does not survive from webscreen to webscreen, so I have to re-generate the temporary server_id table each time it is needed for a report screen. An idea I had to make this process more efficient was instead of re-creating the temporary table over and over each time it is needed, do a one-time extraction of the list of user-selected server_id's, store the list in a PHP global variable, and then use the list in a dynamically-created WHERE clause in the rest of the queries. The resulting query would look something like SELECT * FROM some_data_table WHERE server_id IN (sid1,sid5,sid6,sid17,sid24...) Simple enough, however in rare cases the list of server_id's can range between 6,000 and 10,000. My question to the group is, can having so many values in a WHERE/IN clause effect query performance? Am I being naive about this and is there a different, better way? The server_id field is of course indexed, but it is possible that the list of selected sid's can contain almost all of the values in the some_data_table server_id index (in the situation where _all_ of the records are requested I wouldn't use the WHERE clause in the query). The some_data_table can contain millions of records for thousands of servers, so every bit of efficiency helps. If this is not the proper group for this kind of question, please point me in the right direction. Thanks! --- Steve ___________________________________________________________________________________ Steven Rosenstein IT Architect/Developer | IBM Virtual Server Administration Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001 Text Messaging: 6463456978 @ mobile.mycingular.com Email: srosenst @ us.ibm.com "Learn from the mistakes of others because you can't live long enough to make them all yourself." -- Eleanor Roosevelt
pgsql-performance by date: