Thread: Fastest way to drop an index?
I have a table with two fields:

    user_id
    col2

There is quite a pivotal SQL query for our site:

    ...WHERE user_id = 'xyz' AND col2 = 'xyz'

Until now, I have had two indexes, one on user_id and one on col2. Now I
have built a compound index concurrently on (user_id, col2). After an error
on the previous attempt, this index has now been built, but the query above
is still using the old individual indexes.

So I want to drop the index on col2 alone. (col2 is never used by itself,
so there is no point keeping that index. We only created it originally
because an index on this field alone would be smaller than a compound
index, and because the PG manual said individual indexes tend to work
better.)

But when I try to drop the col2 index, it takes forever, and eats up so
much memory that everything else on the server stops. Why should dropping
an index be so tedious?

Would appreciate any thoughts on this. Thanks!
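For reference, the statements being described would look roughly like the
following; the table and index names here are made up, since the post does
not give them:

    -- the new compound index, built without blocking writes
    CREATE INDEX CONCURRENTLY idx_mytable_userid_col2
        ON mytable (user_id, col2);

    -- the single-column index to be removed once the compound
    -- index is in place and being used
    DROP INDEX idx_mytable_col2;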
2009/2/3 Phoenix Kiula <phoenix.kiula@gmail.com>
Is the system very write-intensive? Under high write load, some DDL
operations are really a pain...

If so, you could write a script that copies the data to another table and
switches the tables once the operation is done (a rough sketch follows
after this message), but this will require a short period of application
downtime.

BTW, what's your PG version and what are your basic memory-related
settings? Is it the same system that you describe in your other email?

And what is the full definition of this table? Only two fields? What's the
primary key?
cheers,
Filip
--
Filip Rembiałkowski
JID/mailto: filip.rembialkowski@gmail.com
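A minimal sketch of the copy-and-swap approach Filip mentions might look
like the following; the table and index names are assumptions, and in
practice constraints, permissions, and any writes arriving between the copy
and the swap would also have to be handled:

    -- copy the data into a fresh table that has only the desired index
    CREATE TABLE mytable_new (LIKE mytable INCLUDING DEFAULTS);
    INSERT INTO mytable_new SELECT * FROM mytable;
    CREATE INDEX idx_mytable_new_userid_col2 ON mytable_new (user_id, col2);

    -- then, during a short application pause, swap the tables
    BEGIN;
    ALTER TABLE mytable RENAME TO mytable_old;
    ALTER TABLE mytable_new RENAME TO mytable;
    COMMIT;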
On Tue, Feb 3, 2009 at 5:18 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I have a table with two fields:
>
> user_id
> col2
>
> There is quite a pivotal SQL for our site that issues the query:
>
> ...WHERE user_id = 'xyz' and col2 = 'xyz'
>
> Until now, I have had two indexes, one for user_id and one for col2.
>
> Now, I built a compound index concurrently (user_id, col2). After an
> error the previous time, this has now built, but the query above is
> still using the old individual indexes.
>
> So I want to drop the index on col2 alone. (Col2 is never used by
> itself, so no use keeping the index. We only created it originally
> because an index on this field alone would be smaller than a compound
> index, and because PG manual said individual indexes tend to work
> better.)

PostgreSQL can use more than one index at a time, so it's possible that
this index IS getting used. Or does explain rule that out?

If you have a maintenance window, you can test how the db works with
different indexes by dropping them in a transaction and then rolling back.
Note that this locks the table for most other users, so don't do it midday
for half an hour or so.

begin;
drop index index1;
explain analyze select query goes here;
rollback;

> But when I try to drop the col2 index, it takes forever, and eats up a
> lot of memory to the extent that all other stuff stops. Why should
> dropping an index be so tedious?

It shouldn't. Is this database under a very heavy load when you're trying
to remove the index? Are you sure the drop index is what's eating up the
memory, and that you're not just looking at memory usage due to shared
memory allocations?
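Filled in with hypothetical names (the real table and index names are not
given at this point in the thread), that test would look something like:

    BEGIN;
    -- DROP INDEX can be rolled back in PostgreSQL, but it holds an
    -- ACCESS EXCLUSIVE lock on the table for the rest of the transaction
    DROP INDEX idx_mytable_col2;
    EXPLAIN ANALYZE
        SELECT * FROM mytable WHERE user_id = 'xyz' AND col2 = 'xyz';
    ROLLBACK;  -- the drop is undone; the index was never actually removed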
Thanks for the suggestions. Following is the SQL query. Actually, this is
not merely a DROP INDEX question. I am also surprised that this straight
index query takes more time than it used to! It should take under 1 second
because it's a one-row conditional match, but now it takes anywhere between
5 and 10 seconds for just one row!

The only change I have made recently is to increase the statistics target
for user_id to 300, but not for title_encrypted. user_id is varchar(35) and
title_encrypted is varchar(40). Will having different statistics targets on
two columns used in the same WHERE clause affect query speed? I wonder.

pguser=# explain analyze select title_alias from testimonials
         where user_id = 'superman'
         and title_encrypted = md5('MY TITLE COMES HERE') ;

                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using new_idx_testimonials_userid on testimonials
     (cost=0.00..157.78 rows=1 width=9) (actual time=8809.715..8809.715 rows=0 loops=1)
   Index Cond: ((user_id)::text = 'superman'::text)
   Filter: ((title_encrypted)::text = 'b333dc1b0992cb8c70b58a418211389a'::text)
 Total runtime: 8809.750 ms
(4 rows)

Time: 8811.817 ms

Running the same query a second time:

pguser=# explain analyze select title_alias from testimonials
         where user_id = 'superman'
         and title_encrypted = md5('MY TITLE COMES HERE') ;

                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using new_idx_testimonials_userid on testimonials
     (cost=0.00..157.78 rows=1 width=9) (actual time=1.426..1.426 rows=0 loops=1)
   Index Cond: ((user_id)::text = 'superman'::text)
   Filter: ((title_encrypted)::text = 'b333dc1b0992cb8c70b58a418211389a'::text)
 Total runtime: 1.462 ms
(4 rows)

Time: 2.289 ms
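For context, the statistics change described above would normally have been
made with something like the following; the exact statement is not shown in
the thread, so this is an assumption:

    ALTER TABLE testimonials ALTER COLUMN user_id SET STATISTICS 300;
    ANALYZE testimonials;  -- re-gather stats so the new target takes effect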
Phoenix Kiula <phoenix.kiula@gmail.com> writes:
> Index Scan using new_idx_testimonials_userid on testimonials
> (cost=0.00..157.78 rows=1 width=9) (actual time=8809.715..8809.715
> rows=0 loops=1)
> Index Cond: ((user_id)::text = 'superman'::text)
> Filter: ((title_encrypted)::text = 'b333dc1b0992cb8c70b58a418211389a'::text)
> Total runtime: 8809.750 ms

This is using the index to fetch the rows that match user_id = 'superman',
and then testing each fetched row to see if it has the desired value of
title_encrypted. The fact that hardly any rows pass the filter test tells
nearly nothing about how long this should be expected to run.

The rather high estimated cost suggests that the planner thinks there are
several dozen rows matching the index condition, and the actual runtime
suggests that there are actually hundreds or thousands of 'em. If so, your
problem is that you need a different index. I'd bet on an index on
title_encrypted being more useful for this query than the one on user_id;
or you could experiment with a two-column index.

			regards, tom lane
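The alternatives Tom suggests would look roughly like this (the index names
below are made up):

    -- an index on the column that actually narrows the result down
    CREATE INDEX CONCURRENTLY idx_testimonials_title_enc
        ON testimonials (title_encrypted);

    -- or a two-column index covering the whole WHERE clause
    CREATE INDEX CONCURRENTLY idx_testimonials_userid_title_enc
        ON testimonials (user_id, title_encrypted);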