Re: SQL WHERE: many sql or large IN() - Mailing list pgsql-general
From | tom |
---|---|
Subject | Re: SQL WHERE: many sql or large IN() |
Date | |
Msg-id | E96D3B35-2D70-48BE-AF10-3E7F48ADA6C5@tacocat.net Whole thread Raw |
In response to | Re: SQL WHERE: many sql or large IN() (Oleg Bartunov <oleg@sai.msu.su>) |
Responses |
Re: SQL WHERE: many sql or large IN()
Re: SQL WHERE: many sql or large IN() |
List | pgsql-general |
If I read this right, intarray is for reading values from an array data type. I don't have this. I have a varchar() field that is indexed (unique), call it 'foo' I have a choice of running: SELECT bar FROM tokens WHERE foo IN ('apple','orange','biscuit'....) for up to ~300 words OR SELECT bar FROM tokens WHERE foo = 'apple' up to ~300 times as a prepared/cached SQL statements. On Apr 6, 2007, at 9:10 AM, Oleg Bartunov wrote: > Tom, > > have you seen contrib/intarray ? > > Oleg > On Fri, 6 Apr 2007, tom wrote: > >> I'm wondering where the differences are in running two different >> types of SQL statements. >> >> Given ~300 tokens/words I can either run 1 sql statement with a >> large list in a "WHERE foo IN (...300 tokens...)" >> or I can run ~300 statements, one for each token. >> In the first case, the SQL is not prepared, but just executed. >> In the second case, the SQL is prepared and run as a cached >> execution plan (I think). >> >> Now. It would seem that the second approach would be painfully >> slow. But I'm not sure that I'm seeing this. >> Currently I have <5 users. As always, this might change... >> >> Before I start going about coding and testing lots of stuff I >> thought I would ask for some historical experiences someone might >> have had when comparing these two approaches and if there are >> inflection points between the performance in terms of the number >> of tokens or simultaneous users. >> >> I should add that the tokens are either indexed or primary indexed >> but in both cases, unique, and not guaranteed to exist in every case. >> >> Initially it seems that the WHERE IN (...) approach takes a turn >> for the worse when the list gets very large. >> It also seems to do comparatively worse when the number of tokens >> is very small. >> But I can't claim any scientifically sound basis for making this >> distinction. >> >> Any experiences someone would like to share? >> >> ---------------------------(end of >> broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: explain analyze is your friend >
pgsql-general by date: