Thread: OPtimize the performance of a query
(lower(g.feature_class) like lowwer('P') or lower(g.feature_class) like lower('L'))
>Hi,
>I have two tables in the same database: geoname and test_table.
>The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.
>The second table 'test_table' contains only the columns: city, state.
>There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.
>I wrote this query:
>select g.name, t.city
>from geoname as g, test_table as t
>where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
>and lower(g.country_code) like 'US'
>and lower(g.admin1) like lower(t.state)
>and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))
>The table geoname contains 370260 rows and the table test_table contains 10270 rows.
>The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or >should I use indexs to accelerate the process?
Indexes are your friends ☺
I’d certainly add indexes on lower(g.feature_class, g.country_code) and lower(t.state)
Note “and lower(g.country_code) like 'US'” will not return any results as you’ve got the ‘US’ in UPPER…. Same as (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
Why are you using LIKE? Equals (=) is surely correct and probably faster?
Martin.
The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or should I use indexs to accelerate the process?The table geoname contains 370260 rows and the table test_table contains 10270 rows.and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))and lower(g.admin1) like lower(t.state)and lower(g.country_code) like 'US'where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')from geoname as g, test_table as tselect g.name, t.cityI wrote this query:There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.The second table 'test_table' contains only the columns: city, state.The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.Hi,I have two tables in the same database: geoname and test_table.
Do you have any indecies? https://www.postgresql.org/docs/current/static/indexes- expressional.html might be helpful to you. Also, EXPLAIN will help you understand how your query is being run and where it can be improved.JimOn Tue, Jan 16, 2018 at 11:32 AM, hmidi slim <hmidi.slim2@gmail.com> wrote:The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or should I use indexs to accelerate the process?The table geoname contains 370260 rows and the table test_table contains 10270 rows.and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))and lower(g.admin1) like lower(t.state)and lower(g.country_code) like 'US'where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')from geoname as g, test_table as tselect g.name, t.cityI wrote this query:There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.The second table 'test_table' contains only the columns: city, state.The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.Hi,I have two tables in the same database: geoname and test_table.
It would have been more helpful if you had included
the actual table structures for both tables.
However, I would start by creating separate indexes on
lower(feature_class)
lower(country_code)
lower(admin1)
lower(name)
lower(city)
That being said, you are better off forcing lowercase on all fields
BEFORE inserting into the table.
EG:
INSERT INTO test_table VALUES (lower(some_key), lower(name), lower(feature_class), ....)
Then you would would not need to use lower() in the indexes or the query.
Please, in the future, always include your version of PostgreSQL and O/S
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Sorry I forget the lower command when I wrote the code, it is like this: lower(g.country_code) like lower('US')
(lower(g.feature_class) like lowwer('P') or lower(g.feature_class) like lower('L'))
2018-01-16 17:40 GMT+01:00 Martin Moore <martin.moore@avbrief.com>:
>Hi,
>I have two tables in the same database: geoname and test_table.
>The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.
>The second table 'test_table' contains only the columns: city, state.
>There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.
>I wrote this query:
>select g.name, t.city
>from geoname as g, test_table as t
>where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
>and lower(g.country_code) like 'US'
>and lower(g.admin1) like lower(t.state)
>and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))
>The table geoname contains 370260 rows and the table test_table contains 10270 rows.
>The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or >should I use indexs to accelerate the process?
Indexes are your friends ☺
I’d certainly add indexes on lower(g.feature_class, g.country_code) and lower(t.state)
Note “and lower(g.country_code) like 'US'” will not return any results as you’ve got the ‘US’ in UPPER…. Same as (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
Why are you using LIKE? Equals (=) is surely correct and probably faster?
Martin.
> On 16 Jan 2018, at 17:32, hmidi slim <hmidi.slim2@gmail.com> wrote: > > Hi, > I have two tables in the same database: geoname and test_table. > The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and soon. > The second table 'test_table' contains only the columns: city, state. > There is no join between the two tables and I want to make a match between the data contained in each of them because Ineed the result for a farther process. > I wrote this query: > select g.name, t.city > from geoname as g, test_table as t > where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L') > and lower(g.country_code) like 'US' > and lower(g.admin1) like lower(t.state) > and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city')) That query is equivalent to: select g.name, t.city from geoname as g, test_table as t where false or lower(g.name) = lower(t.city || 'city')); So those are probably not the results you want. At the very least, if you're lower-casing column contents, don't compare those to an upper-cased constant ;) Also, AND has precedence over OR, which is the other reason why my equivalent query is so much shorter. And finally, LIKE is equivalent to = (equals) without any wildcards. > The table geoname contains 370260 rows and the table test_table contains 10270 rows. > The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table whichcontains the country_code and make an inner join with the geoname table or should I use indexs to accelerate the process? Some indices on lower(g.name) and lower(t.city) from your query would be useful, but in that case make sure you take theconcatenation of 'city' out of the lower()-call in your query. Just reading your latest addition - using lower() on constants is just a waste of cycles. It won't hurt your query much though. Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Hi2018-01-16 17:44 GMT+01:00 hmidi slim <hmidi.slim2@gmail.com>:Sorry I forget the lower command when I wrote the code, it is like this: lower(g.country_code) like lower('US')
(lower(g.feature_class) like lowwer('P') or lower(g.feature_class) like lower('L'))please, don't do top post.Your query must be slow. When you use LIKE instead =. It is terrible performance issue.So don't use "LIKE" is first rule. Second - you can create functional indexesCREATE INDEX ON geoname ((lower(name)))RegardsPavel2018-01-16 17:40 GMT+01:00 Martin Moore <martin.moore@avbrief.com>:
>Hi,
>I have two tables in the same database: geoname and test_table.
>The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.
>The second table 'test_table' contains only the columns: city, state.
>There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.
>I wrote this query:
>select g.name, t.city
>from geoname as g, test_table as t
>where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
>and lower(g.country_code) like 'US'
>and lower(g.admin1) like lower(t.state)
>and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))
>The table geoname contains 370260 rows and the table test_table contains 10270 rows.
>The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or >should I use indexs to accelerate the process?
Indexes are your friends ☺
I’d certainly add indexes on lower(g.feature_class, g.country_code) and lower(t.state)
Note “and lower(g.country_code) like 'US'” will not return any results as you’ve got the ‘US’ in UPPER…. Same as (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
Why are you using LIKE? Equals (=) is surely correct and probably faster?
Martin.
I changed the operator like and I'm using the operator = .I got the results much faster but I still have another question about operator. For difference should I use '<>' or 'is distinct from' with indexes?
2018-01-16 17:49 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:Hi2018-01-16 17:44 GMT+01:00 hmidi slim <hmidi.slim2@gmail.com>:Sorry I forget the lower command when I wrote the code, it is like this: lower(g.country_code) like lower('US')
(lower(g.feature_class) like lowwer('P') or lower(g.feature_class) like lower('L'))please, don't do top post.Your query must be slow. When you use LIKE instead =. It is terrible performance issue.So don't use "LIKE" is first rule. Second - you can create functional indexesCREATE INDEX ON geoname ((lower(name)))RegardsPavel2018-01-16 17:40 GMT+01:00 Martin Moore <martin.moore@avbrief.com>:
>Hi,
>I have two tables in the same database: geoname and test_table.
>The geoname table contains many columns which are: name, feature_class, feature_code, admin1, admin2,admin3, name and so on.
>The second table 'test_table' contains only the columns: city, state.
>There is no join between the two tables and I want to make a match between the data contained in each of them because I need the result for a farther process.
>I wrote this query:
>select g.name, t.city
>from geoname as g, test_table as t
>where (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
>and lower(g.country_code) like 'US'
>and lower(g.admin1) like lower(t.state)
>and (lower(g.name) like lower(t.city) or lower(g.name) like lower(t.city || 'city'))
>The table geoname contains 370260 rows and the table test_table contains 10270 rows.
>The query took a long time to accomplish more than half an hour.Should I add another column in the table test_table which contains the country_code and make an inner join with the geoname table or >should I use indexs to accelerate the process?
Indexes are your friends ☺
I’d certainly add indexes on lower(g.feature_class, g.country_code) and lower(t.state)
Note “and lower(g.country_code) like 'US'” will not return any results as you’ve got the ‘US’ in UPPER…. Same as (lower(g.feature_class) like 'P' or lower(g.feature_class) like 'L')
Why are you using LIKE? Equals (=) is surely correct and probably faster?
Martin.
Thank you for your advices and thanks for all people who give me some best practises and useful ideas.
Hi Hmidi, On 17/01/18 06:57, hmidi slim wrote: > I changed the operator like and I'm using the operator = .I got the > results much faster but I still have another question about operator. > For difference should I use '<>' or 'is distinct from' with indexes? > > 2018-01-16 17:49 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com > <mailto:pavel.stehule@gmail.com>>: > > Hi > > 2018-01-16 17:44 GMT+01:00 hmidi slim <hmidi.slim2@gmail.com > <mailto:hmidi.slim2@gmail.com>>: > > Sorry I forget the lower command when I wrote the code, it is > like this: lower(g.country_code) like lower('US') > (lower(g.feature_class) like lowwer('P') or > lower(g.feature_class) like lower('L')) > > > please, don't do top post. > [...] Top posting is when you reply at the top of the email, rather than at the bottom like this. Bottom posting allows people to see the context before your reply. You can trim excess, or no longer relevant, content - but note the bits that you have omitted with '[...]' Bottom posting is preferred. Cheers, Gavin