Re: ~* + LIMIT => infinite time? - Mailing list pgsql-performance
From | |
---|---|
Subject | Re: ~* + LIMIT => infinite time? |
Date | |
Msg-id | 63165.12.249.229.112.1039912914.squirrel@www.l-i-e.com Whole thread Raw |
In response to | Re: ~* + LIMIT => infinite time? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: ~* + LIMIT => infinite time?
Re: ~* + LIMIT => infinite time? |
List | pgsql-performance |
> Josh Berkus <josh@agliodbs.com> writes: >> You'll get a snarky response, and then be told to upgrade, if you try >> to submit a bug in 7.1.3. > > 7.1 is a tad long in the tooth, but still I'm curious about this. I > don't see how <plan A> can possibly take longer than <plan A> + <LIMIT > node on top>. Hey Tom. I think we met very briefly at the International PHP Conference in Frankfurt in 2001... Anyway. It's actually the other way around. <Plan A> takes like 4 seconds. <Plan A> + <LIMIT node on top> takes literally FOREVER and leaves a postgres process hanging 'round that I have to kill -9 to get rid of. I'd understand the LIMIT clause taking a bit longer, or being faster for startup (if there were no ORDER BY, which there is) but I never even considered it would hang the whole thing. Actually, PostgreSQL has been so reliable over the years, the idea that I'd run across a bug was just foreign to me... So I've been trying to tune performance on this query for weeks now, not realizing that the speed wasn't the issue at all. I could almost rip out the LIMIT completely if the application logic let me, and if the performance were a bit better. It occurred to me last night that the actual data *MIGHT* be involved -- It's some OCR text, and there are a few scattered non-ASCII characters involved... So *MAYBE* the actual text getting scanned could also be important. It seems unlikely, since the non-LIMIT query returns all the data just fine, but just in case... Here's a schema and a full dump for anybody that wants to dig in: http://bulletinarchive.org/pg_dump/ I could provide PHP source as well, or the query posted in this thread can serve as the test case. At the moment, I've altered the application to not use LIMIT when I have ~* in the query, and it "works" -- only the paging of results is broken, and the whole page takes twice as long to load as it should in those cases, since it's doing the same query twice and snarfing all the monster data and then throwing away the majority of rows in both cases. I need the first row to get the highest score, and the rows for paging in the real application... Anyway, my point is that the queries seem fine without the LIMIT clause, and "hang" with both "~*" and LIMIT, and I've even gone so far as to incorporate that into the application logic for now, just to have a page that loads at all instead of one that hangs. Meanwhile, I guess I should flail at it and try 7.3 in the hopes the bug disappeared. I was hoping to know for sure that it was a fixed bug in that upgrade path. Boss actually said we should go ahead and upgrade just on principle anyway. It's nice to have a smart boss. :-)
pgsql-performance by date: