Thread: executeQuery hangs on busy server - PROPOSAL: socketTimeout parameter
Hello everybody, I am using jdbc to access a large database on a busy PostgreSQL server (8.1) hosted on Windows, from several web applications. Occasionally, my executeQuery hang forever. They block on a read operation in SocketInputStream.socketRead0 (seen on the debugger) and stay there for _days_. They disappear only by stopping the application server (Tomcat). This means that the corresponding thread in the application server stops functioning, and after hours or days all threads are consumed and the web applications stop working. I tried to interrupt the threads, but the abovementioned operation is not interruptible (neither by Thread.interrupt() nor Thread.stop() ). After a lot of googling, I took all my courage, downloaded the jdbc sources and recompiled the driver with the following patch: == PGStream.java, from line 96 == public void changeSocket(Socket socket) throws IOException { this.connection = socket; // Submitted by Jason Venner <jason@idiom.com>. Disable Nagle // as we are selective about flushing output only when we // really need to. connection.setTcpNoDelay(true); // aspinelli@imteam.it 2007-08-09 // on slow servers connections used to hang on a read connection.setSoTimeout( 120000 ); // Buffer sizes submitted by Sverre H Huseby <sverrehu@online.no> pg_input = new VisibleBufferedInputStream(connection.getInputStream(), 8192); pg_output = new BufferedOutputStream(connection.getOutputStream(), 8192); if (encoding != null) setEncoding(encoding); } ==== ... and all goes well! Long queries (>2min) fail with a SQLException signalling a communication problem with the server - which is what I want. I think this could be reworked into one more connection parameter "socketTimeout", where 0 means no timeout; the value of the parameter would of course substitute the hard-coded 120000. I've seen a feature request on GBorg about stopping long-running queries, which could be satisfied by what I'm proposing. (actually, the connection parameter works connection-wide, while the feature request deals with single queries). I have browsed the sources and I think I can volunteer the code, if the developer team agrees. The core change is as simple as the lines above. However, there is quite a bit of modifications needed to bring the value of the parameter from org.postgresql.Driver down to PGStream, but I think it is not terribly difficult. I really need this patch, so I would have to apply it again at every new version of the driver - I prefer to donate a hundred lines of code to the project! Another question is - *why* do the queries hang? They are syntactically correct - they almost always work. Probably one of the two parties (driver and server) does some mistake which induces the driver to think there is data available, but the data is not there; a protocol bug? But this is outside my competence. I would appreciate any comment - any developer there? TIA Andrea Spinelli
On Fri, 10 Aug 2007, Andrea Spinelli wrote: > I am using jdbc to access a large database on a busy PostgreSQL server (8.1) > hosted on Windows, from several web applications. > > Occasionally, my executeQuery hang forever. They block on a read operation in > SocketInputStream.socketRead0 (seen on the debugger) and stay there for > _days_. They disappear only by stopping the application server (Tomcat). I believe this is a bug in the 8.1 windows port. What's the exact version you are using? I think this was fixed in 8.1.6. > connection.setSoTimeout( 120000 ); > > Long queries (>2min) fail with a SQLException signalling a communication > problem with the server - which is what I want. > > I think this could be reworked into one more connection parameter > "socketTimeout", where 0 means no timeout; the value of the parameter would > of course substitute the hard-coded 120000. > > I've seen a feature request on GBorg about stopping long-running queries, > which could be satisfied by what I'm proposing. (actually, the connection > parameter works connection-wide, while the feature request deals with single > queries). People want the query to be stopped, but they don't want their whole connection to be killed which is what your change does. Kris Jurka
Kris Jurka <books@ejurka.com> writes: > People want the query to be stopped, but they don't want their whole > connection to be killed which is what your change does. Can't you just "set statement_timeout = something" if you don't want to let long queries complete? regards, tom lane
On Fri, 10 Aug 2007, Tom Lane wrote: > Kris Jurka <books@ejurka.com> writes: >> People want the query to be stopped, but they don't want their whole >> connection to be killed which is what your change does. > > Can't you just "set statement_timeout = something" if you don't want to > let long queries complete? > Unfortunately the timeout is set on a Statement object, not the Connection, so any number of different timeouts may be floating around and we'd need to set and reset the timeout on each query execution to ensure we're executing with the right timeout. So that's how it will be implemented eventually, but it's not just a trivial SET that we're missing. Kris Jurka
Re: executeQuery hangs on busy server - PROPOSAL: socketTimeout parameter
From
Andrea Spinelli
Date:
Kris Jurka wrote: > On Fri, 10 Aug 2007, Andrea Spinelli wrote: > > > I believe this is a bug in the 8.1 windows port. What's the exact > version you are using? I think this was fixed in 8.1.6. The exact version I am using is 8.1.4 - so my version is old - good! There is hope that upgrading will fix everything! > > People want the query to be stopped, but they don't want their whole > connection to be killed which is what your change does. > I see your point, but - as far as I know - there is no way of interrupting a running query (short of exiting from the main executable); so maybe a (large) timeout would be a safe way of protecting against trouble. As to killing the connection, using a connection pool may mitigate the problem. Better a broken connection in a running thread than a useless thread which cannot be interrupted! Of course, my perspective is mainly related to web applications - standalone applications might use other techniques. Anyway, if you plan to support query timeouts, as you suggest in another message, that would be perfect. IMHO, depending on your plans, the "socketTimeout" strategy could be a temporary fix if the "setQueryTimeout" is too far in the future. Just my suggestions after a week of stress-testing a busy server! I love postgreSQL and thank everybody for their efforts here! Andrea Spinelli
Andrea Spinelli wrote: > I see your point, but - as far as I know - there is no way of > interrupting a running query (short of exiting from the main > executable); Statement.cancel(). If the communication between the client and server is intact and the server is still responding, that will eventually cancel the query. (In your case, due to a server bug the server is off in lala land so I'd guess that cancels won't work) -O