Re: psycopg2 (async) socket timeout - Mailing list psycopg
From | Jan Urbański |
---|---|
Subject | Re: psycopg2 (async) socket timeout |
Date | |
Msg-id | 4D597F76.40204@wulczer.org |
In response to | Re: psycopg2 (async) socket timeout (Danny Milosavljevic <danny.milo+ml@gmail.com>) |
Responses |
Re: psycopg2 (async) socket timeout
Re: psycopg2 (async) socket timeout |
List | psycopg |
On 14/02/11 19:59, Danny Milosavljevic wrote:
> Hi,
>
> 2011/2/9 Jan Urbański <wulczer@wulczer.org>:
>> ----- Original message -----
>> I'll try to reproduce this problem, AIUI you should have the Deferred
>> errback if the connection is lost, but perhaps it takes some time for
>> Twisted to detect it (actually it takes time for the kernel to detect
>> it). You might try playing with your TCP keepalive settings.
>
> I'm trying. No luck so far...
>
> http://twistedmatrix.com/trac/wiki/FrequentlyAskedQuestions says "If
> you rely on TCP timeouts, expect as much as two hours (the precise
> amount is platform specific) to pass between when the disruption
> occurs and when connectionLost is called". Oops.

Yup, default settings for TCP keepalives are quite high...

> Hmm, even when I connect, then just down the network interface and
> only after that call runQuery, it is also never calling back anything
> (well, I didn't wait more than half an hour per try so far).
>
> But good point, although does this even work for async sockets? -
> where you are not reading actively, that is, nobody knows you want to
> receive any data? If that worked, that would be the nicest fix. For
> the not-so-nice fix, read on :-)

AFAIK if you're connected through TCP and waiting for data from the
other side, and the other side decides to never send you anything (for
instance because it died and did not even send you a RST packet), you
have no way of detecting that short of trying to send something every
now and then and if there's no response assuming the connection's down.
So you actually *need* a heartbeat solution to be able to detect network
dying...

I think the best idea would be starting a timer every time you start a
query and cancelling it when it finishes, and (important) setting the
timeout of that timer only a little bit higher than the query timeout
setting on the server. This way if your code times out the server won't
keep on running your query.
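
A rough sketch of that per-query timer idea, for illustration only. It
uses the standard library's asyncio rather than Twisted so that it runs
on its own; with Twisted the same shape would presumably be built on
reactor.callLater and cancelling the delayed call it returns. The names
run_with_timer, QueryTimedOut, on_timeout and margin are made up for
this example and are not part of psycopg2 or txpostgres, and the
server's query timeout is assumed to be known to the client.

```python
import asyncio


class QueryTimedOut(Exception):
    """The client-side timer fired before the query finished."""


async def run_with_timer(query, server_timeout, on_timeout, margin=1.0):
    """Await `query`, giving up once the client-side timer fires.

    query          -- awaitable standing in for the real query (hypothetical)
    server_timeout -- the server's query timeout in seconds (assumed known)
    on_timeout     -- hypothetical callback, e.g. one closing the connection
    margin         -- how much longer than the server the client waits
    """
    loop = asyncio.get_running_loop()
    task = asyncio.ensure_future(query)
    fired = False

    def fire():
        # Remember that the cancellation came from our own timer.
        nonlocal fired
        fired = True
        task.cancel()

    # Start the timer when the query starts ...
    timer = loop.call_later(server_timeout + margin, fire)
    try:
        return await task
    except asyncio.CancelledError:
        if not fired:
            raise  # cancelled by someone else, not by our timer
        on_timeout()
        raise QueryTimedOut() from None
    finally:
        # ... and defuse it when the query finishes, whatever the outcome.
        timer.cancel()
```

Keeping the margin small means that by the time the client gives up, the
server should already have cancelled the query on its side, so nothing
is left running there.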

> I've now started to do it the way Daniele and you suggested ("just
> close it from the client"), so I modified the Connection to start a
> timer which will fire if I don't defuse it early enough (and modified
> ConnectionPool to check connections periodically and reconnect).

Well something like that ;) I'd try doing it on the per-query level,
actually. Since you can't have more than one outstanding query, your
keepalive won't be sent until the current query finishes.

Actually, libpq recently got a feature called PQping that just checks
the state of the connection. So you can have timeouts on your queries
and periodic PQping calls when you're not running anything. Reminds me:
psycopg2 needs to support PQping, but that should be easy.

> After I receive a response, I defuse the timer. If not, the timer
> callback will be run. It will call the errback - which will call
> connection.close().
>
> As far as noticing the "disconnect" (well, potential disconnect) goes,
> this works perfectly.
> However, doing a connection.close() then doesn't seem to help much,
> still investigating why... getting the following:
>
> File "/usr/lib/python2.6/site-packages/twisted/internet/selectreactor.py",
> line 104, in doSelect
> [], timeout)
> exceptions.ValueError: file descriptor cannot be a negative integer (-1)
>
> So it seems the FD of the closed connection to postgres is still in
> the Twisted reactor?
> Seems I am missing some calls to self.reactor.removeReader or -Writer,
> maybe. Do those belong in Connection.close() ?

Ha, it always comes back to the ticket I filed when writing txpostgres:
http://twistedmatrix.com/trac/ticket/4539

Believe it or not, this problem seems to also prevent proper
LISTEN/NOTIFY implementation...

> If I try to reconnect periodically, can I use the same txpostgres
> Connection instance and just call connect() again?

I think you can, although recreating the Connection object should not be
a problem.

Jan