Re: Cygwin PostgreSQL Regression Test Problems (Revisited) - Mailing list pgsql-ports
From | Jason Tishler |
---|---|
Subject | Re: Cygwin PostgreSQL Regression Test Problems (Revisited) |
Date | |
Msg-id | 20010402131917.C798@dothill.com |
In response to | Re: Cygwin PostgreSQL Regression Test Problems (Revisited) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Cygwin PostgreSQL Regression Test Problems (Revisited) |
List | pgsql-ports |

Tom,

On Sun, Apr 01, 2001 at 01:57:35PM -0400, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I'm glad that you agree.  Please post to the list when the change is in
> > CVS and I will test that this solves the Cygwin regression test (i.e.,
> > psql) hangs.
>
> Done as of yesterday; should be in this morning's snapshot.

Thanks.

> > Actually, the blocking connect() change for Cygwin is obviated by the
> > pqWait() fix.  So, I am now no longer recommending making the blocking
> > connect() change for Cygwin.  Unless, you do so for other Unixes too.
>
> I made both changes in the hope that the blocking connect change would
> suppress your problem with connection-refused failures.  If it does not,
> then we may as well reverse out the fe-connect.c change.  Let me know.

With both changes or only the fe-connect.c one, psql does not hang and
displays the following error message when the connection is refused:

    psql: connectDBStart() -- connect() failed: Connection refused
    Is the postmaster running locally and accepting connections on Unix socket '/tmp/.s.PGSQL.65432'?

With only the fe-misc.c change, psql does not hang and displays the
following error message when the connection is refused:

    psql: PQconnectPoll() -- connect() failed: error 10061
    Is the postmaster running locally and accepting connections on Unix socket '/tmp/.s.PGSQL.65432'?

In both cases there are no hangs, just the error messages are different.
Unfortunately, for the non-blocking case the error message is cryptic.
I tried tracking down error "10061" which comes from getsockopt(), but
I was unsuccessful.  Is there any way to improve the readability of this
error message?

Also, the blocking connect change did *not* fix the connection refused
(spurious) regression test failures.  So this change should probably be
backed out.

> > I'm wondering whether it makes sense to add a simple connection retry
> > policy as suggested above by Hiroshi?
>
> I do not think it is appropriate for libpq to do that.

When I made my suggestion above, I was concerned that maybe libpq was
not the right layer to be implementing connection policies and that
possibly psql was the better place.

> For one thing, where would you stop --- why exactly two tries?

This was another one of my concerns too.

> > 2. Change the backlog parameter to listen() in src/backend/libpq/pqcomm.c
> > to a number that will "ensure" that the parallel_schedule version of the
> > regression test does not generate connection refused conditions.  Note
> > that I'm not even sure this will really work on all (or any) platforms.
>
> We already use SOMAXCONN which is supposed to be defined by the system
> as the maximum allowed queue depth.  If Cygwin fails to define it, or
> defines it as something less than it should be, then we might consider
> installing a Cygwin-specific hack to redefine SOMAXCONN.

Cygwin defines SOMAXCONN to be 5.  However, winsock.h defines it to be 5
while winsock2.h defines it to be 0x7fffffff.  So, I'm not sure what is
the real Cygwin (i.e., Windows) maximum.

> However Hiroshi says later that he already tried this.

Even if it worked, this would have just pushed the problem instead of
really fixing it.

> I'm inclined to think
> that Cygwin simply has a problem with servicing concurrent connection
> requests, perhaps even before the alleged SOMAXCONN value is reached.

You meant Windows.  Right? :,)

In summary, I feel that the fe-connect.c change should be backed out so
that Cygwin will be consistent with other UNIXes.  I also hope that the
non-blocking connection failure message can be made more readable and
that make check will not generate spurious failure messages under Cygwin
on slow machines.

Thanks,
Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com
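
Editorial note on the "error 10061" message discussed above: 10061 is the Winsock code WSAECONNREFUSED, the Windows counterpart of ECONNREFUSED, which would explain why the non-blocking path reports a bare number where the blocking path prints "Connection refused". The following is only a minimal sketch of how a client could turn such codes into readable text. The helper names winsock_error_text and format_connect_error are hypothetical and are not part of libpq, and the table covers only a handful of well-known codes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libpq): map a few well-known Winsock
 * error codes to readable text.  Returns NULL for codes it does not know.
 */
const char *winsock_error_text(int code)
{
    switch (code)
    {
        case 10051: return "Network is unreachable";    /* WSAENETUNREACH */
        case 10054: return "Connection reset by peer";  /* WSAECONNRESET */
        case 10060: return "Connection timed out";      /* WSAETIMEDOUT */
        case 10061: return "Connection refused";        /* WSAECONNREFUSED */
        case 10065: return "No route to host";          /* WSAEHOSTUNREACH */
        default:    return NULL;
    }
}

/*
 * Hypothetical helper: build a connect() failure message, falling back to
 * the bare numeric code when no text is known for it.
 */
void format_connect_error(char *buf, size_t buflen, int code)
{
    const char *text = winsock_error_text(code);

    if (text != NULL)
        snprintf(buf, buflen, "connect() failed: %s", text);
    else
        snprintf(buf, buflen, "connect() failed: error %d", code);
}
```

With a lookup of this kind, the value obtained from getsockopt() could be passed through format_connect_error() before being shown to the user, so that both the blocking and non-blocking paths report a refused connection in the same words. Whether such a mapping belongs in libpq or in the Cygwin layer is left open here.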