Re: BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq
Date
Msg-id 474412.1749479193@sss.pgh.pa.us
In response to Re: BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq  (BATBAATAR Dorjpalam <htgn.dbat.95@gmail.com>)
Responses Re: BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq
List pgsql-bugs
BATBAATAR Dorjpalam <htgn.dbat.95@gmail.com> writes:
> I am sending a sample program to reproduce this phenomenon.

Thank you for the reproducer!  (For anyone following along at home,
it doesn't fail for me with the suggested "-i 200 -u 200" parameters,
but it does fail in most runs with "-i 1000 -u 200".)

After some playing around, I figured out that the trouble scenario
is like this:

* We have a bunch of data pending to be sent, and we try to pqFlush()
it.  OpenSSL reports SSL_ERROR_WANT_WRITE, and since we're in nonblocking
mode we just accept the failure-to-write and continue on.

* The app provides a bit more data to be sent, and we get to
pqPutMsgEnd(), which does this:

    if (conn->outCount >= 8192)
    {
        int            toSend = conn->outCount - (conn->outCount % 8192);

        if (pqSendSome(conn, toSend) < 0)
            return EOF;
        /* in nonblock mode, don't complain if unable to send it all */
    }

Because of rounding toSend down to an 8K multiple, we are asking
OpenSSL to send less than the previous pqFlush call asked to send.
That violates the SSL_write() API, and at least some of the time
it results in SSL_R_BAD_LENGTH.
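
To make the arithmetic concrete, here is a small standalone sketch.  It
is not libpq code: rounded_to_send() and retry_shrinks() are hypothetical
helpers that just mirror the expression quoted above, and they assume the
earlier pqFlush() offered the whole buffer and none of it was written.

```c
#include <assert.h>

#define FLUSH_CHUNK 8192		/* the 8K threshold used in pqPutMsgEnd() */

/*
 * Hypothetical helper: the length pqPutMsgEnd() would pass to pqSendSome(),
 * i.e. out_count rounded down to a multiple of 8K.  Returns 0 when the
 * buffer is still below the threshold and nothing would be sent.
 */
int
rounded_to_send(int out_count)
{
	assert(out_count >= 0);
	if (out_count < FLUSH_CHUNK)
		return 0;
	return out_count - (out_count % FLUSH_CHUNK);
}

/*
 * Hypothetical helper: returns 1 if, after an unfinished flush that offered
 * flushed_len bytes and "appended" more bytes queued since then, the next
 * request would be shorter than what was offered before.
 */
int
retry_shrinks(int flushed_len, int appended)
{
	int			to_send = rounded_to_send(flushed_len + appended);

	return to_send > 0 && to_send < flushed_len;
}
```

For instance, with 20000 bytes offered by the unfinished flush and 500
more bytes queued afterwards, outCount is 20500 and the rounded request
is 16384, which is less than the 20000 offered before.  With 5000 more
bytes queued instead, the rounded request is 24576 and does not shrink.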

As a quick cross-check I've been running with

diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index c14e3c95250..75593ef0f72 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -555,7 +555,7 @@ pqPutMsgEnd(PGconn *conn)
 
     if (conn->outCount >= 8192)
     {
-        int            toSend = conn->outCount - (conn->outCount % 8192);
+        int            toSend = conn->outCount;
 
         if (pqSendSome(conn, toSend) < 0)
             return EOF;

and that seems to prevent the failure.

The SSL_write docs say that you should neither increase nor
decrease the length during a repeat call after SSL_ERROR_WANT_WRITE,
but that seems to be a lie: increasing the length doesn't cause
any problems.  (We do use SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER,
and perhaps that affects this?)
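
The behavior observed here can be written down as a toy model.  This is
only an illustration of what this thread reports (a shorter retry fails,
an equal or longer one is accepted); it is not OpenSSL's implementation,
and ToyWriter, ToyResult and toy_write() are made-up names.  It ignores
partial writes and record boundaries, which the real library has to
handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the TLS layer's memory of an unfinished write. */
typedef struct
{
	size_t		pending_len;	/* length offered by the unfinished write,
								 * or 0 if no write is pending */
} ToyWriter;

typedef enum
{
	TOY_OK,
	TOY_WANT_WRITE,
	TOY_BAD_LENGTH
} ToyResult;

/*
 * Offer len bytes to the toy writer.  socket_ready says whether the
 * (imaginary) socket can accept data right now.
 */
ToyResult
toy_write(ToyWriter *w, size_t len, int socket_ready)
{
	assert(w != NULL);

	/* A retry that offers less than the interrupted call is rejected. */
	if (w->pending_len > 0 && len < w->pending_len)
		return TOY_BAD_LENGTH;

	if (!socket_ready)
	{
		w->pending_len = len;	/* remember what was offered */
		return TOY_WANT_WRITE;
	}

	w->pending_len = 0;			/* everything went out */
	return TOY_OK;
}
```

In this model, offering 20000 bytes to a busy socket and then retrying
with 16384 is rejected, while retrying with 20500 goes through.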

For a real fix, the narrowest answer would be to not round down
toSend if we are using an SSL connection.  I wonder though if
the round-down behavior is of any use with GSSAPI either, or
more generally if it's sensible for anything except a Unix-pipe
connection.
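
A minimal sketch of that narrowest answer, under stated assumptions:
choose_to_send() and its ssl_in_use flag are hypothetical names used only
for illustration, not existing libpq symbols, and a real patch might well
test something else (GSSAPI, or the socket type) instead.

```c
#include <assert.h>
#include <stdbool.h>

#define FLUSH_CHUNK 8192		/* the 8K threshold used in pqPutMsgEnd() */

/*
 * Hypothetical helper: decide how many bytes to hand to pqSendSome().
 * Below the threshold nothing is sent (returns 0).  On an SSL connection
 * the whole buffer is offered, so a later request can never be shorter
 * than an earlier unfinished one.  Otherwise the existing round-down to
 * an 8K multiple is kept.
 */
int
choose_to_send(int out_count, bool ssl_in_use)
{
	assert(out_count >= 0);

	if (out_count < FLUSH_CHUNK)
		return 0;

	if (ssl_in_use)
		return out_count;		/* no rounding: length never shrinks */

	return out_count - (out_count % FLUSH_CHUNK);
}
```

With 20500 bytes buffered this offers all 20500 bytes over SSL, and
16384 bytes otherwise.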

            regards, tom lane


