BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18907: SSL error: bad length failure during transfer data in pipeline mode with libpq
Msg-id 18907-d41b9bcf6f29edda@postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18907
Logged by:          Dorjpalam Batbaatar
Email address:      htgn.dbat.95@gmail.com
PostgreSQL version: 16.4
Operating system:   AlmaLinux 9
Description:

When libpq is used to transfer large amounts of data to the server in
pipeline mode (registering with COPY), the error "SSL error: bad length"
sometimes occurs. The call that most often fails is libpq's
PQsendQueryParams(). The PostgreSQL version is 16.4.
I looked into this, and the cause seems to be that OpenSSL's SSL_write()
is not being retried when it should be.
According to the OpenSSL documentation for SSL_write(), if SSL_get_error()
returns SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE, SSL_write() must be
called again with the same data.
https://docs.openssl.org/3.0/man3/SSL_write/#warnings
In libpq's message-sending function pqPutMsgEnd(PGconn *conn), if not all
of the data has been sent and the connection is in non-blocking mode, the
function simply returns. However, the code behind libpq's exported API
(e.g. PQsendQueryGuts(), called by PQsendQueryParams()) calls pqPutMsgEnd()
multiple times, so I think the data to be sent changes between calls.
In the situation above, the write needs to be retried with the same data,
but the error seems to occur because the send data has changed.
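
The retry rule and the suspected violation can be illustrated with a toy
model in C. This is only a sketch: toy_tls and toy_write() are hypothetical
stand-ins, not OpenSSL code, and the model takes the documentation's rule
literally (same buffer pointer and same length on retry), which may be
stricter than what the real library enforces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for the toy writer (not OpenSSL constants). */
enum toy_status { TOY_BAD_RETRY = -1, TOY_WANT_WRITE = 0, TOY_OK = 1 };

/* Hypothetical state of a TLS-like writer with one unfinished write. */
typedef struct {
    int         pending;      /* 1 if an earlier write must be retried */
    const char *pending_buf;  /* buffer passed to that unfinished write */
    size_t      pending_len;  /* length passed to that unfinished write */
    int         socket_ready; /* 1 if the simulated socket accepts data */
} toy_tls;

/*
 * Toy write: if an earlier call could not complete, the retry must pass
 * the same buffer and length; anything else is rejected, which stands in
 * for the "bad length" failure described in the report.
 */
static enum toy_status
toy_write(toy_tls *t, const char *buf, size_t len)
{
    assert(t != NULL);

    if (t->pending && (buf != t->pending_buf || len != t->pending_len))
        return TOY_BAD_RETRY;

    if (!t->socket_ready)
    {
        /* Remember the arguments so that the retry can be checked. */
        t->pending = 1;
        t->pending_buf = buf;
        t->pending_len = len;
        return TOY_WANT_WRITE;
    }

    /* The write went through; nothing is pending any more. */
    t->pending = 0;
    t->pending_buf = NULL;
    t->pending_len = 0;
    return TOY_OK;
}
```

In this model, a call that returns TOY_WANT_WRITE followed by a call with a
different length returns TOY_BAD_RETRY, while repeating the call with the
same arguments once the socket is ready returns TOY_OK.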
As a test, I changed pqSendSome() to retry when pqsecure_write() returns 0,
and with that change pipeline mode ran without errors. pqSendSome() is
called from pqPutMsgEnd(PGconn *conn), and it in turn calls
pqsecure_write(), which performs the SSL_write().
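
The effect of that test change can be sketched with a small stand-alone
model. toy_conn, toy_secure_write() and toy_send_some() are hypothetical
names, not libpq functions; the retry_on_zero flag stands for the patched
behaviour, and the waiting and reading that the real code does with pqWait()
and pqReadData() are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection whose first `stalls` writes cannot proceed. */
typedef struct {
    int    stalls;    /* how many more write attempts will return 0 */
    int    calls;     /* number of write attempts seen so far */
    size_t last_len;  /* length passed to the most recent attempt */
} toy_conn;

/* Toy write: returns 0 while stalled, otherwise accepts all `len` bytes. */
static int
toy_secure_write(toy_conn *c, const char *buf, size_t len)
{
    (void) buf;
    c->calls++;
    c->last_len = len;
    if (c->stalls > 0)
    {
        c->stalls--;
        return 0;
    }
    return (int) len;
}

/*
 * Toy send loop. Returns 0 when everything was sent, or 1 when data is
 * left unsent. With retry_on_zero = 0 it returns as soon as a write
 * yields 0; with retry_on_zero = 1 it repeats the write with the same
 * buffer and length until it succeeds.
 */
static int
toy_send_some(toy_conn *c, const char *buf, size_t len, int retry_on_zero)
{
    assert(c != NULL);

    while (len > 0)
    {
        int sent = toy_secure_write(c, buf, len);

        if (sent == 0)
        {
            if (!retry_on_zero)
                return 1;   /* caller may queue more data before retrying */
            continue;       /* real code would wait for the socket here */
        }
        buf += sent;
        len -= (size_t) sent;
    }
    return 0;
}
```

With two stalled writes, the non-retrying variant returns 1 after a single
attempt, whereas the retrying variant makes three attempts with an unchanged
length and then returns 0.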
Below is the patch I tried.
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 488f7d6e55..bbafb189c9 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -914,22 +914,43 @@ pqSendSome(PGconn *conn, int len)
                         * Note that errors here don't result in write_failed becoming
                         * set.
                         */
-                       if (pqReadData(conn) < 0)
+                       if (sent > 0)
                        {
-                               result = -1;    /* error message already set up */
-                               break;
-                       }
+                               if (pqReadData(conn) < 0)
+                               {
+                                       result = -1;    /* error message already set up */
+                                       break;
+                               }
-                       if (pqIsnonblocking(conn))
-                       {
-                               result = 1;
-                               break;
-                       }
+                               if (pqIsnonblocking(conn))
+                               {
+                                       result = 1;
+                                       break;
+                               }
-                       if (pqWait(true, true, conn))
+                               if (pqWait(true, true, conn))
+                               {
+                                       result = -1;
+                                       break;
+                               }
+                       }
+                       else
                        {
-                               result = -1;
-                               break;
+                               /*
+                                * When sent is 0, retry the write. Before writing again, read
+                                * any responses that have arrived from the server.
+                                */
+                               if (pqWait(true, true, conn))
+                               {
+                                       result = -1;
+                                       break;
+                               }
+
+                               if (pqReadData(conn) < 0)
+                               {
+                                       result = -1;    /* error message already set up */
+                                       break;
+                               }
                        }
                }
        }

