Thread: disabled SSL log_like tests

disabled SSL log_like tests

From
Andrew Dunstan
Date:
Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to 
a timing issue. Nothing seems to have been done since ... do we still 
want to keep these commented out lines around? This "temporary" fix 
seems to have stretched the definition of that term more than somewhat.

(noticed when looking into a different issue).


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to 
> a timing issue. Nothing seems to have been done since ... do we still 
> want to keep these commented out lines around? This "temporary" fix 
> seems to have stretched the definition of that term more than somewhat.

Per my coffee cup, "Nothing is as permanent as a temporary fix".

However, I wonder whether Andres' work at 8b886a4e3 could be used
to improve this, either directly or as inspiration?

            regards, tom lane



Re: disabled SSL log_like tests

From
Andrew Dunstan
Date:
On 2025-04-17 Th 10:56 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to
>> a timing issue. Nothing seems to have been done since ... do we still
>> want to keep these commented out lines around? This "temporary" fix
>> seems to have stretched the definition of that term more than somewhat.
> Per my coffee cup, "Nothing is as permanent as a temporary fix".
>
> However, I wonder whether Andres' work at 8b886a4e3 could be used
> to improve this, either directly or as inspiration?
>
>             



I don't think so - these tests are about checking log file contents, not 
a psql problem.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2025-04-17 Th 10:56 AM, Tom Lane wrote:
>> However, I wonder whether Andres' work at 8b886a4e3 could be used
>> to improve this, either directly or as inspiration?

> I don't think so - these tests are about checking log file contents, not 
> a psql problem.

Well, I was mainly leaning on the "inspiration" part: the idea
is to wait for something that must come out after the text we
want to look for.  After digging around a bit I noticed that
002_connection_limits.pl's connect_fails_wait is doing something
that's almost what we want: that test cranks log_min_messages
up to DEBUG2 and then waits for the postmaster's report of
backend exit before believing that it's done.

Awhile later I had the attached patch.  Some notes:

* The commented-out tests in 001_ssltests.pl contained hard-wired
expectations as to certificate serial numbers, which are obsolete now.
I just replaced them with "\d+", but if anyone feels like that's not
good enough, we could continue to check for exact serial numbers and
eat the ensuing maintenance effort.

* I was more than slightly surprised to find that there are a bunch of
other connect_fails callers that are testing log_like or log_unlike
and thereby risking the same type of race condition.  Some of those
tests are relatively new and perhaps just haven't failed *yet*,
but I wonder if we changed something since 2022 that solves this
problem in a different way?  Anyway, after this change any such
caller must set log_min_messages = debug2 or fail.  I think I got
all the relevant test scripts in the attached.

            regards, tom lane

diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 756b4146050..37d96d95a1a 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -66,6 +66,8 @@ sub test_conn
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 # Test behavior of log_connections GUC
diff --git a/src/test/authentication/t/003_peer.pl b/src/test/authentication/t/003_peer.pl
index 69ba73bd2b9..2879800eacf 100644
--- a/src/test/authentication/t/003_peer.pl
+++ b/src/test/authentication/t/003_peer.pl
@@ -72,6 +72,8 @@ sub test_role
 my $node = PostgreSQL::Test::Cluster->new('node');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 # Set pg_hba.conf with the peer authentication.
diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 6748b109dec..2dc6bec9b89 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -66,6 +66,7 @@ $node->append_conf(
 listen_addresses = '$hostaddr'
 krb_server_keyfile = '$krb->{keytab}'
 log_connections = on
+log_min_messages = debug2
 lc_messages = 'C'
 });
 $node->start;
diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl
index 352b0fc1fa7..d1315ed5351 100644
--- a/src/test/ldap/t/001_auth.pl
+++ b/src/test/ldap/t/001_auth.pl
@@ -48,6 +48,8 @@ note "setting up PostgreSQL instance";
 my $node = PostgreSQL::Test::Cluster->new('node');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 $node->safe_psql('postgres', 'CREATE USER test0;');
diff --git a/src/test/modules/oauth_validator/t/001_server.pl b/src/test/modules/oauth_validator/t/001_server.pl
index d88994abc24..4f035417a40 100644
--- a/src/test/modules/oauth_validator/t/001_server.pl
+++ b/src/test/modules/oauth_validator/t/001_server.pl
@@ -48,6 +48,8 @@ $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
 $node->append_conf('postgresql.conf',
     "oauth_validator_libraries = 'validator'\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 $node->safe_psql('postgres', 'CREATE USER test;');
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 8759ed2cbba..7877cc8d994 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2624,7 +2624,12 @@ If this regular expression is set, matches it with the output generated.

 =item log_unlike => [ qr/prohibited message/ ]

-See C<log_check(...)>.
+See C<log_check(...)>.  CAUTION: use of either option requires that
+the server's log_min_messages be at least DEBUG2, and that no other
+client backend is launched concurrently.  These requirements allow
+C<connect_fails> to wait to see the postmaster-log report of backend
+exit, without which there is a race condition as to whether we will
+see the expected backend log output.

 =back

@@ -2652,7 +2657,14 @@ sub connect_fails
         like($stderr, $params{expected_stderr}, "$test_name: matches");
     }

-    $self->log_check($test_name, $log_location, %params);
+    if (defined($params{log_like}) or defined($params{log_unlike}))
+    {
+        $self->wait_for_log(
+            qr/DEBUG:  (00000: )?client backend.*exited with exit code \d/,
+            $log_location);
+
+        $self->log_check($test_name, $log_location, %params);
+    }
 }

 =pod
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 086abf3b8b3..8b0de2d8e7e 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -60,6 +60,8 @@ my $common_connstr;
 note "setting up data directory";
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init;
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");

 # PGHOST is enforced here to set up the node, subsequent connections
 # will use a dedicated connection string.
@@ -807,10 +809,8 @@ $node->connect_fails(
     expected_stderr =>
       qr/certificate authentication failed for user "anotheruser"/,
     # certificate authentication should be logged even on failure
-    # temporarily(?) skip this check due to timing issue
-    #    log_like =>
-    #      [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],
-);
+    log_like =>
+      [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],);

 # revoked client cert
 $node->connect_fails(
@@ -818,11 +818,10 @@ $node->connect_fails(
       . sslkey('client-revoked.key'),
     "certificate authorization fails with revoked client cert",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ],
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ],
     # revoked certificates should not authenticate the user
     log_unlike => [qr/connection authenticated:/],);

@@ -872,24 +871,20 @@ $node->connect_fails(
     $common_connstr . " " . "sslmode=require sslcert=ssl/client.crt",
     "intermediate client certificate is missing",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656576,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ]);

 $node->connect_fails(
     "$common_connstr sslmode=require sslcert=ssl/client-long.crt "
       . sslkey('client-long.key'),
     "logged client certificate Subjects are truncated if they're too long",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject
"\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number 2315418733629425152, issuer
"/CN=TestCA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject
"\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number \d+, issuer "/CN=Test CA for
PostgreSQLSSL regression test client certs"}, 
+    ]);

 # Use an invalid cafile here so that the next test won't be able to verify the
 # client CA.
@@ -904,12 +899,10 @@ $node->connect_fails(
     "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt",
     "intermediate client certificate is untrusted",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number 2315134995201656577, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
+    ]);

 # test server-side CRL directory
 switch_server_cert(
@@ -923,12 +916,10 @@ $node->connect_fails(
       . sslkey('client-revoked.key'),
     "certificate authorization fails with revoked client cert with server-side CRL directory",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ]);

 # revoked client cert, non-ASCII subject
 $node->connect_fails(
@@ -936,11 +927,9 @@ $node->connect_fails(
       . sslkey('client-revoked-utf8.key'),
     "certificate authorization fails with revoked UTF-8 client cert with server-side CRL directory",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject
"/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number
2315420958437414144,issuer "/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject
"/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number \d+, issuer
"/CN=TestCA for PostgreSQL SSL regression test client certs"}, 
+    ]);

 done_testing();

Re: disabled SSL log_like tests

From
Jacob Champion
Date:
On Fri, Apr 18, 2025 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> * The commented-out tests in 001_ssltests.pl contained hard-wired
> expectations as to certificate serial numbers, which are obsolete now.
> I just replaced them with "\d+", but if anyone feels like that's not
> good enough, we could continue to check for exact serial numbers and
> eat the ensuing maintenance effort.

No, I think \d+ should be fine.

(I haven't stared closely at the racing code yet, but I like the
concept of the patch.)

Thanks!
--Jacob



Re: disabled SSL log_like tests

From
Tom Lane
Date:
I wrote:
> * I was more than slightly surprised to find that there are a bunch of
> other connect_fails callers that are testing log_like or log_unlike
> and thereby risking the same type of race condition.  Some of those
> tests are relatively new and perhaps just haven't failed *yet*,
> but I wonder if we changed something since 2022 that solves this
> problem in a different way?

Nope, apparently not.  The failure depends on the kernel's scheduler,
so unsurprisingly it's quite platform-dependent.  But I can reproduce
it reliably on mamba's host (NetBSD 10/macppc) if I enable the
001_ssltests.pl log_like tests without applying the connect_fails
changes.

The fact that mamba hasn't been failing on the other affected
tests is a bit puzzling.  mamba isn't running kerberos or ldap
or oauth_validator, so the lack of failures there is expected.
authentication/t/001_password.pl appears accidentally not vulnerable:
it's not using log_like in any connect_fails tests.  (It is using
log_unlike, so those tests could be giving silent false negatives.)
But authentication/t/003_peer.pl has 8 test cases that look
vulnerable.  I guess there must be some extra dollop of timing
weirdness in the ssl tests that's not there in 003_peer.pl.

Unfortunately ... it sometimes fails even with the connect_fails
changes, for example

#   Failed test 'intermediate client certificate is missing: log matches'
#   at /home/tgl/pgsql/src/test/ssl/../../../src/test/perl/PostgreSQL/Test/Cluster.pm line 2666.
#                   '2025-04-18 17:59:19.358 EDT [1460] DEBUG:  assigned pm child slot 2 for backend
# 2025-04-18 17:59:19.359 EDT [1460] DEBUG:  forked new client backend, pid=599 socket=8
# 2025-04-18 17:59:19.369 EDT [599] [unknown] LOG:  connection received: host=localhost port=63709
# 2025-04-18 17:59:19.436 EDT [1460] DEBUG:  releasing pm child slot 1
# 2025-04-18 17:59:19.436 EDT [1460] DEBUG:  client backend (PID 25401) exited with exit code 0
# '
#     doesn't match '(?^:Client certificate verification failed at depth 0: unable to get local issuer certificate)'

What I think happened here is that a previous backend hadn't exited
yet when we start the test, and when its report does come out,
connect_fails prematurely stops waiting.  (In the above, evidently
the child process we want to wait for is 599, but we're fooled by
a delayed report for 25401.)  So my v1 patch needs work.

Maybe we can make the test compare the PIDs in the "forked new client
backend" and "client backend exited" log messages.  Stay tuned...

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
I wrote:
> What I think happened here is that a previous backend hadn't exited
> yet when we start the test, and when its report does come out,
> connect_fails prematurely stops waiting.  (In the above, evidently
> the child process we want to wait for is 599, but we're fooled by
> a delayed report for 25401.)  So my v1 patch needs work.

> Maybe we can make the test compare the PIDs in the "forked new client
> backend" and "client backend exited" log messages.  Stay tuned...

Okay, this version seems more reliable.

            regards, tom lane

diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 756b4146050..37d96d95a1a 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -66,6 +66,8 @@ sub test_conn
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 # Test behavior of log_connections GUC
diff --git a/src/test/authentication/t/003_peer.pl b/src/test/authentication/t/003_peer.pl
index 69ba73bd2b9..2879800eacf 100644
--- a/src/test/authentication/t/003_peer.pl
+++ b/src/test/authentication/t/003_peer.pl
@@ -72,6 +72,8 @@ sub test_role
 my $node = PostgreSQL::Test::Cluster->new('node');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 # Set pg_hba.conf with the peer authentication.
diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 6748b109dec..2dc6bec9b89 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -66,6 +66,7 @@ $node->append_conf(
 listen_addresses = '$hostaddr'
 krb_server_keyfile = '$krb->{keytab}'
 log_connections = on
+log_min_messages = debug2
 lc_messages = 'C'
 });
 $node->start;
diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl
index 352b0fc1fa7..d1315ed5351 100644
--- a/src/test/ldap/t/001_auth.pl
+++ b/src/test/ldap/t/001_auth.pl
@@ -48,6 +48,8 @@ note "setting up PostgreSQL instance";
 my $node = PostgreSQL::Test::Cluster->new('node');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 $node->safe_psql('postgres', 'CREATE USER test0;');
diff --git a/src/test/modules/oauth_validator/t/001_server.pl b/src/test/modules/oauth_validator/t/001_server.pl
index d88994abc24..4f035417a40 100644
--- a/src/test/modules/oauth_validator/t/001_server.pl
+++ b/src/test/modules/oauth_validator/t/001_server.pl
@@ -48,6 +48,8 @@ $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
 $node->append_conf('postgresql.conf',
     "oauth_validator_libraries = 'validator'\n");
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");
 $node->start;

 $node->safe_psql('postgres', 'CREATE USER test;');
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 8759ed2cbba..4b858900139 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2618,13 +2618,19 @@ to fail.

 =item expected_stderr => B<value>

-If this regular expression is set, matches it with the output generated.
+If this regular expression is set, matches it to the output generated
+by B<psql>.

 =item log_like => [ qr/required message/ ]

 =item log_unlike => [ qr/prohibited message/ ]

-See C<log_check(...)>.
+See C<log_check(...)>.  CAUTION: use of either option requires that
+the server's log_min_messages be at least DEBUG2, and that no other
+client backend is launched concurrently.  These requirements allow
+C<connect_fails> to wait to see the postmaster-log report of backend
+exit, without which there is a race condition as to whether we will
+see the expected backend log output.

 =back

@@ -2652,7 +2658,14 @@ sub connect_fails
         like($stderr, $params{expected_stderr}, "$test_name: matches");
     }

-    $self->log_check($test_name, $log_location, %params);
+    if (defined($params{log_like}) or defined($params{log_unlike}))
+    {
+        $self->wait_for_log(
+            qr/DEBUG:  (?:00000: )?forked new client backend, pid=(\d+) socket.*DEBUG:  (?:00000: )?client backend
\(PID\1\) exited with exit code \d/s, 
+            $log_location);
+
+        $self->log_check($test_name, $log_location, %params);
+    }
 }

 =pod
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 086abf3b8b3..8b0de2d8e7e 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -60,6 +60,8 @@ my $common_connstr;
 note "setting up data directory";
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init;
+# Needed to allow connect_fails to inspect postmaster log:
+$node->append_conf('postgresql.conf', "log_min_messages = debug2");

 # PGHOST is enforced here to set up the node, subsequent connections
 # will use a dedicated connection string.
@@ -807,10 +809,8 @@ $node->connect_fails(
     expected_stderr =>
       qr/certificate authentication failed for user "anotheruser"/,
     # certificate authentication should be logged even on failure
-    # temporarily(?) skip this check due to timing issue
-    #    log_like =>
-    #      [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],
-);
+    log_like =>
+      [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],);

 # revoked client cert
 $node->connect_fails(
@@ -818,11 +818,10 @@ $node->connect_fails(
       . sslkey('client-revoked.key'),
     "certificate authorization fails with revoked client cert",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ],
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ],
     # revoked certificates should not authenticate the user
     log_unlike => [qr/connection authenticated:/],);

@@ -872,24 +871,20 @@ $node->connect_fails(
     $common_connstr . " " . "sslmode=require sslcert=ssl/client.crt",
     "intermediate client certificate is missing",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656576,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ]);

 $node->connect_fails(
     "$common_connstr sslmode=require sslcert=ssl/client-long.crt "
       . sslkey('client-long.key'),
     "logged client certificate Subjects are truncated if they're too long",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject
"\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number 2315418733629425152, issuer
"/CN=TestCA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject
"\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number \d+, issuer "/CN=Test CA for
PostgreSQLSSL regression test client certs"}, 
+    ]);

 # Use an invalid cafile here so that the next test won't be able to verify the
 # client CA.
@@ -904,12 +899,10 @@ $node->connect_fails(
     "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt",
     "intermediate client certificate is untrusted",
     expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number 2315134995201656577, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
+        qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
+    ]);

 # test server-side CRL directory
 switch_server_cert(
@@ -923,12 +916,10 @@ $node->connect_fails(
       . sslkey('client-revoked.key'),
     "certificate authorization fails with revoked client cert with server-side CRL directory",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577,
issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA
forPostgreSQL SSL regression test client certs"}, 
+    ]);

 # revoked client cert, non-ASCII subject
 $node->connect_fails(
@@ -936,11 +927,9 @@ $node->connect_fails(
       . sslkey('client-revoked-utf8.key'),
     "certificate authorization fails with revoked UTF-8 client cert with server-side CRL directory",
     expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|,
-    # temporarily(?) skip this check due to timing issue
-    #    log_like => [
-    #        qr{Client certificate verification failed at depth 0: certificate revoked},
-    #        qr{Failed certificate data \(unverified\): subject
"/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number
2315420958437414144,issuer "/CN=Test CA for PostgreSQL SSL regression test client certs"}, 
-    #    ]
-);
+    log_like => [
+        qr{Client certificate verification failed at depth 0: certificate revoked},
+        qr{Failed certificate data \(unverified\): subject
"/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number \d+, issuer
"/CN=TestCA for PostgreSQL SSL regression test client certs"}, 
+    ]);

 done_testing();

Re: disabled SSL log_like tests

From
Andrew Dunstan
Date:
On 2025-04-18 Fr 7:26 PM, Tom Lane wrote:
> I wrote:
>> What I think happened here is that a previous backend hadn't exited
>> yet when we start the test, and when its report does come out,
>> connect_fails prematurely stops waiting.  (In the above, evidently
>> the child process we want to wait for is 599, but we're fooled by
>> a delayed report for 25401.)  So my v1 patch needs work.
>> Maybe we can make the test compare the PIDs in the "forked new client
>> backend" and "client backend exited" log messages.  Stay tuned...
> Okay, this version seems more reliable.
>

+See C<log_check(...)>.  CAUTION: use of either option requires that
+the server's log_min_messages be at least DEBUG2, and that no other
+client backend is launched concurrently.  These requirements allow
+C<connect_fails> to wait to see the postmaster-log report of backend
+exit, without which there is a race condition as to whether we will
+see the expected backend log output.


That seems a little fragile. I can imagine test authors easily 
forgetting this. Is it worth sanity checking to make sure 
log_min_messages is appropriately set?


cheers


andrew




>
--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2025-04-18 Fr 7:26 PM, Tom Lane wrote:
> +See C<log_check(...)>.  CAUTION: use of either option requires that
> +the server's log_min_messages be at least DEBUG2, and that no other
> +client backend is launched concurrently.  These requirements allow
> +C<connect_fails> to wait to see the postmaster-log report of backend
> +exit, without which there is a race condition as to whether we will
> +see the expected backend log output.

> That seems a little fragile. I can imagine test authors easily 
> forgetting this. Is it worth sanity checking to make sure 
> log_min_messages is appropriately set?

Setting log_min_messages is not so easily forgotten, because
connect_fails will just hang until timeout if you didn't.

I'm more worried about the "no other backend" requirement.
I think v2 is reasonably proof against that, but whether it's
sufficiently bulletproof to withstand the buildfarm environment
remains to be seen.  I wish there were a better way to
determine the backend PID for a failed connection...

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
I wrote:
> The fact that mamba hasn't been failing on the other affected
> tests is a bit puzzling.  mamba isn't running kerberos or ldap
> or oauth_validator, so the lack of failures there is expected.
> authentication/t/001_password.pl appears accidentally not vulnerable:
> it's not using log_like in any connect_fails tests.  (It is using
> log_unlike, so those tests could be giving silent false negatives.)
> But authentication/t/003_peer.pl has 8 test cases that look
> vulnerable.  I guess there must be some extra dollop of timing
> weirdness in the ssl tests that's not there in 003_peer.pl.

After pushing this patch, the probable explanation for the "extra
timing weirdness" finally penetrated my brain: the error reports
we are looking for are cases that are detected by OpenSSL.  So
it's plausible that OpenSSL has told the connecting client to
go away before it returns the error indication to the backend's
calling code, which would be what would log the message.  That is
what provides the window for a race condition.  You still need
a bunch of assumptions about the kernel's subsequent scheduling
of psql and the TAP script versus the backend, but the whole
thing is certainly dependent on psql getting the word sooner
than the backend.

(Not all of the tests disabled by 55828a6b6 meet this description,
but a bunch of them do, and I believe that I just zapped every
log_like condition in the script rather than trying to be selective
about which ones were known to have failed.)

It seems at least plausible that the Kerberos tests could have
a similar problem.  I'm less sure about LDAP or OAuth.  But
authentication/t/003_peer.pl would not, because elog.c emits errors
to the log before sending them to the client.  So that explains
the lack of buildfarm reports from mamba, and it's unclear that
we have any other animals with the right timing behavior to
see this.

            regards, tom lane



Re: disabled SSL log_like tests

From
Thomas Munro
Date:
If you run the not-yet-enabled-by-default OpenBSD CI task on master,
ssl/001_ssltests fails in "intermediate client certificate is
untrusted", recently uncommented by commit e0f373ee.  I think it might
be telling us that LibreSSL's x509_store_ctx_get_current_cert() is
giving us the client certificate (ie chain depth 0) instead of the
intermediate certificate, even though X509_STORE_CTX_get_error_depth()
returned 1 as expected.  I don't know why it would do that, given the
documentation:

X509_STORE_CTX_get_current_cert() returns the certificate in ctx which
caused the error or NULL if no certificate is relevant.

The explanation is probably in here somewhere, but I don't understand
these things:

https://github.com/openbsd/src/blob/master/lib/libcrypto/x509/x509_vfy.c
https://github.com/openssl/openssl/blob/master/crypto/x509/x509_vfy.c

[17:55:28.888] #   Failed test 'intermediate client certificate is
untrusted: log matches'
[17:55:28.888] #   at
/home/postgres/postgres/src/test/perl/PostgreSQL/Test/Cluster.pm line
2667.
[17:55:28.888] #                   '2025-05-05 17:55:28.353 UTC
[10009][postmaster] DEBUG:  assigned pm child slot 1 for backend
[17:55:28.888] # 2025-05-05 17:55:28.354 UTC [10009][postmaster]
DEBUG:  forked new client backend, pid=27624 socket=8
[17:55:28.888] # 2025-05-05 17:55:28.355 UTC [27624][not initialized]
[[unknown]][:0] LOG:  connection received: host=localhost port=11357
[17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized]
[[unknown]][:0] LOG:  could not accept SSL connection: certificate
verify failed
[17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized]
[[unknown]][:0] DETAIL:  Client certificate verification failed at
depth 1: unable to get local issuer certificate.
[17:55:28.888] #     Failed certificate data (unverified): subject
"/CN=ssltestuser", serial number 2315702411956921344, issuer "/CN=Test
CA for PostgreSQL SSL regression test client certs".
[17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized]
[[unknown]][:0] DEBUG:  SSL connection from DN:"(anonymous)"
CN:"(anonymous)"
[17:55:28.888] # 2025-05-05 17:55:28.377 UTC [10009][postmaster]
DEBUG:  releasing pm child slot 1
[17:55:28.888] # 2025-05-05 17:55:28.377 UTC [10009][postmaster]
DEBUG:  client backend (PID 27624) exited with exit code 0
[17:55:28.888] # '
[17:55:28.888] #     doesn't match '(?^:Failed certificate data
\(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression
test client certs", serial number \d+, issuer "/CN=Test root CA for
PostgreSQL SSL regression test suite")'
[17:55:28.888] # Looks like you failed 1 test of 240.

https://cirrus-ci.com/task/4708964002168832?logs=test_world#L345

https://api.cirrus-ci.com/v1/artifact/task/4708964002168832/testrun/build/testrun/ssl/001_ssltests/log/regress_log_001_ssltests

https://api.cirrus-ci.com/v1/artifact/task/4708964002168832/testrun/build/testrun/ssl/001_ssltests/log/001_ssltests_primary.log



Re: disabled SSL log_like tests

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> If you run the not-yet-enabled-by-default OpenBSD CI task on master,
> ssl/001_ssltests fails in "intermediate client certificate is
> untrusted", recently uncommented by commit e0f373ee.

Yeah, I see that too.  But I also see three failures in 002_scram.pl,
which presumably were there before e0f373ee.  (Tested on OpenBSD 7.6
and 7.7.)  The buildfarm's OpenBSD animals haven't caught this
because they don't run this test suite :-(.  Yes they build with
--with-openssl, but one of them lacks --enable-tap-tests and the
other two aren't filling PG_TEST_EXTRA.

The SCRAM failures are a bit discouraging ...

[18:16:33.259](0.565s) not ok 26 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss'
[18:16:33.261](0.002s)
[18:16:33.261](0.000s) #   Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses
'rsassaPss''
#   at t/002_scram.pl line 161.
[18:16:33.262](0.001s) #          got: '2'
#     expected: '0'
[18:16:33.264](0.002s) not ok 27 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss': no
stderr
[18:16:33.265](0.001s)
[18:16:33.265](0.000s) #   Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses
'rsassaPss':no stderr' 
#   at t/002_scram.pl line 161.
[18:16:33.266](0.001s) #          got: 'psql: error: connection to server at "127.0.0.1", port 10442 failed: SSL error:
sslv3alert handshake failure' 
#     expected: ''
[18:16:33.268](0.002s) not ok 28 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss': log
matches
[18:16:33.269](0.001s)
[18:16:33.269](0.000s) #   Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses
'rsassaPss':log matches' 
#   at /home/tgl/pgsql/src/test/ssl/../../../src/test/perl/PostgreSQL/Test/Cluster.pm line 2607.
[18:16:33.270](0.001s) #                   '2025-05-05 18:16:33.222 EDT [71478] [unknown] LOG:  connection received:
host=localhostport=42632 
# 2025-05-05 18:16:33.244 EDT [71478] [unknown] LOG:  could not accept SSL connection: missing rsa certificate
# '
#     doesn't match '(?^:connection authenticated: identity="ssltestuser" method=scram-sha-256)'

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
I wrote:
> Yeah, I see that too.  But I also see three failures in 002_scram.pl,
> which presumably were there before e0f373ee.  (Tested on OpenBSD 7.6
> and 7.7.)  The buildfarm's OpenBSD animals haven't caught this
> because they don't run this test suite :-(.

I dug into this with gdb, and it seems like it's an omission in
LibreSSL's RSA-PSS support.  We're requesting a signature algorithm
with code 2054, which in their source is SIGALG_RSA_PSS_RSAE_SHA512,
and that leads them to this sigalgs[] table entry in
lib/libssl/ssl_sigalgs.c:

    {
        .value = SIGALG_RSA_PSS_RSAE_SHA512,
        .key_type = EVP_PKEY_RSA,
        .md = EVP_sha512,
        .security_level = 5,
        .flags = SIGALG_FLAG_RSA_PSS,
    },

The problem is that the private key has key_type EVP_PKEY_RSA_PSS
which is not equal to EVP_PKEY_RSA, so ssl_sigalg_pkey_ok() in the
same file rejects this entry as not compatible.  In fact, there
are *no* entries in sigalgs[] that can match a PSS private key
according to ssl_sigalg_pkey_ok().  So while perhaps
SIGALG_RSA_PSS_RSAE_SHA512 is the wrong thing for us to request,
I do not see a right thing.

Looking around a little more, there are places like
SSL_get_signature_type_nid() that do things like this:

    *nid = sigalg->key_type;
    if (sigalg->key_type == EVP_PKEY_RSA &&
        (sigalg->flags & SIGALG_FLAG_RSA_PSS))
        *nid = EVP_PKEY_RSA_PSS;

So it seems like this might be a simple oversight in
ssl_sigalg_pkey_ok(), which doesn't make any such correction:

    if (sigalg->key_type != EVP_PKEY_id(pkey))
        return 0;

Anyone know anything about where to submit LibreSSL bugs?

            regards, tom lane



Re: disabled SSL log_like tests

From
Thomas Munro
Date:
On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So it seems like this might be a simple oversight in
> ssl_sigalg_pkey_ok(), which doesn't make any such correction:
>
>         if (sigalg->key_type != EVP_PKEY_id(pkey))
>                 return 0;

Nice detective work.

> Anyone know anything about where to submit LibreSSL bugs?

I think it's done with sendbug on an OpenBSD box, or perhaps you can
just write a normal email to the bugs@openbsd.org or
libressl@openbsd.org list, based on:

https://www.openbsd.org/mail.html



Re: disabled SSL log_like tests

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Anyone know anything about where to submit LibreSSL bugs?

> I think it's done with sendbug on an OpenBSD box, or perhaps you can
> just write a normal email to the bugs@openbsd.org or
> libressl@openbsd.org list, based on:
> https://www.openbsd.org/mail.html

Thanks, I'll look into reporting it tomorrow.  In the meantime,
I couldn't help noticing that the backtraces went through
lib/libssl/tls13_legacy.c, which doesn't give a warm feeling
about how supported they think our usage is (and perhaps also
explains why they didn't detect this bug themselves).  This is
evidently because we set up the SSL context with SSLv23_method(),
per this comment in be_tls_init():

     * We use SSLv23_method() because it can negotiate use of the highest
     * mutually supported protocol version, while alternatives like
     * TLSv1_2_method() permit only one specific version.  Note that we don't
     * actually allow SSL v2 or v3, only TLS protocols (see below).

This choice seems to be more than 20 years old, though the above
comment defending it dates only to 2014.  I wonder if it's time to
revisit that idea.

            regards, tom lane



Re: disabled SSL log_like tests

From
Thomas Munro
Date:
On Wed, May 7, 2025 at 4:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thanks, I'll look into reporting it tomorrow.  In the meantime,
> I couldn't help noticing that the backtraces went through
> lib/libssl/tls13_legacy.c, which doesn't give a warm feeling
> about how supported they think our usage is (and perhaps also
> explains why they didn't detect this bug themselves).  This is
> evidently because we set up the SSL context with SSLv23_method(),
> per this comment in be_tls_init():

Oh, interesting.  I also wondered if the problem I reported might be
related to the separate legacy code paths in x509_vfy.c.



Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 7 May 2025, at 06:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thomas Munro <thomas.munro@gmail.com> writes:
>> On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Anyone know anything about where to submit LibreSSL bugs?
>
>> I think it's done with sendbug on an OpenBSD box, or perhaps you can
>> just write a normal email to the bugs@openbsd.org or
>> libressl@openbsd.org list, based on:
>> https://www.openbsd.org/mail.html

Bugs are frequently reported, and dealt with, on the libressl mailing list.

> Thanks, I'll look into reporting it tomorrow.  In the meantime,
> I couldn't help noticing that the backtraces went through
> lib/libssl/tls13_legacy.c, which doesn't give a warm feeling
> about how supported they think our usage is (and perhaps also
> explains why they didn't detect this bug themselves).  This is
> evidently because we set up the SSL context with SSLv23_method(),
> per this comment in be_tls_init():
>
>     * We use SSLv23_method() because it can negotiate use of the highest
>     * mutually supported protocol version, while alternatives like
>     * TLSv1_2_method() permit only one specific version.  Note that we don't
>     * actually allow SSL v2 or v3, only TLS protocols (see below).
>
> This choice seems to be more than 20 years old, though the above
> comment defending it dates only to 2014.  I wonder if it's time to
> revisit that idea.

The use of SSLv23_method() comes from us supporting 1.0.2, the replacement
TLS_method() was introduced in 1.1.0 and SSLv23_method() was turned into an
alias of TLS_method() in OpenSSL commit 32ec41539b5.

Since we no longer support 1.0.2 we can apply something like the (lightly
tested) attached which should be a no-op as we already use TLS_method() but via
an alias.

There's likely more OpenSSL code we can modernize, hopefully we can make some
progress on that during the v19 cycle.

--
Daniel Gustafsson


Attachment

Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
>> On 7 May 2025, at 06:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I couldn't help noticing that the backtraces went through
>> lib/libssl/tls13_legacy.c, which doesn't give a warm feeling
>> about how supported they think our usage is (and perhaps also
>> explains why they didn't detect this bug themselves).

> Since we no longer support 1.0.2 we can apply something like the (lightly
> tested) attached which should be a no-op as we already use TLS_method() but via
> an alias.

Yeah, I saw that SSLv23_method() was merely an alias for TLS_method()
in LibreSSL as well.  That means unfortunately that your proposal is
just cosmetic and doesn't get us out of using code that they're
calling "legacy".  I wonder what it would take to get to the "modern"
code paths.

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
I filed a report with the LibreSSL folks, and the answer was basically
"yeah, we know RSA-PSS doesn't work; there are multiple things that
need fixing" [1].  So it seems like we'd better find a way to skip
that test case when using LibreSSL.  Not sure what the most
appropriate check would look like.

            regards, tom lane

[1] https://marc.info/?l=libressl&m=174664225002441&w=2



Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 7 May 2025, at 21:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I filed a report with the LibreSSL folks, and the answer was basically
> "yeah, we know RSA-PSS doesn't work; there are multiple things that
> need fixing" [1].  So it seems like we'd better find a way to skip
> that test case when using LibreSSL.  Not sure what the most
> appropriate check would look like.

I haven't looked at the test in question yet, but we do skip some SSL tests if
running against libressl already so I assume this will be able to follow the
same pattern.

--
Daniel Gustafsson




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Oh, interesting.  I also wondered if the problem I reported might be
> related to the separate legacy code paths in x509_vfy.c.

I poked at this a little more, and apparently the code to look at is
actually in the sibling file x509_verify.c.  Here's the backtrace
(in OpenBSD 7.7) from where we construct the incorrect report:

#0  verify_cb (ok=0, ctx=0x17d83e77c00) at be-secure-openssl.c:1213
#1  0x0000017ce17483dc in x509_verify (ctx=0x17d1886f700, leaf=0x17d188882a0, 
    name=0x0) at /usr/src/lib/libcrypto/x509/x509_verify.c:1271
#2  0x0000017ce16ed683 in _lcry_X509_verify_cert (ctx=0x17d83e77c00)
    at /usr/src/lib/libcrypto/x509/x509_vfy.c:645
#3  0x0000017d44ebc4e3 in ssl_verify_cert_chain (s=0x17d83e96500, 
    certs=0x17d83e5fce0) at /usr/src/lib/libssl/ssl_cert.c:437
#4  0x0000017d44eb3d13 in tls13_client_certificate_recv (ctx=0x17d18881f00, 
    cbs=<optimized out>) at /usr/src/lib/libssl/tls13_server.c:918
#5  0x0000017d44ee8d66 in tls13_handshake_recv_action (ctx=0x17d18881f00, 
    action=0x17d44ef6dd0 <state_machine+352>)
    at /usr/src/lib/libssl/tls13_handshake.c:549
#6  tls13_handshake_perform (ctx=0x17d18881f00)
    at /usr/src/lib/libssl/tls13_handshake.c:412
#7  0x0000017d44eb18c7 in tls13_legacy_accept (ssl=0x17d83e96500)
    at /usr/src/lib/libssl/tls13_legacy.c:434
#8  0x0000017a8bbd6f3a in be_tls_open_server (port=0x17d6d91f9b0)
    at be-secure-openssl.c:494
#9  0x0000017a8bbc41c8 in secure_open_server (port=0x17d6d91f9b0)
    at be-secure.c:136
#10 0x0000017a8be9d79e in ProcessStartupPacket (port=0x17d6d91f9b0, 
    ssl_done=false, gss_done=false) at backend_startup.c:614

The code in x509_verify() is about as convoluted as can be:
it seems to be dealing with two different verify-callback APIs
(and naturally, the one we are using is the legacy one).
There is some stuff there that saves and restores an error
condition, but fails to save/restore current_cert, so the first
thing I thought was that that was the problem --- but it's not,
the current_cert doesn't change anyway between the save and the
restore.  So I failed to identify exactly where the problem is,
but I wonder if it'd behave better if we switched to their
new-style callback API.

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> I haven't looked at the test in question yet, but we do skip some SSL tests if
> running against libressl already so I assume this will be able to follow the
> same pattern.

Ah, thanks for the tip.  I propose the attached, which disables the
RSA-PSS test altogether on LibreSSL, and modifies the
intermediate-cert test to accept the result we're actually getting
on LibreSSL.  We could revert that one if anyone can figure out
how to make it better, but I don't wish to put any more time into
it myself.

            regards, tom lane

diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 8b0de2d8e7e..ccf8ef9bf2e 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -895,14 +895,30 @@ switch_server_cert(

 # intermediate CA is provided but doesn't have a trusted root (checks error
 # logging for cert chain depths > 0)
-$node->connect_fails(
-    "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt",
-    "intermediate client certificate is untrusted",
-    expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
-    log_like => [
-        qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
-        qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
-    ]);
+# As of 5/2025, LibreSSL reports a different cert as being at fault;
+# it's probably wrong, but seems to be their bug not ours
+if (!$libressl)
+{
+    $node->connect_fails(
+        "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt",
+        "intermediate client certificate is untrusted",
+        expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
+        log_like => [
+            qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
+            qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client
certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, 
+        ]);
+}
+else
+{
+    $node->connect_fails(
+        "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt",
+        "intermediate client certificate is untrusted",
+        expected_stderr => qr/SSL error: tlsv1 alert unknown ca/,
+        log_like => [
+            qr{Client certificate verification failed at depth 1: unable to get local issuer certificate},
+            qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test
CAfor PostgreSQL SSL regression test client certs"}, 
+        ]);
+}

 # test server-side CRL directory
 switch_server_cert(
diff --git a/src/test/ssl/t/002_scram.pl b/src/test/ssl/t/002_scram.pl
index 9e4947f4e3c..73d45909e31 100644
--- a/src/test/ssl/t/002_scram.pl
+++ b/src/test/ssl/t/002_scram.pl
@@ -49,6 +49,13 @@ my $SERVERHOSTCIDR = '127.0.0.1/32';
 my $supports_rsapss_certs =
   check_pg_config("#define HAVE_X509_GET_SIGNATURE_INFO 1");

+# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the
+# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
+my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");
+
+# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates.
+$supports_rsapss_certs = 0 if $libressl;
+
 # Allocation of base connection string shared among multiple tests.
 my $common_connstr;


Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> I haven't looked at the test in question yet, but we do skip some SSL tests if
>> running against libressl already so I assume this will be able to follow the
>> same pattern.
>
> Ah, thanks for the tip.  I propose the attached, which disables the
> RSA-PSS test altogether on LibreSSL, and modifies the
> intermediate-cert test to accept the result we're actually getting
> on LibreSSL.  We could revert that one if anyone can figure out
> how to make it better, but I don't wish to put any more time into
> it myself.

LGTM for now.

> +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the
> +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
> +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");

Longer term it would be nice to move this into SSL::Server and have the module
export a function or symbol which returns the underlying library and version,
but that's not for this patch.

> +
> +# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates.

Should we add a link to the relevant thread for future readers?  OpenBSD refer
to MARC for archiving which I believe is stable enough for an inclusion.

    https://marc.info/?l=libressl&m=174664225002441&w=2

--
Daniel Gustafsson




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the
>> +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
>> +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");

> Longer term it would be nice to move this into SSL::Server and have the module
> export a function or symbol which returns the underlying library and version,
> but that's not for this patch.

I was feeling itchy about having two copies of code that looks none
too set-in-stone.  Maybe we should just do that.  Any preferences
on the API?

>> +# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates.

> Should we add a link to the relevant thread for future readers?  OpenBSD refer
> to MARC for archiving which I believe is stable enough for an inclusion.

WFM, I'll make it so.

            regards, tom lane



Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the
>>> +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
>>> +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");
>
>> Longer term it would be nice to move this into SSL::Server and have the module
>> export a function or symbol which returns the underlying library and version,
>> but that's not for this patch.
>
> I was feeling itchy about having two copies of code that looks none
> too set-in-stone.  Maybe we should just do that.  Any preferences
> on the API?

There is already SSL::Server::ssl_library() which returns the underlying
library, but it's not smart enough to differentiate between which flavour of
OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as
it's only returning a hardcoded string as of now.  My plan was to expand that
at some point.

--
Daniel Gustafsson




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I was feeling itchy about having two copies of code that looks none
>> too set-in-stone.  Maybe we should just do that.  Any preferences
>> on the API?

> There is already SSL::Server::ssl_library() which returns the underlying
> library, but it's not smart enough to differentiate between which flavour of
> OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as
> it's only returning a hardcoded string as of now.  My plan was to expand that
> at some point.

Hm.  There is this bit in 001_ssltests.pl:

my $result = $node->safe_psql('postgres', "SHOW ssl_library");
is($result, $ssl_server->ssl_library(), 'ssl_library parameter');

which would break.  Admittedly that's not a very exciting test,
so I wouldn't feel bad about dropping it, but maybe someone else
would.

Also, it seems like ssl_library is mainly intended to distinguish
which "backend" module is in use, so having the one string "OpenSSL"
seems to match up with the one backend "OpenSSL.pm".  What we're
talking about here feels like a finer subdivision.  I'm not quite
sure how it ought to fit into that "backend" structure.

We could deal with the RSA-PSS issue pretty cleanly by inventing a
backend method "supports_rsa_pss", but the other thing we're trying
to hack around doesn't seem like it has such a clean definition.

            regards, tom lane



Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 8 May 2025, at 22:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I was feeling itchy about having two copies of code that looks none
>>> too set-in-stone.  Maybe we should just do that.  Any preferences
>>> on the API?
>
>> There is already SSL::Server::ssl_library() which returns the underlying
>> library, but it's not smart enough to differentiate between which flavour of
>> OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as
>> it's only returning a hardcoded string as of now.  My plan was to expand that
>> at some point.
>
> Hm.  There is this bit in 001_ssltests.pl:
>
> my $result = $node->safe_psql('postgres', "SHOW ssl_library");
> is($result, $ssl_server->ssl_library(), 'ssl_library parameter');
>
> which would break.  Admittedly that's not a very exciting test,
> so I wouldn't feel bad about dropping it, but maybe someone else
> would.

I have no problems dropping that, it's rather uninteresting.

> Also, it seems like ssl_library is mainly intended to distinguish
> which "backend" module is in use, so having the one string "OpenSSL"
> seems to match up with the one backend "OpenSSL.pm".  What we're
> talking about here feels like a finer subdivision.  I'm not quite
> sure how it ought to fit into that "backend" structure.

The backend concept was mostly intended to match up with the underlying library.
It get's a bit murky as OpenSSL tough since it's a library, but also a popular API
compatibility target implemented by multiple libraries (Libressl, Boringssl,
Wolfssl come to mind).

Maybe the ssl_library function should return a hash with backend => 'OpenSSL'
and library => <the actual implementation used>?  Then the test author can
decide which level of compatibility they want?  If we were to end up with a
Libressl libtls implementation in libpq we'd still have to test with Libressl
against the OpenSSL compat layer in libssl since it could act as both.  Not a
bridge we have to cross today but might be worth at least keeping in mind when
designing something to not make it impossible in the future.

--
Daniel Gustafsson




Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> Maybe the ssl_library function should return a hash with backend => 'OpenSSL'
> and library => <the actual implementation used>?

I don't love doing it exactly like that: seems like it adds notational
complexity for little gain.  Also, it forces ssl_library to expend
work detecting things the current caller may not care about.

I was thinking about just transposing the existing test down to the
backend layer, more or less as attached.  Not wedded to these names
of course.

> If we were to end up with a
> Libressl libtls implementation in libpq we'd still have to test with Libressl
> against the OpenSSL compat layer in libssl since it could act as both.  Not a
> bridge we have to cross today but might be worth at least keeping in mind when
> designing something to not make it impossible in the future.

Right.  I think the attached would be amenable to that.

Further down the road, it seems inevitable that we'll need to have a
way of detecting the SSL library version --- for example, assuming
the LibreSSL folk eventually fix their RSA-PSS code, we'll need a
version-dependent test.  That could be another new backend method,
I guess.

            regards, tom lane

diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 8b0de2d8e7e..60b5c3630b0 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -35,9 +35,8 @@ sub switch_server_cert
     $ssl_server->switch_server_cert(@_);
 }

-# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the
-# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
-my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");
+# Determine whether this build uses OpenSSL or LibreSSL.
+my $libressl = $ssl_server->is_libressl;

 #### Some configuration

diff --git a/src/test/ssl/t/SSL/Backend/OpenSSL.pm b/src/test/ssl/t/SSL/Backend/OpenSSL.pm
index e044318531f..4159addb700 100644
--- a/src/test/ssl/t/SSL/Backend/OpenSSL.pm
+++ b/src/test/ssl/t/SSL/Backend/OpenSSL.pm
@@ -26,6 +26,7 @@ package SSL::Backend::OpenSSL;

 use strict;
 use warnings FATAL => 'all';
+use PostgreSQL::Test::Utils;
 use File::Basename;
 use File::Copy;

@@ -205,6 +206,23 @@ sub get_library
     return $self->{_library};
 }

+=pod
+
+=item $backend->library_is_libressl()
+
+Detect whether the SSL library is LibreSSL.
+
+=cut
+
+sub library_is_libressl
+{
+    my ($self) = @_;
+
+    # The HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL.
+    # We may eventually need a less-bogus heuristic.
+    return not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1");
+}
+
 # Internal method for copying a set of files, taking into account wildcards
 sub _copy_files
 {
diff --git a/src/test/ssl/t/SSL/Server.pm b/src/test/ssl/t/SSL/Server.pm
index 14277418419..33975b28e8c 100644
--- a/src/test/ssl/t/SSL/Server.pm
+++ b/src/test/ssl/t/SSL/Server.pm
@@ -244,6 +244,23 @@ sub ssl_library

 =pod

+=item $server->is_libressl()
+
+Detect whether the currently used SSL backend is LibreSSL.
+(Ideally we'd not need this hack, but presently we do.)
+
+=cut
+
+sub is_libressl
+{
+    my $self = shift;
+    my $backend = $self->{backend};
+
+    return $backend->library_is_libressl();
+}
+
+=pod
+
 =item switch_server_cert(params)

 Change the configuration to use the given set of certificate, key, ca and

Re: disabled SSL log_like tests

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> On 9 May 2025, at 02:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Right.  I think the attached would be amenable to that.

> It will be a bit awkward to ask "are you libressl" if we ever add support for
> something not OpenSSL based, but we could always revisit should that happen.

I was imagining that unrelated backends would simply make the method
report constant-false.  But in any case, redesigning this in the light
of any future requirements doesn't seem like a big deal.  What I'm
mainly after for today is just to ensure that these various which-
library-is-it tests are visibly interconnected, so that we don't
forget to check them all when things change.

            regards, tom lane



Re: disabled SSL log_like tests

From
Tom Lane
Date:
Pushed all that stuff.  The SSL tests pass for me now on OpenBSD 7.7,
and hopefully the CI environment will be happy too.

            regards, tom lane



Re: disabled SSL log_like tests

From
Daniel Gustafsson
Date:
> On 9 May 2025, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> Pushed all that stuff.  The SSL tests pass for me now on OpenBSD 7.7,
> and hopefully the CI environment will be happy too.

Thanks!

--
Daniel Gustafsson