Thread: disabled SSL log_like tests
Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to a timing issue. Nothing seems to have been done since ... do we still want to keep these commented out lines around? This "temporary" fix seems to have stretched the definition of that term more than somewhat. (noticed when looking into a different issue). cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to > a timing issue. Nothing seems to have been done since ... do we still > want to keep these commented out lines around? This "temporary" fix > seems to have stretched the definition of that term more than somewhat. Per my coffee cup, "Nothing is as permanent as a temporary fix". However, I wonder whether Andres' work at 8b886a4e3 could be used to improve this, either directly or as inspiration? regards, tom lane
On 2025-04-17 Th 10:56 AM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Back in 2022 in commit 55828a6b6084 we disabled a bunch of tests due to >> a timing issue. Nothing seems to have been done since ... do we still >> want to keep these commented out lines around? This "temporary" fix >> seems to have stretched the definition of that term more than somewhat. > Per my coffee cup, "Nothing is as permanent as a temporary fix". > > However, I wonder whether Andres' work at 8b886a4e3 could be used > to improve this, either directly or as inspiration? > > I don't think so - these tests are about checking log file contents, not a psql problem. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 2025-04-17 Th 10:56 AM, Tom Lane wrote: >> However, I wonder whether Andres' work at 8b886a4e3 could be used >> to improve this, either directly or as inspiration? > I don't think so - these tests are about checking log file contents, not > a psql problem. Well, I was mainly leaning on the "inspiration" part: the idea is to wait for something that must come out after the text we want to look for. After digging around a bit I noticed that 002_connection_limits.pl's connect_fails_wait is doing something that's almost what we want: that test cranks log_min_messages up to DEBUG2 and then waits for the postmaster's report of backend exit before believing that it's done. Awhile later I had the attached patch. Some notes: * The commented-out tests in 001_ssltests.pl contained hard-wired expectations as to certificate serial numbers, which are obsolete now. I just replaced them with "\d+", but if anyone feels like that's not good enough, we could continue to check for exact serial numbers and eat the ensuing maintenance effort. * I was more than slightly surprised to find that there are a bunch of other connect_fails callers that are testing log_like or log_unlike and thereby risking the same type of race condition. Some of those tests are relatively new and perhaps just haven't failed *yet*, but I wonder if we changed something since 2022 that solves this problem in a different way? Anyway, after this change any such caller must set log_min_messages = debug2 or fail. I think I got all the relevant test scripts in the attached. regards, tom lane diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl index 756b4146050..37d96d95a1a 100644 --- a/src/test/authentication/t/001_password.pl +++ b/src/test/authentication/t/001_password.pl @@ -66,6 +66,8 @@ sub test_conn my $node = PostgreSQL::Test::Cluster->new('primary'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; # Test behavior of log_connections GUC diff --git a/src/test/authentication/t/003_peer.pl b/src/test/authentication/t/003_peer.pl index 69ba73bd2b9..2879800eacf 100644 --- a/src/test/authentication/t/003_peer.pl +++ b/src/test/authentication/t/003_peer.pl @@ -72,6 +72,8 @@ sub test_role my $node = PostgreSQL::Test::Cluster->new('node'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; # Set pg_hba.conf with the peer authentication. diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl index 6748b109dec..2dc6bec9b89 100644 --- a/src/test/kerberos/t/001_auth.pl +++ b/src/test/kerberos/t/001_auth.pl @@ -66,6 +66,7 @@ $node->append_conf( listen_addresses = '$hostaddr' krb_server_keyfile = '$krb->{keytab}' log_connections = on +log_min_messages = debug2 lc_messages = 'C' }); $node->start; diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl index 352b0fc1fa7..d1315ed5351 100644 --- a/src/test/ldap/t/001_auth.pl +++ b/src/test/ldap/t/001_auth.pl @@ -48,6 +48,8 @@ note "setting up PostgreSQL instance"; my $node = PostgreSQL::Test::Cluster->new('node'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; $node->safe_psql('postgres', 'CREATE USER test0;'); diff --git a/src/test/modules/oauth_validator/t/001_server.pl b/src/test/modules/oauth_validator/t/001_server.pl index d88994abc24..4f035417a40 100644 --- a/src/test/modules/oauth_validator/t/001_server.pl +++ b/src/test/modules/oauth_validator/t/001_server.pl @@ -48,6 +48,8 @@ $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); $node->append_conf('postgresql.conf', "oauth_validator_libraries = 'validator'\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; $node->safe_psql('postgres', 'CREATE USER test;'); diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index 8759ed2cbba..7877cc8d994 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -2624,7 +2624,12 @@ If this regular expression is set, matches it with the output generated. =item log_unlike => [ qr/prohibited message/ ] -See C<log_check(...)>. +See C<log_check(...)>. CAUTION: use of either option requires that +the server's log_min_messages be at least DEBUG2, and that no other +client backend is launched concurrently. These requirements allow +C<connect_fails> to wait to see the postmaster-log report of backend +exit, without which there is a race condition as to whether we will +see the expected backend log output. =back @@ -2652,7 +2657,14 @@ sub connect_fails like($stderr, $params{expected_stderr}, "$test_name: matches"); } - $self->log_check($test_name, $log_location, %params); + if (defined($params{log_like}) or defined($params{log_unlike})) + { + $self->wait_for_log( + qr/DEBUG: (00000: )?client backend.*exited with exit code \d/, + $log_location); + + $self->log_check($test_name, $log_location, %params); + } } =pod diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl index 086abf3b8b3..8b0de2d8e7e 100644 --- a/src/test/ssl/t/001_ssltests.pl +++ b/src/test/ssl/t/001_ssltests.pl @@ -60,6 +60,8 @@ my $common_connstr; note "setting up data directory"; my $node = PostgreSQL::Test::Cluster->new('primary'); $node->init; +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); # PGHOST is enforced here to set up the node, subsequent connections # will use a dedicated connection string. @@ -807,10 +809,8 @@ $node->connect_fails( expected_stderr => qr/certificate authentication failed for user "anotheruser"/, # certificate authentication should be logged even on failure - # temporarily(?) skip this check due to timing issue - # log_like => - # [qr/connection authenticated: identity="CN=ssltestuser" method=cert/], -); + log_like => + [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],); # revoked client cert $node->connect_fails( @@ -818,11 +818,10 @@ $node->connect_fails( . sslkey('client-revoked.key'), "certificate authorization fails with revoked client cert", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ], + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ], # revoked certificates should not authenticate the user log_unlike => [qr/connection authenticated:/],); @@ -872,24 +871,20 @@ $node->connect_fails( $common_connstr . " " . "sslmode=require sslcert=ssl/client.crt", "intermediate client certificate is missing", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656576, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ]); $node->connect_fails( "$common_connstr sslmode=require sslcert=ssl/client-long.crt " . sslkey('client-long.key'), "logged client certificate Subjects are truncated if they're too long", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number 2315418733629425152, issuer "/CN=TestCA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number \d+, issuer "/CN=Test CA for PostgreSQLSSL regression test client certs"}, + ]); # Use an invalid cafile here so that the next test won't be able to verify the # client CA. @@ -904,12 +899,10 @@ $node->connect_fails( "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt", "intermediate client certificate is untrusted", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number 2315134995201656577, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, + ]); # test server-side CRL directory switch_server_cert( @@ -923,12 +916,10 @@ $node->connect_fails( . sslkey('client-revoked.key'), "certificate authorization fails with revoked client cert with server-side CRL directory", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ]); # revoked client cert, non-ASCII subject $node->connect_fails( @@ -936,11 +927,9 @@ $node->connect_fails( . sslkey('client-revoked-utf8.key'), "certificate authorization fails with revoked UTF-8 client cert with server-side CRL directory", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number 2315420958437414144,issuer "/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number \d+, issuer "/CN=TestCA for PostgreSQL SSL regression test client certs"}, + ]); done_testing();
On Fri, Apr 18, 2025 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > * The commented-out tests in 001_ssltests.pl contained hard-wired > expectations as to certificate serial numbers, which are obsolete now. > I just replaced them with "\d+", but if anyone feels like that's not > good enough, we could continue to check for exact serial numbers and > eat the ensuing maintenance effort. No, I think \d+ should be fine. (I haven't stared closely at the racing code yet, but I like the concept of the patch.) Thanks! --Jacob
I wrote: > * I was more than slightly surprised to find that there are a bunch of > other connect_fails callers that are testing log_like or log_unlike > and thereby risking the same type of race condition. Some of those > tests are relatively new and perhaps just haven't failed *yet*, > but I wonder if we changed something since 2022 that solves this > problem in a different way? Nope, apparently not. The failure depends on the kernel's scheduler, so unsurprisingly it's quite platform-dependent. But I can reproduce it reliably on mamba's host (NetBSD 10/macppc) if I enable the 001_ssltests.pl log_like tests without applying the connect_fails changes. The fact that mamba hasn't been failing on the other affected tests is a bit puzzling. mamba isn't running kerberos or ldap or oauth_validator, so the lack of failures there is expected. authentication/t/001_password.pl appears accidentally not vulnerable: it's not using log_like in any connect_fails tests. (It is using log_unlike, so those tests could be giving silent false negatives.) But authentication/t/003_peer.pl has 8 test cases that look vulnerable. I guess there must be some extra dollop of timing weirdness in the ssl tests that's not there in 003_peer.pl. Unfortunately ... it sometimes fails even with the connect_fails changes, for example # Failed test 'intermediate client certificate is missing: log matches' # at /home/tgl/pgsql/src/test/ssl/../../../src/test/perl/PostgreSQL/Test/Cluster.pm line 2666. # '2025-04-18 17:59:19.358 EDT [1460] DEBUG: assigned pm child slot 2 for backend # 2025-04-18 17:59:19.359 EDT [1460] DEBUG: forked new client backend, pid=599 socket=8 # 2025-04-18 17:59:19.369 EDT [599] [unknown] LOG: connection received: host=localhost port=63709 # 2025-04-18 17:59:19.436 EDT [1460] DEBUG: releasing pm child slot 1 # 2025-04-18 17:59:19.436 EDT [1460] DEBUG: client backend (PID 25401) exited with exit code 0 # ' # doesn't match '(?^:Client certificate verification failed at depth 0: unable to get local issuer certificate)' What I think happened here is that a previous backend hadn't exited yet when we start the test, and when its report does come out, connect_fails prematurely stops waiting. (In the above, evidently the child process we want to wait for is 599, but we're fooled by a delayed report for 25401.) So my v1 patch needs work. Maybe we can make the test compare the PIDs in the "forked new client backend" and "client backend exited" log messages. Stay tuned... regards, tom lane
I wrote: > What I think happened here is that a previous backend hadn't exited > yet when we start the test, and when its report does come out, > connect_fails prematurely stops waiting. (In the above, evidently > the child process we want to wait for is 599, but we're fooled by > a delayed report for 25401.) So my v1 patch needs work. > Maybe we can make the test compare the PIDs in the "forked new client > backend" and "client backend exited" log messages. Stay tuned... Okay, this version seems more reliable. regards, tom lane diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl index 756b4146050..37d96d95a1a 100644 --- a/src/test/authentication/t/001_password.pl +++ b/src/test/authentication/t/001_password.pl @@ -66,6 +66,8 @@ sub test_conn my $node = PostgreSQL::Test::Cluster->new('primary'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; # Test behavior of log_connections GUC diff --git a/src/test/authentication/t/003_peer.pl b/src/test/authentication/t/003_peer.pl index 69ba73bd2b9..2879800eacf 100644 --- a/src/test/authentication/t/003_peer.pl +++ b/src/test/authentication/t/003_peer.pl @@ -72,6 +72,8 @@ sub test_role my $node = PostgreSQL::Test::Cluster->new('node'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; # Set pg_hba.conf with the peer authentication. diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl index 6748b109dec..2dc6bec9b89 100644 --- a/src/test/kerberos/t/001_auth.pl +++ b/src/test/kerberos/t/001_auth.pl @@ -66,6 +66,7 @@ $node->append_conf( listen_addresses = '$hostaddr' krb_server_keyfile = '$krb->{keytab}' log_connections = on +log_min_messages = debug2 lc_messages = 'C' }); $node->start; diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl index 352b0fc1fa7..d1315ed5351 100644 --- a/src/test/ldap/t/001_auth.pl +++ b/src/test/ldap/t/001_auth.pl @@ -48,6 +48,8 @@ note "setting up PostgreSQL instance"; my $node = PostgreSQL::Test::Cluster->new('node'); $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; $node->safe_psql('postgres', 'CREATE USER test0;'); diff --git a/src/test/modules/oauth_validator/t/001_server.pl b/src/test/modules/oauth_validator/t/001_server.pl index d88994abc24..4f035417a40 100644 --- a/src/test/modules/oauth_validator/t/001_server.pl +++ b/src/test/modules/oauth_validator/t/001_server.pl @@ -48,6 +48,8 @@ $node->init; $node->append_conf('postgresql.conf', "log_connections = on\n"); $node->append_conf('postgresql.conf', "oauth_validator_libraries = 'validator'\n"); +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); $node->start; $node->safe_psql('postgres', 'CREATE USER test;'); diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index 8759ed2cbba..4b858900139 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -2618,13 +2618,19 @@ to fail. =item expected_stderr => B<value> -If this regular expression is set, matches it with the output generated. +If this regular expression is set, matches it to the output generated +by B<psql>. =item log_like => [ qr/required message/ ] =item log_unlike => [ qr/prohibited message/ ] -See C<log_check(...)>. +See C<log_check(...)>. CAUTION: use of either option requires that +the server's log_min_messages be at least DEBUG2, and that no other +client backend is launched concurrently. These requirements allow +C<connect_fails> to wait to see the postmaster-log report of backend +exit, without which there is a race condition as to whether we will +see the expected backend log output. =back @@ -2652,7 +2658,14 @@ sub connect_fails like($stderr, $params{expected_stderr}, "$test_name: matches"); } - $self->log_check($test_name, $log_location, %params); + if (defined($params{log_like}) or defined($params{log_unlike})) + { + $self->wait_for_log( + qr/DEBUG: (?:00000: )?forked new client backend, pid=(\d+) socket.*DEBUG: (?:00000: )?client backend \(PID\1\) exited with exit code \d/s, + $log_location); + + $self->log_check($test_name, $log_location, %params); + } } =pod diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl index 086abf3b8b3..8b0de2d8e7e 100644 --- a/src/test/ssl/t/001_ssltests.pl +++ b/src/test/ssl/t/001_ssltests.pl @@ -60,6 +60,8 @@ my $common_connstr; note "setting up data directory"; my $node = PostgreSQL::Test::Cluster->new('primary'); $node->init; +# Needed to allow connect_fails to inspect postmaster log: +$node->append_conf('postgresql.conf', "log_min_messages = debug2"); # PGHOST is enforced here to set up the node, subsequent connections # will use a dedicated connection string. @@ -807,10 +809,8 @@ $node->connect_fails( expected_stderr => qr/certificate authentication failed for user "anotheruser"/, # certificate authentication should be logged even on failure - # temporarily(?) skip this check due to timing issue - # log_like => - # [qr/connection authenticated: identity="CN=ssltestuser" method=cert/], -); + log_like => + [qr/connection authenticated: identity="CN=ssltestuser" method=cert/],); # revoked client cert $node->connect_fails( @@ -818,11 +818,10 @@ $node->connect_fails( . sslkey('client-revoked.key'), "certificate authorization fails with revoked client cert", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ], + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ], # revoked certificates should not authenticate the user log_unlike => [qr/connection authenticated:/],); @@ -872,24 +871,20 @@ $node->connect_fails( $common_connstr . " " . "sslmode=require sslcert=ssl/client.crt", "intermediate client certificate is missing", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656576, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ]); $node->connect_fails( "$common_connstr sslmode=require sslcert=ssl/client-long.crt " . sslkey('client-long.key'), "logged client certificate Subjects are truncated if they're too long", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number 2315418733629425152, issuer "/CN=TestCA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "\.\.\./CN=ssl-123456789012345678901234567890123456789012345678901234567890",serial number \d+, issuer "/CN=Test CA for PostgreSQLSSL regression test client certs"}, + ]); # Use an invalid cafile here so that the next test won't be able to verify the # client CA. @@ -904,12 +899,10 @@ $node->connect_fails( "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt", "intermediate client certificate is untrusted", expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, - # qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number 2315134995201656577, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, + ]); # test server-side CRL directory switch_server_cert( @@ -923,12 +916,10 @@ $node->connect_fails( . sslkey('client-revoked.key'), "certificate authorization fails with revoked client cert with server-side CRL directory", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number 2315134995201656577, issuer"/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CA forPostgreSQL SSL regression test client certs"}, + ]); # revoked client cert, non-ASCII subject $node->connect_fails( @@ -936,11 +927,9 @@ $node->connect_fails( . sslkey('client-revoked-utf8.key'), "certificate authorization fails with revoked UTF-8 client cert with server-side CRL directory", expected_stderr => qr|SSL error: ssl[a-z0-9/]* alert certificate revoked|, - # temporarily(?) skip this check due to timing issue - # log_like => [ - # qr{Client certificate verification failed at depth 0: certificate revoked}, - # qr{Failed certificate data \(unverified\): subject "/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number 2315420958437414144,issuer "/CN=Test CA for PostgreSQL SSL regression test client certs"}, - # ] -); + log_like => [ + qr{Client certificate verification failed at depth 0: certificate revoked}, + qr{Failed certificate data \(unverified\): subject "/CN=\\xce\\x9f\\xce\\xb4\\xcf\\x85\\xcf\\x83\\xcf\\x83\\xce\\xad\\xce\\xb1\\xcf\\x82",serial number \d+, issuer "/CN=TestCA for PostgreSQL SSL regression test client certs"}, + ]); done_testing();
On 2025-04-18 Fr 7:26 PM, Tom Lane wrote: > I wrote: >> What I think happened here is that a previous backend hadn't exited >> yet when we start the test, and when its report does come out, >> connect_fails prematurely stops waiting. (In the above, evidently >> the child process we want to wait for is 599, but we're fooled by >> a delayed report for 25401.) So my v1 patch needs work. >> Maybe we can make the test compare the PIDs in the "forked new client >> backend" and "client backend exited" log messages. Stay tuned... > Okay, this version seems more reliable. > +See C<log_check(...)>. CAUTION: use of either option requires that +the server's log_min_messages be at least DEBUG2, and that no other +client backend is launched concurrently. These requirements allow +C<connect_fails> to wait to see the postmaster-log report of backend +exit, without which there is a race condition as to whether we will +see the expected backend log output. That seems a little fragile. I can imagine test authors easily forgetting this. Is it worth sanity checking to make sure log_min_messages is appropriately set? cheers andrew > -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 2025-04-18 Fr 7:26 PM, Tom Lane wrote: > +See C<log_check(...)>. CAUTION: use of either option requires that > +the server's log_min_messages be at least DEBUG2, and that no other > +client backend is launched concurrently. These requirements allow > +C<connect_fails> to wait to see the postmaster-log report of backend > +exit, without which there is a race condition as to whether we will > +see the expected backend log output. > That seems a little fragile. I can imagine test authors easily > forgetting this. Is it worth sanity checking to make sure > log_min_messages is appropriately set? Setting log_min_messages is not so easily forgotten, because connect_fails will just hang until timeout if you didn't. I'm more worried about the "no other backend" requirement. I think v2 is reasonably proof against that, but whether it's sufficiently bulletproof to withstand the buildfarm environment remains to be seen. I wish there were a better way to determine the backend PID for a failed connection... regards, tom lane
I wrote: > The fact that mamba hasn't been failing on the other affected > tests is a bit puzzling. mamba isn't running kerberos or ldap > or oauth_validator, so the lack of failures there is expected. > authentication/t/001_password.pl appears accidentally not vulnerable: > it's not using log_like in any connect_fails tests. (It is using > log_unlike, so those tests could be giving silent false negatives.) > But authentication/t/003_peer.pl has 8 test cases that look > vulnerable. I guess there must be some extra dollop of timing > weirdness in the ssl tests that's not there in 003_peer.pl. After pushing this patch, the probable explanation for the "extra timing weirdness" finally penetrated my brain: the error reports we are looking for are cases that are detected by OpenSSL. So it's plausible that OpenSSL has told the connecting client to go away before it returns the error indication to the backend's calling code, which would be what would log the message. That is what provides the window for a race condition. You still need a bunch of assumptions about the kernel's subsequent scheduling of psql and the TAP script versus the backend, but the whole thing is certainly dependent on psql getting the word sooner than the backend. (Not all of the tests disabled by 55828a6b6 meet this description, but a bunch of them do, and I believe that I just zapped every log_like condition in the script rather than trying to be selective about which ones were known to have failed.) It seems at least plausible that the Kerberos tests could have a similar problem. I'm less sure about LDAP or OAuth. But authentication/t/003_peer.pl would not, because elog.c emits errors to the log before sending them to the client. So that explains the lack of buildfarm reports from mamba, and it's unclear that we have any other animals with the right timing behavior to see this. regards, tom lane
If you run the not-yet-enabled-by-default OpenBSD CI task on master, ssl/001_ssltests fails in "intermediate client certificate is untrusted", recently uncommented by commit e0f373ee. I think it might be telling us that LibreSSL's x509_store_ctx_get_current_cert() is giving us the client certificate (ie chain depth 0) instead of the intermediate certificate, even though X509_STORE_CTX_get_error_depth() returned 1 as expected. I don't know why it would do that, given the documentation: X509_STORE_CTX_get_current_cert() returns the certificate in ctx which caused the error or NULL if no certificate is relevant. The explanation is probably in here somewhere, but I don't understand these things: https://github.com/openbsd/src/blob/master/lib/libcrypto/x509/x509_vfy.c https://github.com/openssl/openssl/blob/master/crypto/x509/x509_vfy.c [17:55:28.888] # Failed test 'intermediate client certificate is untrusted: log matches' [17:55:28.888] # at /home/postgres/postgres/src/test/perl/PostgreSQL/Test/Cluster.pm line 2667. [17:55:28.888] # '2025-05-05 17:55:28.353 UTC [10009][postmaster] DEBUG: assigned pm child slot 1 for backend [17:55:28.888] # 2025-05-05 17:55:28.354 UTC [10009][postmaster] DEBUG: forked new client backend, pid=27624 socket=8 [17:55:28.888] # 2025-05-05 17:55:28.355 UTC [27624][not initialized] [[unknown]][:0] LOG: connection received: host=localhost port=11357 [17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized] [[unknown]][:0] LOG: could not accept SSL connection: certificate verify failed [17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized] [[unknown]][:0] DETAIL: Client certificate verification failed at depth 1: unable to get local issuer certificate. [17:55:28.888] # Failed certificate data (unverified): subject "/CN=ssltestuser", serial number 2315702411956921344, issuer "/CN=Test CA for PostgreSQL SSL regression test client certs". [17:55:28.888] # 2025-05-05 17:55:28.374 UTC [27624][not initialized] [[unknown]][:0] DEBUG: SSL connection from DN:"(anonymous)" CN:"(anonymous)" [17:55:28.888] # 2025-05-05 17:55:28.377 UTC [10009][postmaster] DEBUG: releasing pm child slot 1 [17:55:28.888] # 2025-05-05 17:55:28.377 UTC [10009][postmaster] DEBUG: client backend (PID 27624) exited with exit code 0 [17:55:28.888] # ' [17:55:28.888] # doesn't match '(?^:Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs", serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite")' [17:55:28.888] # Looks like you failed 1 test of 240. https://cirrus-ci.com/task/4708964002168832?logs=test_world#L345 https://api.cirrus-ci.com/v1/artifact/task/4708964002168832/testrun/build/testrun/ssl/001_ssltests/log/regress_log_001_ssltests https://api.cirrus-ci.com/v1/artifact/task/4708964002168832/testrun/build/testrun/ssl/001_ssltests/log/001_ssltests_primary.log
Thomas Munro <thomas.munro@gmail.com> writes: > If you run the not-yet-enabled-by-default OpenBSD CI task on master, > ssl/001_ssltests fails in "intermediate client certificate is > untrusted", recently uncommented by commit e0f373ee. Yeah, I see that too. But I also see three failures in 002_scram.pl, which presumably were there before e0f373ee. (Tested on OpenBSD 7.6 and 7.7.) The buildfarm's OpenBSD animals haven't caught this because they don't run this test suite :-(. Yes they build with --with-openssl, but one of them lacks --enable-tap-tests and the other two aren't filling PG_TEST_EXTRA. The SCRAM failures are a bit discouraging ... [18:16:33.259](0.565s) not ok 26 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss' [18:16:33.261](0.002s) [18:16:33.261](0.000s) # Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss'' # at t/002_scram.pl line 161. [18:16:33.262](0.001s) # got: '2' # expected: '0' [18:16:33.264](0.002s) not ok 27 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss': no stderr [18:16:33.265](0.001s) [18:16:33.265](0.000s) # Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss':no stderr' # at t/002_scram.pl line 161. [18:16:33.266](0.001s) # got: 'psql: error: connection to server at "127.0.0.1", port 10442 failed: SSL error: sslv3alert handshake failure' # expected: '' [18:16:33.268](0.002s) not ok 28 - SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss': log matches [18:16:33.269](0.001s) [18:16:33.269](0.000s) # Failed test 'SCRAM with SSL and channel_binding=require, server certificate uses 'rsassaPss':log matches' # at /home/tgl/pgsql/src/test/ssl/../../../src/test/perl/PostgreSQL/Test/Cluster.pm line 2607. [18:16:33.270](0.001s) # '2025-05-05 18:16:33.222 EDT [71478] [unknown] LOG: connection received: host=localhostport=42632 # 2025-05-05 18:16:33.244 EDT [71478] [unknown] LOG: could not accept SSL connection: missing rsa certificate # ' # doesn't match '(?^:connection authenticated: identity="ssltestuser" method=scram-sha-256)' regards, tom lane
I wrote: > Yeah, I see that too. But I also see three failures in 002_scram.pl, > which presumably were there before e0f373ee. (Tested on OpenBSD 7.6 > and 7.7.) The buildfarm's OpenBSD animals haven't caught this > because they don't run this test suite :-(. I dug into this with gdb, and it seems like it's an omission in LibreSSL's RSA-PSS support. We're requesting a signature algorithm with code 2054, which in their source is SIGALG_RSA_PSS_RSAE_SHA512, and that leads them to this sigalgs[] table entry in lib/libssl/ssl_sigalgs.c: { .value = SIGALG_RSA_PSS_RSAE_SHA512, .key_type = EVP_PKEY_RSA, .md = EVP_sha512, .security_level = 5, .flags = SIGALG_FLAG_RSA_PSS, }, The problem is that the private key has key_type EVP_PKEY_RSA_PSS which is not equal to EVP_PKEY_RSA, so ssl_sigalg_pkey_ok() in the same file rejects this entry as not compatible. In fact, there are *no* entries in sigalgs[] that can match a PSS private key according to ssl_sigalg_pkey_ok(). So while perhaps SIGALG_RSA_PSS_RSAE_SHA512 is the wrong thing for us to request, I do not see a right thing. Looking around a little more, there are places like SSL_get_signature_type_nid() that do things like this: *nid = sigalg->key_type; if (sigalg->key_type == EVP_PKEY_RSA && (sigalg->flags & SIGALG_FLAG_RSA_PSS)) *nid = EVP_PKEY_RSA_PSS; So it seems like this might be a simple oversight in ssl_sigalg_pkey_ok(), which doesn't make any such correction: if (sigalg->key_type != EVP_PKEY_id(pkey)) return 0; Anyone know anything about where to submit LibreSSL bugs? regards, tom lane
On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > So it seems like this might be a simple oversight in > ssl_sigalg_pkey_ok(), which doesn't make any such correction: > > if (sigalg->key_type != EVP_PKEY_id(pkey)) > return 0; Nice detective work. > Anyone know anything about where to submit LibreSSL bugs? I think it's done with sendbug on an OpenBSD box, or perhaps you can just write a normal email to the bugs@openbsd.org or libressl@openbsd.org list, based on: https://www.openbsd.org/mail.html
Thomas Munro <thomas.munro@gmail.com> writes: > On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Anyone know anything about where to submit LibreSSL bugs? > I think it's done with sendbug on an OpenBSD box, or perhaps you can > just write a normal email to the bugs@openbsd.org or > libressl@openbsd.org list, based on: > https://www.openbsd.org/mail.html Thanks, I'll look into reporting it tomorrow. In the meantime, I couldn't help noticing that the backtraces went through lib/libssl/tls13_legacy.c, which doesn't give a warm feeling about how supported they think our usage is (and perhaps also explains why they didn't detect this bug themselves). This is evidently because we set up the SSL context with SSLv23_method(), per this comment in be_tls_init(): * We use SSLv23_method() because it can negotiate use of the highest * mutually supported protocol version, while alternatives like * TLSv1_2_method() permit only one specific version. Note that we don't * actually allow SSL v2 or v3, only TLS protocols (see below). This choice seems to be more than 20 years old, though the above comment defending it dates only to 2014. I wonder if it's time to revisit that idea. regards, tom lane
On Wed, May 7, 2025 at 4:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thanks, I'll look into reporting it tomorrow. In the meantime, > I couldn't help noticing that the backtraces went through > lib/libssl/tls13_legacy.c, which doesn't give a warm feeling > about how supported they think our usage is (and perhaps also > explains why they didn't detect this bug themselves). This is > evidently because we set up the SSL context with SSLv23_method(), > per this comment in be_tls_init(): Oh, interesting. I also wondered if the problem I reported might be related to the separate legacy code paths in x509_vfy.c.
> On 7 May 2025, at 06:34, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thomas Munro <thomas.munro@gmail.com> writes: >> On Wed, May 7, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Anyone know anything about where to submit LibreSSL bugs? > >> I think it's done with sendbug on an OpenBSD box, or perhaps you can >> just write a normal email to the bugs@openbsd.org or >> libressl@openbsd.org list, based on: >> https://www.openbsd.org/mail.html Bugs are frequently reported, and dealt with, on the libressl mailing list. > Thanks, I'll look into reporting it tomorrow. In the meantime, > I couldn't help noticing that the backtraces went through > lib/libssl/tls13_legacy.c, which doesn't give a warm feeling > about how supported they think our usage is (and perhaps also > explains why they didn't detect this bug themselves). This is > evidently because we set up the SSL context with SSLv23_method(), > per this comment in be_tls_init(): > > * We use SSLv23_method() because it can negotiate use of the highest > * mutually supported protocol version, while alternatives like > * TLSv1_2_method() permit only one specific version. Note that we don't > * actually allow SSL v2 or v3, only TLS protocols (see below). > > This choice seems to be more than 20 years old, though the above > comment defending it dates only to 2014. I wonder if it's time to > revisit that idea. The use of SSLv23_method() comes from us supporting 1.0.2, the replacement TLS_method() was introduced in 1.1.0 and SSLv23_method() was turned into an alias of TLS_method() in OpenSSL commit 32ec41539b5. Since we no longer support 1.0.2 we can apply something like the (lightly tested) attached which should be a no-op as we already use TLS_method() but via an alias. There's likely more OpenSSL code we can modernize, hopefully we can make some progress on that during the v19 cycle. -- Daniel Gustafsson
Attachment
Daniel Gustafsson <daniel@yesql.se> writes: >> On 7 May 2025, at 06:34, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I couldn't help noticing that the backtraces went through >> lib/libssl/tls13_legacy.c, which doesn't give a warm feeling >> about how supported they think our usage is (and perhaps also >> explains why they didn't detect this bug themselves). > Since we no longer support 1.0.2 we can apply something like the (lightly > tested) attached which should be a no-op as we already use TLS_method() but via > an alias. Yeah, I saw that SSLv23_method() was merely an alias for TLS_method() in LibreSSL as well. That means unfortunately that your proposal is just cosmetic and doesn't get us out of using code that they're calling "legacy". I wonder what it would take to get to the "modern" code paths. regards, tom lane
I filed a report with the LibreSSL folks, and the answer was basically "yeah, we know RSA-PSS doesn't work; there are multiple things that need fixing" [1]. So it seems like we'd better find a way to skip that test case when using LibreSSL. Not sure what the most appropriate check would look like. regards, tom lane [1] https://marc.info/?l=libressl&m=174664225002441&w=2
> On 7 May 2025, at 21:29, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I filed a report with the LibreSSL folks, and the answer was basically > "yeah, we know RSA-PSS doesn't work; there are multiple things that > need fixing" [1]. So it seems like we'd better find a way to skip > that test case when using LibreSSL. Not sure what the most > appropriate check would look like. I haven't looked at the test in question yet, but we do skip some SSL tests if running against libressl already so I assume this will be able to follow the same pattern. -- Daniel Gustafsson
Thomas Munro <thomas.munro@gmail.com> writes: > Oh, interesting. I also wondered if the problem I reported might be > related to the separate legacy code paths in x509_vfy.c. I poked at this a little more, and apparently the code to look at is actually in the sibling file x509_verify.c. Here's the backtrace (in OpenBSD 7.7) from where we construct the incorrect report: #0 verify_cb (ok=0, ctx=0x17d83e77c00) at be-secure-openssl.c:1213 #1 0x0000017ce17483dc in x509_verify (ctx=0x17d1886f700, leaf=0x17d188882a0, name=0x0) at /usr/src/lib/libcrypto/x509/x509_verify.c:1271 #2 0x0000017ce16ed683 in _lcry_X509_verify_cert (ctx=0x17d83e77c00) at /usr/src/lib/libcrypto/x509/x509_vfy.c:645 #3 0x0000017d44ebc4e3 in ssl_verify_cert_chain (s=0x17d83e96500, certs=0x17d83e5fce0) at /usr/src/lib/libssl/ssl_cert.c:437 #4 0x0000017d44eb3d13 in tls13_client_certificate_recv (ctx=0x17d18881f00, cbs=<optimized out>) at /usr/src/lib/libssl/tls13_server.c:918 #5 0x0000017d44ee8d66 in tls13_handshake_recv_action (ctx=0x17d18881f00, action=0x17d44ef6dd0 <state_machine+352>) at /usr/src/lib/libssl/tls13_handshake.c:549 #6 tls13_handshake_perform (ctx=0x17d18881f00) at /usr/src/lib/libssl/tls13_handshake.c:412 #7 0x0000017d44eb18c7 in tls13_legacy_accept (ssl=0x17d83e96500) at /usr/src/lib/libssl/tls13_legacy.c:434 #8 0x0000017a8bbd6f3a in be_tls_open_server (port=0x17d6d91f9b0) at be-secure-openssl.c:494 #9 0x0000017a8bbc41c8 in secure_open_server (port=0x17d6d91f9b0) at be-secure.c:136 #10 0x0000017a8be9d79e in ProcessStartupPacket (port=0x17d6d91f9b0, ssl_done=false, gss_done=false) at backend_startup.c:614 The code in x509_verify() is about as convoluted as can be: it seems to be dealing with two different verify-callback APIs (and naturally, the one we are using is the legacy one). There is some stuff there that saves and restores an error condition, but fails to save/restore current_cert, so the first thing I thought was that that was the problem --- but it's not, the current_cert doesn't change anyway between the save and the restore. So I failed to identify exactly where the problem is, but I wonder if it'd behave better if we switched to their new-style callback API. regards, tom lane
Daniel Gustafsson <daniel@yesql.se> writes: > I haven't looked at the test in question yet, but we do skip some SSL tests if > running against libressl already so I assume this will be able to follow the > same pattern. Ah, thanks for the tip. I propose the attached, which disables the RSA-PSS test altogether on LibreSSL, and modifies the intermediate-cert test to accept the result we're actually getting on LibreSSL. We could revert that one if anyone can figure out how to make it better, but I don't wish to put any more time into it myself. regards, tom lane diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl index 8b0de2d8e7e..ccf8ef9bf2e 100644 --- a/src/test/ssl/t/001_ssltests.pl +++ b/src/test/ssl/t/001_ssltests.pl @@ -895,14 +895,30 @@ switch_server_cert( # intermediate CA is provided but doesn't have a trusted root (checks error # logging for cert chain depths > 0) -$node->connect_fails( - "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt", - "intermediate client certificate is untrusted", - expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, - log_like => [ - qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, - qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, - ]); +# As of 5/2025, LibreSSL reports a different cert as being at fault; +# it's probably wrong, but seems to be their bug not ours +if (!$libressl) +{ + $node->connect_fails( + "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt", + "intermediate client certificate is untrusted", + expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, + log_like => [ + qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=Test CA for PostgreSQL SSL regression test client certs",serial number \d+, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite"}, + ]); +} +else +{ + $node->connect_fails( + "$common_connstr sslmode=require sslcert=ssl/client+client_ca.crt", + "intermediate client certificate is untrusted", + expected_stderr => qr/SSL error: tlsv1 alert unknown ca/, + log_like => [ + qr{Client certificate verification failed at depth 1: unable to get local issuer certificate}, + qr{Failed certificate data \(unverified\): subject "/CN=ssltestuser", serial number \d+, issuer "/CN=Test CAfor PostgreSQL SSL regression test client certs"}, + ]); +} # test server-side CRL directory switch_server_cert( diff --git a/src/test/ssl/t/002_scram.pl b/src/test/ssl/t/002_scram.pl index 9e4947f4e3c..73d45909e31 100644 --- a/src/test/ssl/t/002_scram.pl +++ b/src/test/ssl/t/002_scram.pl @@ -49,6 +49,13 @@ my $SERVERHOSTCIDR = '127.0.0.1/32'; my $supports_rsapss_certs = check_pg_config("#define HAVE_X509_GET_SIGNATURE_INFO 1"); +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); + +# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates. +$supports_rsapss_certs = 0 if $libressl; + # Allocation of base connection string shared among multiple tests. my $common_connstr;
> On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> I haven't looked at the test in question yet, but we do skip some SSL tests if >> running against libressl already so I assume this will be able to follow the >> same pattern. > > Ah, thanks for the tip. I propose the attached, which disables the > RSA-PSS test altogether on LibreSSL, and modifies the > intermediate-cert test to accept the result we're actually getting > on LibreSSL. We could revert that one if anyone can figure out > how to make it better, but I don't wish to put any more time into > it myself. LGTM for now. > +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the > +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. > +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); Longer term it would be nice to move this into SSL::Server and have the module export a function or symbol which returns the underlying library and version, but that's not for this patch. > + > +# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates. Should we add a link to the relevant thread for future readers? OpenBSD refer to MARC for archiving which I believe is stable enough for an inclusion. https://marc.info/?l=libressl&m=174664225002441&w=2 -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the >> +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. >> +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); > Longer term it would be nice to move this into SSL::Server and have the module > export a function or symbol which returns the underlying library and version, > but that's not for this patch. I was feeling itchy about having two copies of code that looks none too set-in-stone. Maybe we should just do that. Any preferences on the API? >> +# As of 5/2025, LibreSSL doesn't actually work for RSA-PSS certificates. > Should we add a link to the relevant thread for future readers? OpenBSD refer > to MARC for archiving which I believe is stable enough for an inclusion. WFM, I'll make it so. regards, tom lane
> On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> On 7 May 2025, at 23:54, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> +# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the >>> +# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. >>> +my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); > >> Longer term it would be nice to move this into SSL::Server and have the module >> export a function or symbol which returns the underlying library and version, >> but that's not for this patch. > > I was feeling itchy about having two copies of code that looks none > too set-in-stone. Maybe we should just do that. Any preferences > on the API? There is already SSL::Server::ssl_library() which returns the underlying library, but it's not smart enough to differentiate between which flavour of OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as it's only returning a hardcoded string as of now. My plan was to expand that at some point. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I was feeling itchy about having two copies of code that looks none >> too set-in-stone. Maybe we should just do that. Any preferences >> on the API? > There is already SSL::Server::ssl_library() which returns the underlying > library, but it's not smart enough to differentiate between which flavour of > OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as > it's only returning a hardcoded string as of now. My plan was to expand that > at some point. Hm. There is this bit in 001_ssltests.pl: my $result = $node->safe_psql('postgres', "SHOW ssl_library"); is($result, $ssl_server->ssl_library(), 'ssl_library parameter'); which would break. Admittedly that's not a very exciting test, so I wouldn't feel bad about dropping it, but maybe someone else would. Also, it seems like ssl_library is mainly intended to distinguish which "backend" module is in use, so having the one string "OpenSSL" seems to match up with the one backend "OpenSSL.pm". What we're talking about here feels like a finer subdivision. I'm not quite sure how it ought to fit into that "backend" structure. We could deal with the RSA-PSS issue pretty cleanly by inventing a backend method "supports_rsa_pss", but the other thing we're trying to hack around doesn't seem like it has such a clean definition. regards, tom lane
> On 8 May 2025, at 22:24, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> On 8 May 2025, at 15:49, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I was feeling itchy about having two copies of code that looks none >>> too set-in-stone. Maybe we should just do that. Any preferences >>> on the API? > >> There is already SSL::Server::ssl_library() which returns the underlying >> library, but it's not smart enough to differentiate between which flavour of >> OpenSSL compatible library is being used (OpenSSL, Libressl, BoringSSL etc) as >> it's only returning a hardcoded string as of now. My plan was to expand that >> at some point. > > Hm. There is this bit in 001_ssltests.pl: > > my $result = $node->safe_psql('postgres', "SHOW ssl_library"); > is($result, $ssl_server->ssl_library(), 'ssl_library parameter'); > > which would break. Admittedly that's not a very exciting test, > so I wouldn't feel bad about dropping it, but maybe someone else > would. I have no problems dropping that, it's rather uninteresting. > Also, it seems like ssl_library is mainly intended to distinguish > which "backend" module is in use, so having the one string "OpenSSL" > seems to match up with the one backend "OpenSSL.pm". What we're > talking about here feels like a finer subdivision. I'm not quite > sure how it ought to fit into that "backend" structure. The backend concept was mostly intended to match up with the underlying library. It get's a bit murky as OpenSSL tough since it's a library, but also a popular API compatibility target implemented by multiple libraries (Libressl, Boringssl, Wolfssl come to mind). Maybe the ssl_library function should return a hash with backend => 'OpenSSL' and library => <the actual implementation used>? Then the test author can decide which level of compatibility they want? If we were to end up with a Libressl libtls implementation in libpq we'd still have to test with Libressl against the OpenSSL compat layer in libssl since it could act as both. Not a bridge we have to cross today but might be worth at least keeping in mind when designing something to not make it impossible in the future. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > Maybe the ssl_library function should return a hash with backend => 'OpenSSL' > and library => <the actual implementation used>? I don't love doing it exactly like that: seems like it adds notational complexity for little gain. Also, it forces ssl_library to expend work detecting things the current caller may not care about. I was thinking about just transposing the existing test down to the backend layer, more or less as attached. Not wedded to these names of course. > If we were to end up with a > Libressl libtls implementation in libpq we'd still have to test with Libressl > against the OpenSSL compat layer in libssl since it could act as both. Not a > bridge we have to cross today but might be worth at least keeping in mind when > designing something to not make it impossible in the future. Right. I think the attached would be amenable to that. Further down the road, it seems inevitable that we'll need to have a way of detecting the SSL library version --- for example, assuming the LibreSSL folk eventually fix their RSA-PSS code, we'll need a version-dependent test. That could be another new backend method, I guess. regards, tom lane diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl index 8b0de2d8e7e..60b5c3630b0 100644 --- a/src/test/ssl/t/001_ssltests.pl +++ b/src/test/ssl/t/001_ssltests.pl @@ -35,9 +35,8 @@ sub switch_server_cert $ssl_server->switch_server_cert(@_); } -# Determine whether this build uses OpenSSL or LibreSSL. As a heuristic, the -# HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. -my $libressl = not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); +# Determine whether this build uses OpenSSL or LibreSSL. +my $libressl = $ssl_server->is_libressl; #### Some configuration diff --git a/src/test/ssl/t/SSL/Backend/OpenSSL.pm b/src/test/ssl/t/SSL/Backend/OpenSSL.pm index e044318531f..4159addb700 100644 --- a/src/test/ssl/t/SSL/Backend/OpenSSL.pm +++ b/src/test/ssl/t/SSL/Backend/OpenSSL.pm @@ -26,6 +26,7 @@ package SSL::Backend::OpenSSL; use strict; use warnings FATAL => 'all'; +use PostgreSQL::Test::Utils; use File::Basename; use File::Copy; @@ -205,6 +206,23 @@ sub get_library return $self->{_library}; } +=pod + +=item $backend->library_is_libressl() + +Detect whether the SSL library is LibreSSL. + +=cut + +sub library_is_libressl +{ + my ($self) = @_; + + # The HAVE_SSL_CTX_SET_CERT_CB macro isn't defined for LibreSSL. + # We may eventually need a less-bogus heuristic. + return not check_pg_config("#define HAVE_SSL_CTX_SET_CERT_CB 1"); +} + # Internal method for copying a set of files, taking into account wildcards sub _copy_files { diff --git a/src/test/ssl/t/SSL/Server.pm b/src/test/ssl/t/SSL/Server.pm index 14277418419..33975b28e8c 100644 --- a/src/test/ssl/t/SSL/Server.pm +++ b/src/test/ssl/t/SSL/Server.pm @@ -244,6 +244,23 @@ sub ssl_library =pod +=item $server->is_libressl() + +Detect whether the currently used SSL backend is LibreSSL. +(Ideally we'd not need this hack, but presently we do.) + +=cut + +sub is_libressl +{ + my $self = shift; + my $backend = $self->{backend}; + + return $backend->library_is_libressl(); +} + +=pod + =item switch_server_cert(params) Change the configuration to use the given set of certificate, key, ca and
Daniel Gustafsson <daniel@yesql.se> writes: > On 9 May 2025, at 02:15, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Right. I think the attached would be amenable to that. > It will be a bit awkward to ask "are you libressl" if we ever add support for > something not OpenSSL based, but we could always revisit should that happen. I was imagining that unrelated backends would simply make the method report constant-false. But in any case, redesigning this in the light of any future requirements doesn't seem like a big deal. What I'm mainly after for today is just to ensure that these various which- library-is-it tests are visibly interconnected, so that we don't forget to check them all when things change. regards, tom lane
Pushed all that stuff. The SSL tests pass for me now on OpenBSD 7.7, and hopefully the CI environment will be happy too. regards, tom lane
> On 9 May 2025, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Pushed all that stuff. The SSL tests pass for me now on OpenBSD 7.7, > and hopefully the CI environment will be happy too. Thanks! -- Daniel Gustafsson