From: Artem Boldariev Date: Tue, 4 Oct 2022 18:03:23 +0000 (+0300) Subject: TLS: clear error queue before doing IO or calling SSL_get_error() X-Git-Tag: v9.19.7~70^2~2 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=6789b88d258378befc528b1a8ac9a36b9c4fea1e;p=thirdparty%2Fbind9.git TLS: clear error queue before doing IO or calling SSL_get_error() Ensure that TLS error is empty before calling SSL_get_error() or doing SSL I/O so that the result will not get affected by prior error statuses. In particular, the improper error handling led to intermittent unit test failure and, thus, could be responsible for some of the system test failures and other intermittent TLS-related issues. See here for more details: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html In particular, it mentions the following: > The current thread's error queue must be empty before the TLS/SSL > I/O operation is attempted, or SSL_get_error() will not work > reliably. As we use the result of SSL_get_error() to decide on I/O operations, we need to ensure that it works reliably by cleaning the error queue. TLS DNS: empty error queue before attempting I/O --- diff --git a/lib/isc/netmgr/tlsdns.c b/lib/isc/netmgr/tlsdns.c index 47c21dbd4d8..051dbf814f4 100644 --- a/lib/isc/netmgr/tlsdns.c +++ b/lib/isc/netmgr/tlsdns.c @@ -1394,6 +1394,27 @@ static isc_result_t tls_cycle(isc_nmsocket_t *sock) { isc_result_t result; + /* + * Clear the TLS error queue so that SSL_get_error() and SSL I/O + * routine calls will not get affected by prior error statuses. + * + * See here: + * https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html + * + * In particular, it mentions the following: + * + * The current thread's error queue must be empty before the + * TLS/SSL I/O operation is attempted, or SSL_get_error() will not + * work reliably. + * + * As we use the result of SSL_get_error() to decide on I/O + * operations, we need to ensure that it works reliably by + * cleaning the error queue. + * + * The sum of details: https://stackoverflow.com/a/37980911 + */ + ERR_clear_error(); + if (isc__nmsocket_closing(sock)) { return (ISC_R_CANCELED); } diff --git a/lib/isc/netmgr/tlsstream.c b/lib/isc/netmgr/tlsstream.c index a90485a0665..bd562f3313b 100644 --- a/lib/isc/netmgr/tlsstream.c +++ b/lib/isc/netmgr/tlsstream.c @@ -399,6 +399,27 @@ tls_do_bio(isc_nmsocket_t *sock, isc_region_t *received_data, isc_nm_read_stop(sock->outerhandle); } + /* + * Clear the TLS error queue so that SSL_get_error() and SSL I/O + * routine calls will not get affected by prior error statuses. + * + * See here: + * https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html + * + * In particular, it mentions the following: + * + * The current thread's error queue must be empty before the + * TLS/SSL I/O operation is attempted, or SSL_get_error() will not + * work reliably. + * + * As we use the result of SSL_get_error() to decide on I/O + * operations, we need to ensure that it works reliably by + * cleaning the error queue. + * + * The sum of details: https://stackoverflow.com/a/37980911 + */ + ERR_clear_error(); + if (sock->tlsstream.state == TLS_INIT) { INSIST(received_data == NULL && send_data == NULL); if (sock->tlsstream.server) { @@ -534,7 +555,7 @@ tls_do_bio(isc_nmsocket_t *sock, isc_region_t *received_data, } pending = tls_process_outgoing(sock, finish, send_data); - if (pending > 0) { + if (pending > 0 && tls_status != SSL_ERROR_SSL) { /* We'll continue in tls_senddone */ return; }