From: Neil Horman Date: Mon, 4 Aug 2025 22:29:52 +0000 (-0400) Subject: Ensure that the largest_pn values are migrated to our channel qrx X-Git-Url: http://git.ipfire.org/gitweb/gitweb.cgi?a=commitdiff_plain;h=0fa07898e1e0b57e548be3a037f31a746ae7f9a4;p=thirdparty%2Fopenssl.git Ensure that the largest_pn values are migrated to our channel qrx Recently, our overnight QUIC interop runs began failing in CI when an openssl server was tested against an ngtcp2 client: https://github.com/openssl/openssl/actions/runs/16739736813 The underlying cause bears some explination for historical purposes The problem began happening with a recent update to ngtcp2 in which ngtcp2 updated its wolfssl tls backend to support ML-KEM, which caused ngtcp to emit a client hello message that offered several groups (including X25519MLKEM768) but only provided a keyshare for x25519. This in turn triggered the openssl server to respond with a hello retry request (HRR), requesting an ML-KEM keyshare instead, which ngtcp2 obliged. However all subsequent frames from the client were discarded by the server, due to failing packet body decryption. The problem was tracked down to a mismatch in the initial vectors used by the client and server, leading to an AEAD tag mismatch. Packet protection keys generate their IV's in QUIC by xoring the packet number of the received frame with the base IV as derived via HKDF in the tls layer. The underlying problem was that openssl hit a very odd corner case with how we compute the packet number of the received frame. To save space, QUIC encodes packet numbers using a variable length integer, and only sends the changed bits in the packet number. This requires that the receiver (openssl) store the largest received pn of the connection, which we nominally do. However, in default_port_packet_handler (where QUIC frames are processed prior to having an established channel allocated) we use a temporary qrx to validate the packet protection of those frames. This temporary qrx may be incorporated into the channel in some cases, but is not in the case of a valid frame that generates an HRR at the TLS layer. In this case, the channel allocates its own qrx independently. When this occurs, the largest_pn value of the temporary qrx is lost, and subsequent frames are unable to be received, as the newly allocated qrx belives that the larges_pn for a given pn_space is 0, rather than the value received in the initial frame (which was a complete 32 bit value, rather than just the changed lower 8 bits). As a result the IV construction produced the wrong value, and the decrypt failed on those subsequent frames. Up to this point, that wasn't even a problem, as most quic implementations start their packet numbering at 0, so the next packet could still have its packet number computed properly. The combination of ngtcp using large random values for initial packet numbers, along with the HRR triggering a separate qrx creation on a channel led to the discovery of this discrepancy. The fix seems pretty straightforward. When we detect in port_default_packet_handler, that we have a separate qrx in the new channel, we migrate processed packets from the temporary qrx to the canonical channel qrx. In addition to doing that, we also need to migrate the largest_pn array from the temporary qrx to the channel_qrx so that subsequent frame reception is guaranteed to compute the received frame packet number properly, and as such, compute the proper IV for packet protection decryption. Fixes openssl/project#1296 Reviewed-by: Saša Nedvědický Reviewed-by: Tomas Mraz (Merged from https://github.com/openssl/openssl/pull/28189) --- diff --git a/include/internal/quic_record_rx.h b/include/internal/quic_record_rx.h index 8e0cd6a2c92..24143f91f2f 100644 --- a/include/internal/quic_record_rx.h +++ b/include/internal/quic_record_rx.h @@ -167,6 +167,17 @@ int ossl_qrx_provide_secret(OSSL_QRX *qrx, const unsigned char *secret, size_t secret_len); +/* + * Utility function to update the pn space from a src to a dst qrx. + * Occasionally we use a temporary qrx to do packet validation on quic frames + * that are not yet associated with a channel, and in the event a validation is + * successful AND we allocate a new qrx for the newly created channel, we need + * to migrate the largest_pn values recorded in the tmp qrx to the channel qrx. + * If we don't then PN decoding fails in cases where the initial PN is a large value. + * This function does that migration for us + */ +void ossl_qrx_update_pn_space(OSSL_QRX *src, OSSL_QRX *dst); + /* * Informs the QRX that it can now discard key material for a given EL. The QRX * will no longer be able to process incoming packets received at that diff --git a/ssl/quic/quic_port.c b/ssl/quic/quic_port.c index 680d2b4afa9..d6e6d4d25cb 100644 --- a/ssl/quic/quic_port.c +++ b/ssl/quic/quic_port.c @@ -1691,6 +1691,7 @@ static void port_default_packet_handler(QUIC_URXE *e, void *arg, */ while (ossl_qrx_read_pkt(qrx_src, &qrx_pkt) == 1) ossl_quic_channel_inject_pkt(new_ch, qrx_pkt); + ossl_qrx_update_pn_space(qrx_src, new_ch->qrx); } /* diff --git a/ssl/quic/quic_record_rx.c b/ssl/quic/quic_record_rx.c index e01cc525345..1a8194b396d 100644 --- a/ssl/quic/quic_record_rx.c +++ b/ssl/quic/quic_record_rx.c @@ -237,6 +237,16 @@ static void qrx_cleanup_urxl(OSSL_QRX *qrx, QUIC_URXE_LIST *l) } } +void ossl_qrx_update_pn_space(OSSL_QRX *src, OSSL_QRX *dst) +{ + size_t i; + + for (i = 0; i < QUIC_PN_SPACE_NUM; i++) + dst->largest_pn[i] = src->largest_pn[i]; + + return; +} + void ossl_qrx_free(OSSL_QRX *qrx) { uint32_t i;