BUG/MINOR: quic: Too short datagram during 0-RTT handshakes (aws-lc only)
By "aws-lc only", one means that this bug was first revealed by the aws-lc stack.
This does not mean it cannot appear with new versions of other TLS stacks which
have not revealed this bug so far.
This bug was reported by Ilya (@chipitsine) in GH #2657 where some QUIC interop
tests (resumption, zerortt) could crash haproxy when compiled against the
aws-lc TLS stack. These crashes were triggered by this BUG_ON() which detects
that too short datagrams with at least one ack-eliciting Initial packet inside
could be built.
<0>2024-07-31T15:13:42.562717+02:00 [01|quic|5|quic_tx.c:739] qc_prep_pkts():
next encryption level : qc@0x61d000041080 idle_timer_task@0x60d000006b80 flags=0x6000058
That said, everything was correctly done by qc_prep_pkts() to prevent such a case.
But this relied on the assumption that the list of encryption levels it uses
was always built in the following order for 0-RTT sessions:
initial, early-data, handshake, application
But this order is determined by the order in which the TLS stack derives the
secrets for these encryption levels. For aws-lc, the order is different:
initial, handshake, application, early-data
During 0-RTT sessions, the server may have to build three ack-eliciting packets
(with CRYPTO data inside) to reply to the first client packet: initial, handshake,
application. qc_prep_pkts() adds a PADDING frame to the last built packet
for the last encryption level in the list. But with aws-lc, the early-data
encryption level comes after the application encryption level in the list.
This prevented qc_prep_pkts() from building a padded application level last
packet to send a 1200-byte datagram.
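
The effect of the list order on padding can be illustrated with a minimal
model. This is only a sketch, not the qc_prep_pkts() code: the enum, the
datagram_is_padded() helper and the has_data[] array are hypothetical names,
and the model assumes that padding is only ever added to a packet built for
the last entry of the list and that the server has nothing to send at the
early-data level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified encryption level identifiers. */
enum level { INITIAL, EARLY_DATA, HANDSHAKE, APPLICATION };

/* Simplified model (assumption): walk the encryption level list in <order>
 * and report whether the resulting datagram would be padded. A packet is
 * built only for levels flagged in <has_data>, and PADDING is only added
 * to a packet built for the last entry of the list.
 */
static bool datagram_is_padded(const enum level *order, size_t n,
                               const bool *has_data)
{
	bool padded = false;
	size_t i;

	assert(order != NULL && has_data != NULL);

	for (i = 0; i < n; i++) {
		if (!has_data[order[i]])
			continue; /* nothing to build for this level */
		if (i == n - 1)
			padded = true; /* last list entry: PADDING allowed */
	}
	return padded;
}
```

With the order initial, early-data, handshake, application, the application
level is the last entry and the model pads the datagram. With the aws-lc
order, the last entry is early-data, for which nothing is built, so the model
never pads.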
To fix this, always insert the early-data encryption level after the initial
encryption level in the encryption level list when initializing this encryption
level from quic_conn_enc_level_init().
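
The idea of the fix can be sketched as follows. This is an illustration under
assumptions, not the actual patch: struct enc_level, struct enc_level_list and
enc_level_register() are hypothetical names, and a plain singly linked list
stands in for the real list of encryption levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified encryption level identifiers. */
enum level { INITIAL, EARLY_DATA, HANDSHAKE, APPLICATION };

struct enc_level {
	enum level id;
	struct enc_level *next;
};

struct enc_level_list {
	struct enc_level *head;
	struct enc_level *tail;
};

/* Append <el> at the end of <l>. */
static void list_append(struct enc_level_list *l, struct enc_level *el)
{
	el->next = NULL;
	if (l->tail)
		l->tail->next = el;
	else
		l->head = el;
	l->tail = el;
}

/* Insert <el> right after <pos>, which must already be in <l>. */
static void list_insert_after(struct enc_level_list *l,
                              struct enc_level *pos, struct enc_level *el)
{
	el->next = pos->next;
	pos->next = el;
	if (l->tail == pos)
		l->tail = el;
}

/* Register <el> when its secrets are derived. The early-data level is
 * placed after the initial level whatever the derivation order is; any
 * other level (or early-data without an initial level) is appended.
 */
static void enc_level_register(struct enc_level_list *l, struct enc_level *el)
{
	struct enc_level *it;

	assert(l != NULL && el != NULL);

	if (el->id == EARLY_DATA) {
		for (it = l->head; it; it = it->next) {
			if (it->id == INITIAL) {
				list_insert_after(l, it, el);
				return;
			}
		}
	}
	list_append(l, el);
}
```

Registering the levels in the aws-lc derivation order (initial, handshake,
application, early-data) with this sketch yields the list initial, early-data,
handshake, application, so the application level is the last entry again.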