raw_send_hdrinc() validates that the caller-supplied IPv4 header
fits within the message length:
iphlen = iph->ihl * 4;
err = -EINVAL;
if (iphlen > length)
goto error_free;
if (iphlen >= sizeof(*iph)) {
/* fix up saddr, tot_len, id, csum, transport_header */
}
It does not, however, reject ihl < 5. For such a packet the
"if (iphlen >= sizeof(*iph))" branch is skipped, leaving the
crafted iphdr untouched, but the packet is still handed to
__ip_local_out() and onward. Downstream consumers that read
iph->ihl assume a sane value: net/ipv4/ah4.c:ah_output() in
particular subtracts sizeof(struct iphdr) from top_iph->ihl * 4
and passes the (signed-int-negative, then cast to size_t)
result to memcpy(), producing an OOB access of length close to
SIZE_MAX and a host kernel panic.
An IPv4 header with ihl < 5 is malformed by definition (RFC 791:
"Internet Header Length is the length of the internet header in
32 bit words ... Note that the minimum value for a correct header
is 5."). The kernel should not be willing to inject such a
packet into its own output path.
Reject "iphlen < sizeof(*iph)" alongside the existing
"iphlen > length" check. This matches the principle that locally
constructed packets that re-enter the IP stack must pass the same
basic sanity tests that a foreign packet would be subjected to.
Once this lands, the "if (iphlen >= sizeof(*iph))" wrapper around
the fixup branch becomes redundant; left in place to keep the
patch minimal and backport-friendly. A follow-up can unwrap it.
Note that commit
86f4c90a1c5c ("ipv4, ipv6: ensure raw socket
message is big enough to hold an IP header") ensures the message
buffer is large enough to hold an iphdr, but does not constrain
the self-reported iph->ihl.
Reachability: the malformed packet source is any caller with
CAP_NET_RAW, including an unprivileged process in a user+net
namespace on a kernel with CONFIG_USER_NS=y. The reproduced AH
crash also requires a matching xfrm AH policy on the outgoing
route; a container granted CAP_NET_ADMIN can install that state
and policy in its netns. Loopback bypasses xfrm_output, so the
trigger uses a real netdev.
Reproduced on UML + KASAN: kernel-mode fault at addr 0x0 with
memcpy_orig at the crash site. Same shape reproduces inside a
rootless Docker container with --cap-add NET_ADMIN on a stock
distro kernel.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/77ec2b5e8111961c2c39883c92e8aa2709039c17.1778614451.git.michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>