When a dumb http server reports alternates with an absolute path, we try
to paste that onto the root of the URL we're trying to fetch from. So if
we go to "http://example.com/path/to/child.git" and it tells us about an
alternate at "/parent.git", we'll hit "http://example.com/parent.git".
But there's a bug in computing the base when the URL does not have any
path component at all, like "http://example.com". When looking for the
first slash after the host, strchr() returns NULL, and we compute a
nonsense value for the length of the host portion. And then when we use
that length to copy the base of the URL into a strbuf, we're likely to
fail.
The security implications are minimal here. We store the nonsense length
("serverlen") as an int, so on a 64-bit system it may effectively be
anything (it is zero minus a 64-bit heap pointer, then truncated to
32-bits and stuffed into a signed value). When we feed that length to
strbuf_add(), it is cast into a size_t and one of four things will
happen:
1. If serverlen was negative, it will turn into a very large positive
value and strbuf_add() will fail to allocate, ending the program.
Ditto if serverlen was positive but just very large.
This doesn't really get an attacker anything; the victim will just
fail to clone their evil repo.
2. If serverlen was small enough, we'll successfully extend the target
strbuf, and then copy an arbitrary set of bytes from "base". And
then one of these is true:
a. That set of bytes is much larger than the length of the "base"
string. This is an out-of-bounds read, but there's no
out-of-bounds write, since the strbuf code both allocates and
copies using the same size_t. This is likely to cause a
segfault as we try to read unmapped pages of memory.
b. Like (2a), but if the set of bytes is small enough we might
not segfault. We might read random memory from the process and
copy it into the "target" strbuf.
What happens then? We know that "base" ends with a NUL
terminator, which will be copied into "target" as well. So
even though target.len might be 1000 bytes (or whatever), when
interpreted as a NUL-terminated string, target.buf is still
the exact same string as "base".
And that's all we ever do with target: pass it around as a C
string, and then eventually strbuf_detach() it to become a C
string. So even though there was arbitrary memory copied into
the strbuf, we never access it.
c. The other interesting case is when serverlen is actually
_shorter_ than the length of base. And there we truncate the
string. Probably in a way that makes it totally invalid, but
if you were very unlucky you could turn something like:
http://victim.com.evil.domain:8000
into:
http://victim.com
Which looks like the start of a redirect attack, except that
the attacker could just have written "http://victim.com" in
the first place! Either way we feed it to
is_alternate_allowed(), which is where we check redirect and
protocol rules.
I think we can just treat this like a regular bug.
And it's quite a weird setup in the first place, as it implies that the
root of the web server is serving a repository (i.e., that you can get
something useful from "http://example.com/info/refs"). The bug has been
there since
b3661567cf ([PATCH] Add support for alternates in HTTP,
2005-09-14) without anybody noticing.
I kind of doubt anybody really cares about making this work, but it's
easy enough to do so: the host-portion of the URL ends at either the
first slash or the end-of-string. So we can just replace strchr() with
strchrnul().
The test setup is a little gross, as we take over the httpd document
root by shoving our bare-repo components into it. But it demonstrates
the problem and shows that our solution actually allows the alternate to
function, if the server is configured to allow it.
Reported-by: slonkazoid <slonkazoid@slonk.ing>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>