git.ipfire.org Git - thirdparty/git.git/commit

author	Jeff King <peff@peff.net>
	Sun, 25 Aug 2019 08:08:21 +0000 (04:08 -0400)
committer	Junio C Hamano <gitster@pobox.com>
	Tue, 27 Aug 2019 22:02:49 +0000 (15:02 -0700)
commit	9756082b3cfaf69574bc3283cf4c9ba9c91442bc
tree	d7ee74108438064f41cdceae893a15c665b7d203
parent	745f6812895b31c02b29bdfe4ae8e5498f776c26

fast-import: duplicate parsed encoding string

We read each line of the fast-import stream into the command_buf strbuf.
When reading a commit, we parse a line like "encoding foo" by storing a
pointer to "foo", but not making a copy. We may then read an unbounded
number of other lines (e.g., one for each modified file in the commit),
each of which writes into command_buf.

This works out in practice for small cases, because we hand off
ownership of the heap buffer from command_buf to the cmd_hist array, and
read new commands into a fresh heap buffer. And thus the pointer to
"foo" remains valid as long as there aren't so many intermediate lines
that we end up dropping the original "encoding" line from the history.

But as the test modification shows, if we go over our default of 100
lines, we end up with our encoding string pointing into freed heap
memory. This seems to fail reliably by writing garbage into the output,
but running under ASan definitely detects this as a use-after-free.

We can fix it by duplicating the encoding value, just as we do for other
parsed lines (e.g., an author line ends up in parse_ident, which copies
it to a new string).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import.c
t/t9300-fast-import.sh