git.ipfire.org Git - thirdparty/git.git/commit

author	Jeff King <peff@peff.net>
	Wed, 28 Feb 2024 22:37:44 +0000 (17:37 -0500)
committer	Junio C Hamano <gitster@pobox.com>
	Wed, 28 Feb 2024 22:42:01 +0000 (14:42 -0800)
commit	388b96df319c62e9c8d984921b8967db37481d8a
tree	05b7e5dd82ea3b9dcdc802fac1304bac4d058715
parent	720ba25d993423ab7b12b9bbdc62dadaa78d315e

upload-pack: use oidset for deepen_not list

We record the oid of every deepen-not line the client sends to us. For a
well-behaved client, the resulting array should be bounded by the number
of unique refs we have. But because there's no de-duplication, a
malicious client can make the array grow without bound just by sending
the same "refs/heads/foo" over and over (assuming such a ref exists).

Since the deepen-not list is just being fed to a "rev-list --not"
traversal, the order of items doesn't matter. So we can replace the
oid_array with an oidset which notices and skips duplicates.
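
To illustrate the idea outside of git's own code, here is a minimal,
self-contained sketch of a de-duplicating set of object ids. Everything
in it (the toy_* names, the fixed 20-byte id, the linear-probing table)
is a hypothetical stand-in and not git's actual oidset implementation.
The insert function returns 1 when the id was already present and 0
when it was newly added, and a repeated id never consumes another slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_OID_LEN 20 /* hypothetical: SHA-1 sized ids only */

struct toy_oid {
	unsigned char hash[TOY_OID_LEN];
};

/* Hypothetical stand-in for an oidset: open addressing, linear probing. */
struct toy_oidset {
	struct toy_oid *slots;
	unsigned char *used; /* used[i] != 0 means slots[i] holds an id */
	size_t cap;          /* always 0 or a power of two */
	size_t nr;           /* number of unique ids stored */
};

void toy_oidset_init(struct toy_oidset *s)
{
	memset(s, 0, sizeof(*s));
}

void toy_oidset_clear(struct toy_oidset *s)
{
	free(s->slots);
	free(s->used);
	toy_oidset_init(s);
}

/* Object ids are already hash output, so their leading bytes serve as a hash. */
static size_t toy_hash(const struct toy_oid *oid)
{
	size_t h;
	memcpy(&h, oid->hash, sizeof(h));
	return h;
}

int toy_oidset_insert(struct toy_oidset *s, const struct toy_oid *oid);

/* Double the table (or allocate the first 16 slots) and re-insert all ids. */
static void toy_oidset_grow(struct toy_oidset *s)
{
	struct toy_oidset bigger;
	size_t i;

	bigger.cap = s->cap ? s->cap * 2 : 16;
	bigger.nr = 0;
	bigger.slots = calloc(bigger.cap, sizeof(*bigger.slots));
	bigger.used = calloc(bigger.cap, 1);
	if (!bigger.slots || !bigger.used)
		abort();
	for (i = 0; i < s->cap; i++)
		if (s->used[i])
			toy_oidset_insert(&bigger, &s->slots[i]);
	free(s->slots);
	free(s->used);
	*s = bigger;
}

/* Returns 1 if oid was already in the set, 0 if it was newly added. */
int toy_oidset_insert(struct toy_oidset *s, const struct toy_oid *oid)
{
	size_t i;

	if (s->nr * 2 >= s->cap) /* keep the table at most half full */
		toy_oidset_grow(s);
	i = toy_hash(oid) & (s->cap - 1);
	while (s->used[i]) {
		if (!memcmp(s->slots[i].hash, oid->hash, TOY_OID_LEN))
			return 1; /* duplicate: nothing stored, no growth in nr */
		i = (i + 1) & (s->cap - 1);
	}
	s->slots[i] = *oid;
	s->used[i] = 1;
	s->nr++;
	return 0;
}
```

With this sketch, inserting the same id any number of times leaves nr at
1, so memory use depends only on the number of distinct ids, whereas an
append-only array would gain one entry per line received.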

That bounds the memory in malicious cases to be linear in the number of
unique refs. And even in non-malicious cases, there may be a slight
improvement in memory usage if multiple refs point to the same oid
(though in practice this list is probably pretty tiny anyway, as it
comes from the user specifying "--shallow-exclude" on the client fetch).

Note that the trace2 output will now report the number of
de-duplicated objects, rather than the total number of "deepen-not"
lines we received. This is arguably a more useful value for tracing /
debugging anyway.

Reported-by: Benjamin Flesch <benjaminflesch@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>