git.ipfire.org Git - thirdparty/git.git/commit

revision: manage memory ownership of argv in setup_revisions()

The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).

For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.

We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete.  It only covered
the case of dropping "--", but not the movement of unknown options.

When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:

  git stash show -p --invalid

we will pass this to setup_revisions():

  argc = 3, argv[] = { "show", "-p", "--invalid", NULL }

which will then return:

   argc = 2, argv[] = { "show", "--invalid", "--invalid", NULL }

overwriting the "-p" entry, which is leaked unless we free it at that
moment.

You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.

So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!

Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:

  argc = 2, argv[] = { "show", "-p", NULL }

into:

  argc = 1, argv[] = { "show", "-p", NULL }

In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.

We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.

So in the "--invalid" example above we get:

   argc = 2, argv[] = { "show", "--invalid", NULL, NULL }

but something like "git stash -p --invalid -p" would yield:

  argc = 3, argv[] = { "show", "--invalid", NULL, "-p", NULL }

because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).

To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".

Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt().  I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.

The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

author	Jeff King <peff@peff.net>
	Fri, 19 Sep 2025 22:45:56 +0000 (18:45 -0400)
committer	Junio C Hamano <gitster@pobox.com>
	Mon, 22 Sep 2025 21:27:03 +0000 (14:27 -0700)
commit	cd439487980a212f103fd28ca81a9df33a994d33
tree	f63b76a269d90509e55805023f76efffeb4ca26f	tree \| snapshot
parent	3ea35c64b0e6c86450ebadda22400295103cda64	commit \| diff

revision.c		diff \| blob \| blame \| history
t/t3903-stash.sh		diff \| blob \| blame \| history