Jeff King [Fri, 14 Jun 2024 10:31:22 +0000 (06:31 -0400)]
remote: allow resetting url list
Because remote.*.url is treated as a multi-valued key, there is no way
to override previous config. So for example if you have
remote.origin.url set to some wrong value, doing:
git -c remote.origin.url=right fetch
would not work. It would append "right" to the list, which means we'd
still fetch from "wrong" (since subsequent values are used only as push
urls).
Let's provide a mechanism to reset the list, like we do for other
multi-valued keys (e.g., credential.helper, http.extraheaders, and
merge.suppressDest all use this "empty string means reset" pattern).
Reported-by: Mathew George <mathewegeorge@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:30:05 +0000 (06:30 -0400)]
config: document remote.*.url/pushurl interaction
The documentation for these keys gives a very terse definition and
points you to the fetch/push manpages. But from reading those pages it
was not at all obvious to me that:
- these are keys that can be defined multiple times with meaningful
behavior (especially remote.*.url)
- the way that pushurl overrides url (the git-push page does mention
that "pushurl defaults to url", but it is not immediately clear what
a multi-valued url would do in that situation).
Let's try to summarize the current behavior.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:29:09 +0000 (06:29 -0400)]
remote: simplify url/pushurl selection
When we want to know the push urls for a remote, there is some simple
logic:
- if the user configured any remote.*.pushurl keys, then those make
the complete set of push urls
- otherwise we push to all urls in remote.*.url
Many spots implement this with a level of indirection, assigning to a
local url/url_nr pair. But since both arrays are now strvecs, we can
just use a pointer to select the appropriate strvec, shortening the code
a bit.
Even though this is now a one-liner, since it is application logic that
is present in so many places, it's worth abstracting a helper function.
In fact, we already have such a function, but it's local to
builtin/push.c. So we'll just make it available everywhere via remote.h.
There are two spots to pay special attention to here:
1. in builtin/remote.c's get_url(), we are selecting first based on
push_mode and then falling back to "url" when we're in push_mode
but no pushurl is defined. The updated code makes that much more
clear, compared to the original which had an "else" fall-through.
2. likewise in that file's set_url(), we _only_ respect push_mode,
sine the point is that we are adding to pushurl in that case
(whether it is empty or not). And thus it does not use our helper
function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:28:01 +0000 (06:28 -0400)]
remote: use strvecs to store remote url/pushurl
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:
- push/clear are now one-liners
- likewise the free+assigns in alias_all_urls() can use
strvec_replace()
- we now use size_t for storage, avoiding possible overflow
- this will enable some further cleanups in future patches
There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:27:22 +0000 (06:27 -0400)]
remote: transfer ownership of memory in add_url(), etc
Many of the internal functions in remote.c take const strings and store
them forever in instances of "struct remote". Since the functions are
internal and callers are aware of the convention, this seems to mostly
work and not cause leaks. But there are some issues:
- it's impossible to clear any of the arrays, because the data
dependencies between them are too muddled (if you free() a string,
it might also be referenced from another array, causing a
user-after-free; but if you don't, that might be the last reference,
causing a leak).
This is mostly of interest for further refactoring and features, but
there's at least one spot that's already a problem. In alias_all_urls(),
we replace elements of remote->url and remote->pushurl with their
aliased forms, dropping references to the original.
- sometimes strings from outside callers make their way in. For
example, calling remote_get("foo") when there is no configured "foo"
remote will create a remote struct with the single url "foo". But
we'll do so by holding on to the string passed to remote_get()
forever.
In practice I think this works out because we'd usually pass in a
string that lasts the length of the program (a string literal, or
argv reference, or other data structure allocated in the main
function). But it's a rather subtle requirement.
Instead, let's have remote->url and remote->pushurl own their string
memory. They'll copy the const strings that are passed in, and callers
can stop making their own copies. Likewise, when we overwrite an entry,
we can free the memory it points to, fixing the leak mentioned above.
We'll leave the struct members as "const" since they are visible to the
outside world, and shouldn't usually be touched. This requires casting
on free() for now, but we'll clean that further in a future patch.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:26:16 +0000 (06:26 -0400)]
remote: refactor alias_url() memory ownership
The alias_url() function may return either a newly allocated string
(which the caller must take ownership of), or the original const "url"
parameter that was passed in.
This often works OK because callers are generally passing in a "url"
that they expect to retain ownership of anyway. So whether we got back
the original or a new string, we're always interested in storing it
forever. But I suspect there are some possible leaks here (e.g.,
add_url_alias() may end up discarding the original "url").
Whether there are active leaks or not, this is a confusing setup that
makes further refactoring of memory ownership harder. So instead of
returning the original string, return NULL, forcing callers to decide
what to do with it explicitly. We can then build further cleanups on top
of that.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 14 Jun 2024 10:25:25 +0000 (06:25 -0400)]
archive: fix check for missing url
Running "git archive --remote" checks that we have at least one url for
the remote. It does so by looking at remote.url[0], but that won't work;
if we have no url at all, then remote.url will be NULL, and we'll
segfault.
Check url_nr instead, which is a more direct way of asking what we
want.
You can trigger the segfault like this:
git -c remote.foo.vcs=bar archive --remote=foo
but I didn't bother adding a test. This is the tip of the iceberg for
no-url remotes, and a later patch will improve that situation. I just
wanted to clean up this bug so it didn't make further refactoring of
this code more confusing.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 12 Jun 2024 20:37:14 +0000 (13:37 -0700)]
Merge branch 'cp/reftable-unit-test'
Basic unit tests for reftable have been reimplemented under the
unit test framework.
* cp/reftable-unit-test:
t: improve the test-case for parse_names()
t: add test for put_be16()
t: move tests from reftable/record_test.c to the new unit test
t: move tests from reftable/stack_test.c to the new unit test
t: move reftable/basics_test.c to the unit testing framework
Junio C Hamano [Mon, 10 Jun 2024 17:30:39 +0000 (10:30 -0700)]
Merge branch 'jk/leakfixes'
Memory leaks in "git mv" has been plugged.
* jk/leakfixes:
mv: replace src_dir with a strvec
mv: factor out empty src_dir removal
mv: move src_dir cleanup to end of cmd_mv()
t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
t-strvec: use va_end() to match va_start()
Junio C Hamano [Fri, 7 Jun 2024 17:57:23 +0000 (10:57 -0700)]
Merge branch 'tb/midx-write-cleanup'
Code clean-up around writing the .midx files.
* tb/midx-write-cleanup:
pack-bitmap.c: reimplement `midx_bitmap_filename()` with helper
midx: replace `get_midx_rev_filename()` with a generic helper
midx-write.c: support reading an existing MIDX with `packs_to_include`
midx-write.c: extract `fill_packs_from_midx()`
midx-write.c: extract `should_include_pack()`
midx-write.c: pass `start_pack` to `compute_sorted_entries()`
midx-write.c: reduce argument count for `get_sorted_entries()`
midx-write.c: tolerate `--preferred-pack` without bitmaps
Junio C Hamano [Thu, 6 Jun 2024 19:49:23 +0000 (12:49 -0700)]
Merge branch 'ps/leakfixes'
Leakfixes.
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
Jeff King [Tue, 4 Jun 2024 10:13:40 +0000 (06:13 -0400)]
sparse-checkout: free duplicate hashmap entries
In insert_recursive_pattern(), we create a new pattern_entry to insert
into the parent_hashmap. If we find that the same entry already exists
in the hashmap, we skip adding the new one. But we forget to free the new
one, creating a leak.
We can fix it by cleaning up the discarded entry. It would probably be
possible to avoid creating it in the first place, but it's non-trivial.
We'd have to define a "keydata" struct that lets us compare the existing
entries to the broken-out fields. It's probably not worth the
complexity, so we'll punt on that for now.
There is one subtlety here: our insertion is happening in a loop, with
each iteration looking at the pattern we just inserted (hence the
"recursive" in the name). So if we skip insertion, what do we look at?
The obvious answer is that we should remember the existing duplicate we
found and use that. But I _think_ in that case, we probably already have
all of the recursive bits already (from when the original entry was
added). And so just breaking out of the loop would be correct. But I'm
not 100% sure on that; after all, the original leaky code could have
done the same break, but it didn't.
So I went with the "obvious answer" above, which has no chance of
changing the behavior aside from fixing the leak.
With this patch, t1091 can now be marked leak-free.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:35 +0000 (06:13 -0400)]
sparse-checkout: free pattern list in sparse_checkout_list()
In sparse_checkout_list(), we create a pattern_list that needs to
eventually be cleared. We remember to do so in the regular code path,
but the cone-mode path does an early return, and forgets to clean up.
We could fix the leak by adding a new call to clear_pattern_list(). But
we can simplify even further by just skipping the early return, pushing
the other code path (which consists now of only one line!) into an else
block. That also matches the same cone/non-cone if/else used in some
other functions.
This fixes 15 leaks found in t1091.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:32 +0000 (06:13 -0400)]
sparse-checkout: free sparse_filename after use
We allocate a heap buffer via get_sparse_checkout_filename(). Most calls
remember to free it, but sparse_checkout_init() forgets to, causing a
leak. Ironically, it remembers to do so in the error return paths, but
not in the path that makes it all the way to the function end!
Fixing this clears up 6 leaks from t1091.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In update_working_directory(), we take in a pattern_list, attach it to
the repository index by assigning it to index->sparse_checkout_patterns,
and then call unpack_trees. Afterwards, we remove it by setting
index->sparse_checkout_patterns back to NULL.
But there are two possible leaks here:
1. If the index already had a populated sparse_checkout_patterns,
we've obliterated it. We can fix this by saving and restoring it,
rather than always setting it back to NULL.
2. We may call the function with a NULL pattern_list, expecting it to
use the on-disk sparse file. In that case, the index routines will
lazy-load the sparse patterns automatically. But now at the end of
the function when we restore the patterns, we'll leak those
lazy-loaded ones!
We can fix this by freeing the pattern list before overwriting its
pointer whenever it does not match what was passed in (in practice
this should only happen when the passed-in list is NULL, but this
is erring on the defensive side).
Together these remove 48 indirect leaks found in t1091.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:27 +0000 (06:13 -0400)]
sparse-checkout: always free "line" strbuf after reading input
In add_patterns_from_input(), we may read lines from a file with a loop
like this:
while (!strbuf_getline(&line, file)) {
...
strbuf_to_cone_pattern(&line, pl);
}
/* we don't strbuf_release(&line) here! */
This generally is OK because strbuf_to_cone_pattern() consumes the
buffer via strbuf_detach(). But we can leak in a few cases:
1. We don't always consume the buffer! If the line ends up empty after
trimming, we leave strbuf_to_cone_pattern() without detaching. In
most cases this is OK, because a subsequent getline() call will use
the same buffer. But if you had an empty line at the end of file,
for example, it would leak.
2. Even if strbuf_to_cone_pattern() always consumed the buffer,
there's a subtle issue with strbuf_getline(). As we saw in 94e2aa555e (strbuf: fix leak when `appendwholeline()` fails with
EOF, 2024-05-27), it's possible for it to return EOF with an
allocated buffer (e.g., if the underlying getdelim() call saw an
error). So we should always strbuf_release() after finishing a read
loop like this.
Note that even the code to read patterns from argv has the same problem.
Because that also uses strbuf_to_cone_pattern(), we stuff each argv
entry into a strbuf. It uses the same "line" strbuf as the getline code,
but we should position the strbuf_release() to cover both code paths.
This fixes at least 9 leaks found in t1091.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:25 +0000 (06:13 -0400)]
sparse-checkout: reuse --stdin buffer when reading patterns
When we read patterns from --stdin, we loop on strbuf_getline(), and
detach each line we read to pass into add_pattern(). This used to be
necessary because add_pattern() required that the pattern strings remain
valid while the pattern_list was in use. But it also created a leak,
since we didn't record the detached buffers anywhere else.
Now that add_pattern() has been modified to make its own copy of the
strings, we can stop detaching and fix the leak. This fixes 4 leaks
detected in t1091.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:22 +0000 (06:13 -0400)]
dir.c: always copy input to add_pattern()
The add_pattern() function has a subtle and undocumented gotcha: the
pattern string you pass in must remain valid as long as the pattern_list
is in use (and nor do we take ownership of it). This is easy to get
wrong, causing either subtle bugs (because you free or reuse the string
buffer) or leaks (because you copy the string, but don't track ownership
separately).
All of this "pattern" code was originally the "exclude" mechanism. So
this _usually_ works OK because you add entries in one of two ways:
1. From the command-line (e.g., "--exclude"), in which case we're
pointing to an argv entry which remains valid for the lifetime of
the program.
2. From a file (e.g., ".gitignore"), in which case we read the whole
file into a buffer, attach it to the pattern_list's "filebuf"
entry, then parse the buffer in-place (adding NULs). The strings
point into the filebuf, which is cleaned up when the whole
pattern_list goes away.
But other code, like sparse-checkout, reads individual lines from stdin
and passes them one by one to add_pattern(), leaking each. We could fix
this by refactoring it to take in the whole buffer at once, like (2)
above, and stuff it in "filebuf". But given how subtle the interface is,
let's just fix it to always copy the string.
That seems at first like we'd be wasting extra memory, but we can
mitigate that:
a. The path_pattern struct already uses a FLEXPTR, since we sometimes
make a copy (when we see "foo/", we strip off the trailing slash,
requiring a modifiable copy of the string).
Since we'll now always embed the string inside the struct, we can
switch to the regular FLEX_ARRAY pattern, saving us 8 bytes of
pointer. So patterns with a trailing slash and ones under 8 bytes
actually get smaller.
b. Now that we don't need the original string to hang around, we can
get rid of the "filebuf" mechanism entirely, and just free the file
contents after parsing. Since files are the sources we'd expect to
have the largest pattern sets, we should mostly break even on
stuffing the same data into the individual structs.
This patch just adjusts the add_pattern() interface; it doesn't fix any
leaky callers yet.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 5 Jun 2024 08:03:08 +0000 (04:03 -0400)]
dir.c: reduce max pattern file size to 100MB
In a2bc523e1e (dir.c: skip .gitignore, etc larger than INT_MAX,
2024-05-31) we put capped the size of some files whose parsing code and
data structures used ints. Setting the limit to INT_MAX was a natural
spot, since we know the parsing code would misbehave above that.
But it also leaves the possibility of overflow errors when we multiply
that limit to allocate memory. For instance, a file consisting only of
"a\na\n..." could have INT_MAX/2 entries. Allocating an array of
pointers for each would need INT_MAX*4 bytes on a 64-bit system, enough
to overflow a 32-bit int.
So let's give ourselves a bit more safety margin by giving a much
smaller limit. The size 100MB is somewhat arbitrary, but is based on the
similar value for attribute files added by 3c50032ff5 (attr: ignore
overly large gitattributes files, 2022-12-01).
There's no particular reason these have to be the same, but the idea is
that they are in the ballpark of "so huge that nobody would care, but
small enough to avoid malicious overflow". So lacking a better guess, it
makes sense to use the same value. The implementation here doesn't share
the same constant, but we could change that later (or even give it a
runtime config knob, though nobody has complained yet about the
attribute limit).
And likewise, let's add a few tests that exercise the limits, based on
the attr ones. In this case, though, we never read .gitignore from the
index; the blob code is exercised only for sparse filters. So we'll
trigger it that way.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 4 Jun 2024 18:46:10 +0000 (11:46 -0700)]
imap-send: minimum leakfix
EVen with the minimum "no-op" invocation t1517 makes, "git imap-send"
leaks an empty strbuf it used to read a 0-byte string into.
There are a few other topics cooking in 'next' that plugs many
other leaks in this program, so let's minimally fix this one, barely
enough to make CI pass, leaving the rest for the other topic.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In add_pattern_to_hashsets(), we remove entries from the
recursive_hashmap when adding similar ones to the parent_hashmap. I
won't pretend to understand all of what's going on here, but there's an
obvious leak: whatever we removed from recursive_hashmap is not
referenced anywhere else, and is never free()d.
We can easily fix this by asking the hashmap to return a pointer to the
old entry. This makes t7002 now completely leak-free.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:17 +0000 (06:13 -0400)]
sparse-checkout: clear patterns when init() sees existing sparse file
In sparse_checkout_init(), we first try to load patterns from an
existing file. If we found any, we return immediately, but end up
leaking the patterns we parsed. Fixing this reduces the number of leaks
in t7002 from 9 down to 5.
Note that there are two other exits from the function, but they don't
need the same treatment:
- if we can't resolve HEAD, we write out a hard-coded sparse file and
return. But we know the pattern list is empty there, since we didn't
find any in the on-disk file and we haven't yet added any of our
own.
- otherwise, we do populate the list and then tail-call into
write_patterns_and_update(). But that function frees the
pattern_list itself, so we don't need to.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:14 +0000 (06:13 -0400)]
dir.c: free strings in sparse cone pattern hashmaps
The pattern_list structs used for cone-mode sparse lookups use a few
extra hashmaps. These store pattern_entry structs, each of which has its
own heap-allocated pattern string. When we clean up the hashmaps, we
free the individual pattern_entry structs, but forget to clean up the
embedded strings, causing memory leaks.
We can fix this by iterating over the hashmaps to free the extra
strings. This reduces the numbers of leaks in t7002 from 22 to 9.
One alternative here would be to make the string a FLEX_ARRAY member of
the pattern_entry. Then there's no extra free() required, and as a bonus
it would be a little more efficient. However, some of the refactoring
gets awkward, as we are often assigning strings allocated by helper
functions. So let's just fix the leak for now, and we can explore bigger
refactoring separately.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 4 Jun 2024 10:13:10 +0000 (06:13 -0400)]
sparse-checkout: pass string literals directly to add_pattern()
The add_pattern() function takes a pattern string, but neither makes a
copy of it nor takes ownership of the memory. So it is the caller's
responsibility to make sure the string hangs around as long as the
pattern_list which references it.
There are a few cases in sparse-checkout where we use string literal
patterns by stuffing them into a strbuf, detaching the buffer, and then
passing the result into add_pattern(). This creates a leak when the
pattern_list is eventually cleared, since we don't retain a copy of the
detached buffer to free.
But we can observe that the whole strbuf dance is unnecessary. The point
was presumably[1] to satisfy the lifetime requirement of the string. But
string literals have static duration; we can count on them lasting for
the whole program.
So we can fix the leak by just passing them directly. And as a bonus,
that simplifies the code. The leaks can be seen in t7002, which drops
from 25 leaks to 22 with this patch. It also makes t3602 and t1090
leak-free.
In the long run, we will also want to clean up this (undocumented!)
memory lifetime requirement of add_pattern(). But that can come in a
later patch; passing the string literals directly will be the right
thing either way.
[1] The code in question comes from 416adc8711 (sparse-checkout: update
working directory in-process for 'init', 2019-11-21) and 99dfa6f970
(sparse-checkout: use in-process update for disable subcommand,
2019-11-21), but I didn't see anything in their commit messages or
on the list explaining the strbufs.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git push" notices that the commit at the tip of the ref on
the other side it is about to overwrite does not exist locally, it
used to first try fetching it if the local repository is a partial
clone. The command has been taught not to do so and immediately
fail instead.
* th/push-local-ff-check-without-lazy-fetch:
push: don't fetch commit object when checking existence
Prior changes have made the documentation around the internals of the
alias command execution clearer, but I have still found this detailed
view of the aliased command being run helpful for debugging purposes.
A test case is added to ensure the full command output is present in
the execution flow.
Signed-off-by: Ian Wienand <iwienand@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ian Wienand [Mon, 27 May 2024 00:30:48 +0000 (10:30 +1000)]
Documentation: alias: add notes on shell expansion
When writing inline shell for shell-expansion aliases (i.e. prefixed
with "!"), there are some caveats around argument parsing to be aware
of. This series of notes attempts to explain what is happening more
clearly.
Signed-off-by: Ian Wienand <iwienand@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 31 May 2024 12:00:34 +0000 (08:00 -0400)]
dir.c: skip .gitignore, etc larger than INT_MAX
We use add_patterns() to read .gitignore, .git/info/exclude, etc, as
well as other pattern-like files like sparse-checkout. The parser for
these uses an "int" as an index, meaning that files over 2GB will
generally cause signed integer overflow and out-of-bounds access.
This is unlikely to happen in any real files, but we do read .gitignore
files from the tree. A malicious tree could cause an out-of-bounds read
and segfault (we also write NULs over newlines, so in theory it could be
an out-of-bounds write, too, but as we go char-by-char, the first thing
that happens is trying to read a negative 2GB offset).
We could fix the most obvious issue by replacing one "int" with a
"size_t". But there are tons of "int" sprinkled throughout this code for
things like pattern lengths, number of patterns, and so on. Since nobody
would actually want a 2GB .gitignore file, an easy defensive measure is
to just refuse to parse them.
The "int" in question is in add_patterns_from_buffer(), so we could
catch it there. But by putting the checks in its two callers, we can
produce more useful error messages.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 31 May 2024 20:51:15 +0000 (13:51 -0700)]
Post 2.45.2 updates
Merge down a handful of topics to adjust tests and CI to make them
work better, without changing Git itself, and a bit of developer
docs update:
* Tests that try to corrupt in-repository files in chunked format did
not work well on macOS due to its broken "mv", which has been
worked around.
* Unbreak CI jobs so that we do not attempt to use Python 2 that has
been removed from the platform.
* Git 2.43 started using the tree of HEAD as the source of attributes
in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit
support for the attr.tree configuration variable.
* Windows CI running in GitHub Actions started complaining about the
order of arguments given to calloc(); the imported regex code uses
the wrong order almost consistently, which has been corrected.
Junio C Hamano [Fri, 31 May 2024 22:28:21 +0000 (15:28 -0700)]
Merge branch 'jk/ci-macos-gcc13-fix' into maint-2.45
CI fix.
* jk/ci-macos-gcc13-fix:
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
Junio C Hamano [Fri, 31 May 2024 22:28:19 +0000 (15:28 -0700)]
Merge branch 'jc/compat-regex-calloc-fix' into maint-2.45
Windows CI running in GitHub Actions started complaining about the
order of arguments given to calloc(); the imported regex code uses
the wrong order almost consistently, which has been corrected.
* jc/compat-regex-calloc-fix:
compat/regex: fix argument order to calloc(3)
Junio C Hamano [Fri, 31 May 2024 22:28:19 +0000 (15:28 -0700)]
Merge branch 'jc/no-default-attr-tree-in-bare' into maint-2.45
Git 2.43 started using the tree of HEAD as the source of attributes
in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit
support for the attr.tree configuration variable.
* jc/no-default-attr-tree-in-bare:
stop using HEAD for attributes in bare repository by default
Junio C Hamano [Fri, 31 May 2024 15:55:34 +0000 (08:55 -0700)]
Merge branch 'jk/leakfixes' into jk/sparse-leakfix
* jk/leakfixes:
mv: replace src_dir with a strvec
mv: factor out empty src_dir removal
mv: move src_dir cleanup to end of cmd_mv()
t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
t-strvec: use va_end() to match va_start()
Junio C Hamano [Fri, 31 May 2024 08:43:27 +0000 (01:43 -0700)]
t1517: more coverage for commands that work without repository
While most of the commands in Git suite are designed to do useful
things in Git repositories, some commands are also usable outside
any repository. Building on top of an earlier work abece6e9 (t1517:
test commands that are designed to be run outside repository,
2024-05-20) that adds tests for such commands, let's give coverage
to some more commands.
This patch covers commands whose code has hits for
$ git grep setup_git_directory_gently
and passes a pointer to nongit_ok variable it uses to allow it to
run outside a Git repository, but mostly they are tested only to see
that they start up (as opposed to dying with "not in a git
repository" complaint). We may want to update them to actually do
something useful later, but this would at least help us catch
regressions by mistake.
Junio C Hamano [Fri, 31 May 2024 00:17:21 +0000 (17:17 -0700)]
Merge branch 'jc/fix-2.45.1-and-friends-for-maint' into maint-2.45
* jc/fix-2.45.1-and-friends-for-maint:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 31 May 2024 00:11:02 +0000 (17:11 -0700)]
Merge branch 'fixes/2.45.1/2.44' into maint-2.44
* fixes/2.45.1/2.44:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 31 May 2024 00:04:37 +0000 (17:04 -0700)]
Merge branch 'fixes/2.45.1/2.43' into maint-2.43
* fixes/2.45.1/2.43:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 31 May 2024 00:00:57 +0000 (17:00 -0700)]
Merge branch 'fixes/2.45.1/2.42' into maint-2.42
* fixes/2.45.1/2.42:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Thu, 30 May 2024 23:58:12 +0000 (16:58 -0700)]
Merge branch 'fixes/2.45.1/2.41' into maint-2.41
* fixes/2.45.1/2.41:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Thu, 30 May 2024 23:54:42 +0000 (16:54 -0700)]
Merge branch 'fixes/2.45.1/2.40' into maint-2.40
* fixes/2.45.1/2.40:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Thu, 30 May 2024 23:38:58 +0000 (16:38 -0700)]
Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into maint-2.39
* jc/fix-2.45.1-and-friends-for-2.39:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent
maintenance track.
* jc/fix-2.45.1-and-friends-for-maint:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Thu, 30 May 2024 21:15:15 +0000 (14:15 -0700)]
Merge branch 'es/chainlint-ncores-fix'
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs. It now falls
back to 1 CPU to avoid the problem.
* es/chainlint-ncores-fix:
chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
chainlint.pl: fix incorrect CPU count on Linux SPARC
chainlint.pl: make CPU count computation more robust
The base topic started to make it an error for a command to leave
the hash algorithm unspecified, which revealed a few commands that
were not ready for the change. Give users a knob to revert back to
the "default is sha-1" behaviour as an escape hatch, and start
fixing these breakages.
* jc/undecided-is-not-necessarily-sha1-fix:
apply: fix uninitialized hash function
builtin/hash-object: fix uninitialized hash function
builtin/patch-id: fix uninitialized hash function
t1517: test commands that are designed to be run outside repository
setup: add an escape hatch for "no more default hash algorithm" change
Junio C Hamano [Thu, 30 May 2024 21:15:12 +0000 (14:15 -0700)]
Merge branch 'ps/reftable-reusable-iterator'
Code clean-up to make the reftable iterator closer to be reusable.
* ps/reftable-reusable-iterator:
reftable/merged: adapt interface to allow reuse of iterators
reftable/stack: provide convenience functions to create iterators
reftable/reader: adapt interface to allow reuse of iterators
reftable/generic: adapt interface to allow reuse of iterators
reftable/generic: move seeking of records into the iterator
reftable/merged: simplify indices for subiterators
reftable/merged: split up initialization and seeking of records
reftable/reader: set up the reader when initializing table iterator
reftable/reader: inline `reader_seek_internal()`
reftable/reader: separate concerns of table iter and reftable reader
reftable/reader: unify indexed and linear seeking
reftable/reader: avoid copying index iterator
reftable/block: use `size_t` to track restart point index
Junio C Hamano [Thu, 30 May 2024 21:15:11 +0000 (14:15 -0700)]
Merge branch 'ps/reftable-write-options'
The knobs to tweak how reftable files are written have been made
available as configuration variables.
* ps/reftable-write-options:
refs/reftable: allow configuring geometric factor
reftable: make the compaction factor configurable
refs/reftable: allow disabling writing the object index
refs/reftable: allow configuring restart interval
reftable: use `uint16_t` to track restart interval
refs/reftable: allow configuring block size
reftable/dump: support dumping a table's block structure
reftable/writer: improve error when passed an invalid block size
reftable/writer: drop static variable used to initialize strbuf
reftable: pass opts as constant pointer
reftable: consistently refer to `reftable_write_options` as `opts`
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.
* ps/undecided-is-not-necessarily-sha1:
repository: stop setting SHA1 as the default object hash
oss-fuzz/commit-graph: set up hash algorithm
builtin/shortlog: don't set up revisions without repo
builtin/diff: explicitly set hash algo when there is no repo
builtin/bundle: abort "verify" early when there is no repository
builtin/blame: don't access potentially unitialized `the_hash_algo`
builtin/rev-parse: allow shortening to more than 40 hex characters
remote-curl: fix parsing of detached SHA256 heads
attr: fix BUG() when parsing attrs outside of repo
attr: don't recompute default attribute source
parse-options-cb: only abbreviate hashes when hash algo is known
path: move `validate_headref()` to its only user
path: harden validation of HEAD with non-standard hashes
Taylor Blau [Wed, 29 May 2024 22:55:42 +0000 (18:55 -0400)]
midx: replace `get_midx_rev_filename()` with a generic helper
Commit f894081deae (pack-revindex: read multi-pack reverse indexes,
2021-03-30) introduced the `get_midx_rev_filename()` helper (later
modified by commit 60980aed786 (midx.c: write MIDX filenames to
strbuf, 2021-10-26)).
This function returns the location of the classic ".rev" files we used
to write for MIDXs (prior to 95e8383bac1 (midx.c: make changing the
preferred pack safe, 2022-01-25)), which is always of the form:
$GIT_DIR/objects/pack/multi-pack-index-$HASH.rev
Replace this function with a generic helper that populates a strbuf with
the above form, replacing the ".rev" extension with a caller-provided
argument.
This will allow us to remove a similarly-defined function in the
pack-bitmap code (used to determine the location of a MIDX .bitmap file)
by reimplementing it in terms of `get_midx_filename_ext()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:39 +0000 (18:55 -0400)]
midx-write.c: support reading an existing MIDX with `packs_to_include`
Avoid unconditionally copying all packs from an existing MIDX into a new
MIDX by checking that packs added via `fill_packs_from_midx()` don't
appear in the `to_include` set, if one was provided.
Do so by calling `should_include_pack()` from both `add_pack_to_midx()`
and `fill_packs_from_midx()`.
In order to make this work, teach `should_include_pack()` a new
"exclude_from_midx" parameter, which allows skipping the first check.
This is done so that the caller in `fill_packs_from_midx()` doesn't
reject all of the packs it provided since they appear in an existing
MIDX by definition.
The sum total of this change is that we are now able to read and
reference objects in an existing MIDX even when given a non-NULL
`packs_to_include`. This is a prerequisite step for incremental MIDXs,
which need to load any existing MIDX (if one is present) in order to
determine whether or not an object already appears in an earlier portion
of the MIDX to avoid duplicating it across multiple portions.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:36 +0000 (18:55 -0400)]
midx-write.c: extract `fill_packs_from_midx()`
When write_midx_internal() loads an existing MIDX, all packs are copied
forward into the new MIDX. Improve the readability of
write_midx_internal() by extracting this functionality out into a
separate function.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:32 +0000 (18:55 -0400)]
midx-write.c: extract `should_include_pack()`
The add_pack_to_midx() callback used via for_each_file_in_pack_dir() is
used to add packs with .idx files to the MIDX being written.
Within this function, we have a pair of checks that discards packs
which:
- appear in an existing MIDX, if we successfully read an existing MIDX
from disk
- or, appear in the "to_include" list, if invoking the MIDX write
machinery with the `--stdin-packs` command-line argument.
A future commit will want to call a slight variant of these checks from
the code that reuses all packs from an existing MIDX, as well as the
current location via add_pack_to_midx(). The latter will be modified in
subsequent commits to only reuse packs which appear in the to_include
list, if one was given.
Prepare for that step by extracting these checks as a subroutine that
may be called from both places.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:28 +0000 (18:55 -0400)]
midx-write.c: pass `start_pack` to `compute_sorted_entries()`
The function `compute_sorted_entries()` is broadly responsible for
building an array of the objects to be written into a MIDX based on the
provided list of packs.
If we have loaded an existing MIDX, however, we may not use all of its
packs, despite loading them into the ctx->info array.
The existing implementation simply skips past the first
ctx->m->num_packs (if ctx->m is non-NULL, indicating that we loaded an
existing MIDX). This is because we read objects in packs from an
existing MIDX via the MIDX itself, rather than from the pack-level
fanout to guarantee a de-duplicated result (see: a40498a1265 (midx: use
existing midx when writing new one, 2018-07-12)).
Future changes (outside the scope of this patch series) to the MIDX code
will require us to skip *at most* that number[^1].
We could tag each pack with a bit that indicates the pack's contents
should be included in the MIDX. But we can just as easily determine the
number of packs to skip by passing in the number of packs we learned
about after processing an existing MIDX.
[^1]: Kind of. The real number will be bounded by the number of packs in
a MIDX layer, and the number of packs in its base layer(s), but that
concept hasn't been fully defined yet.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:23 +0000 (18:55 -0400)]
midx-write.c: reduce argument count for `get_sorted_entries()`
The function `midx-write.c::get_sorted_entries()` is responsible for
constructing the array of OIDs from a given list of packs which will
comprise the MIDX being written.
The singular call-site for this function looks something like:
This function has five formal arguments, all of which are members of the
shared `struct write_midx_context` used to track various pieces of
information about the MIDX being written.
The function `get_sorted_entries()` dates back to fe1ed56f5e4 (midx:
sort and deduplicate objects from packfiles, 2018-07-12), which came
shortly after 396f257018a (multi-pack-index: read packfile list,
2018-07-12). The latter patch introduced the `pack_list` structure,
which was a precursor to the structure we now know as
`write_midx_context` (c.f. 577dc49696a (midx: rename pack_info to
write_midx_context, 2021-02-18)).
At the time, `get_sorted_entries()` likely could have used the pack_list
structure introduced earlier in 396f257018a, but understandably did not
since the structure only contained three fields (only two of which were
relevant to `get_sorted_entries()`) at the time.
Simplify the declaration of this function by taking a single pointer to
the whole `struct write_midx_context` instead of various members within
it. Since this function is now computing the entire result (populating
both `ctx->entries`, and `ctx->entries_nr`), rename it to something that
doesn't start with "get_" to make clear that this function has a
side-effect.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 29 May 2024 22:55:19 +0000 (18:55 -0400)]
midx-write.c: tolerate `--preferred-pack` without bitmaps
When passing a preferred pack to the MIDX write machinery, we ensure
that the given preferred pack is non-empty since 5d3cd09a808 (midx:
reject empty `--preferred-pack`'s, 2021-08-31).
However packs are only loaded (via `write_midx_internal()`, though a
subsequent patch will refactor this code out to its own function) when
the `MIDX_WRITE_REV_INDEX` flag is set.
So if a caller runs:
$ git multi-pack-index write --preferred-pack=...
with both (a) an existing MIDX, and (b) specifies a pack from that MIDX
as the preferred one, without passing `--bitmap`, then the check added
in 5d3cd09a808 will result in a segfault.
Note that packs loaded from disk which don't appear in an existing MIDX
do not trigger this issue, as those packs are loaded unconditionally. We
conditionally load packs from a MIDX since we tolerate MIDXs whose
packs do not resolve (i.e., via the MIDX write after removing
unreferenced packs via 'git multi-pack-index expire').
In practice, this isn't possible to trigger when running `git
multi-pack-index write` from `git repack`, as the latter always passes
`--stdin-packs`, which prevents us from loading an existing MIDX, as it
forces all packs to be read from disk.
But a future commit in this series will change that behavior to
unconditionally load an existing MIDX, even with `--stdin-packs`, making
this behavior trigger-able from 'repack' much more easily.
Prevent this from being an issue by removing the segfault altogether by
calling `prepare_midx_pack()` on packs loaded from an existing MIDX when
either the `MIDX_WRITE_REV_INDEX` flag is set *or* we specified a
`--preferred-pack`.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 30 May 2024 06:46:38 +0000 (02:46 -0400)]
mv: replace src_dir with a strvec
We manually manage the src_dir array with ALLOC_GROW. Using a strvec is
a little more ergonomic, and makes the memory ownership more clear. It
does mean that we copy the strings (which were otherwise just pointers
into the "sources" strvec), but using the same rationale as 9fcd9e4e72
(builtin/mv duplicate string list memory, 2024-05-27), it's just not
enough to be worth worrying about here.
As a bonus, this gets rid of some "int"s used for allocation management
(though in practice these were limited to command-line sizes and thus
not overflowable).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 30 May 2024 06:45:21 +0000 (02:45 -0400)]
mv: factor out empty src_dir removal
This pulls the loop added by b6f51e3db9 (mv: cleanup empty
WORKING_DIRECTORY, 2022-08-09) into a sub-function. That reduces clutter
in cmd_mv() and makes it easier to see that the lifetime of the
a_src_dir strbuf is limited to this code (and thus its cleanup doesn't
need to go after the "out" label).
Another option would be to just declare the strbuf inside the loop,
since it is only used there. But this refactor retains the existing
property that we can reuse the allocated buffer for each iteration of
the loop. That optimization is probably overkill, but I think the
sub-function is more readable anyway, and then keeping the optimization
is basically free.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 30 May 2024 06:44:22 +0000 (02:44 -0400)]
mv: move src_dir cleanup to end of cmd_mv()
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.
But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.
Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.
Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 30 May 2024 06:39:32 +0000 (02:39 -0400)]
t-strvec: use va_end() to match va_start()
Our check_strvec_loc() helper uses a variable argument list. When we
va_start(), we must be sure to va_end() before leaving the function.
This is required by the standard (though the effect of forgetting will
vary between platforms).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 30 May 2024 15:54:58 +0000 (08:54 -0700)]
Merge branch 'ps/leakfixes' into jk/leakfixes
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
Chandra Pratap [Wed, 29 May 2024 16:59:31 +0000 (22:29 +0530)]
t: improve the test-case for parse_names()
In the existing test-case for parse_names(), the fact that empty
lines should be ignored is not obvious because the empty line is
immediately followed by end-of-string. This can be mistaken as the
empty line getting replaced by NULL. Improve this by adding a
non-empty line after the empty one to demonstrate the intended behavior.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Chandra Pratap [Wed, 29 May 2024 16:59:30 +0000 (22:29 +0530)]
t: add test for put_be16()
put_be16() is a function defined in reftable/basics.{c, h} for which
there are no tests in the current setup. Add a test for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>