Jeff King [Wed, 6 Nov 2024 21:17:17 +0000 (16:17 -0500)]
describe: stop traversing when we run out of names
When trying to describe a commit, we'll traverse from the commit,
collecting candidate tags that point to its ancestors. But once we've
seen all of the tags in the repo, there's no point in traversing
further. There's nothing left to find!
For a default "git describe", this isn't usually a big problem. In a
large repo you'll probably have multiple tags, so we'll eventually find
10 candidates (the default for max_candidates) and stop there. And in a
small repo, it's quick to traverse to the root.
But you can imagine a large repo with few tags. Or, as we saw in a real
world case, explicitly limiting the set of matches like this (on
linux.git):
git describe --match=v6.12-rc4 HEAD
which goes all the way to the root before realizing that no, there are
no other tags under consideration besides the one we fed via --match.
If we add in "--candidates=1" there, it's much faster (at least as of
the previous commit).
But we should be able to speed this up without the user asking for it.
After expanding all matching tags, we know the total number of names. We
could just stop the traversal there, but as hinted at above we already
have a mechanism for doing that: the max_candidate limit. So we can just
reduce that limit to match the number of possible candidates.
Our p6100 test shows this off:
Test HEAD^ HEAD
---------------------------------------------------------------------------------------
6100.2: describe HEAD 0.71(0.65+0.06) 0.72(0.68+0.04) +1.4%
6100.3: describe HEAD with one max candidate 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
6100.4: describe HEAD with one tag 0.72(0.66+0.05) 0.01(0.00+0.00) -98.6%
Now we are fast automatically, just as if --candidates=1 were supplied
by the user.
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Helped-by: Rasmus Villemoes <ravi@prevas.dk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 6 Nov 2024 21:17:14 +0000 (16:17 -0500)]
describe: stop digging for max_candidates+1
By default, describe considers only 10 candidate matches, and stops
traversing when we have enough. This makes things much faster in a large
repository, where collecting all candidates requires walking all the way
down to the root (or at least to the oldest tag). This goes all the way
back to 8713ab3079 (Improve git-describe performance by reducing
revision listing., 2007-01-13).
However, we don't stop immediately when we have enough candidates. We
keep traversing and only bail when we find one more candidate that we're
ignoring. Usually this is not too expensive, if the tags are sprinkled
evenly throughout history. But if you are unlucky, you might hit the max
candidate quickly, and then have a huge swath of history before finding
the next one.
Our p6100 test has exactly this unlucky case: with a max of "1", we find
a recent tag quickly and then have to go all the way to the root to find
the old tag that will be discarded.
A more interesting real-world case is:
git describe --candidates=1 --match=v6.12-rc4 HEAD
in the linux.git repo. There we restrict the set of tags to a single
one, so there is no older candidate to find at all! But despite
--candidates=1, we keep traversing to the root only to find nothing.
So why do we keep traversing after hitting thet max? There are two
reasons I can see:
1. In theory the extra information that there was another candidate
could be useful, and we record it in the gave_up_on variable. But
we only show this information with --debug.
2. After finding the candidate, there's more processing we do in our
loop. The most important of this is propagating the "within" flags
to our parent commits, and putting them in the commit_list we'll
use for finish_depth_computation().
That function continues the traversal until we've counted all
commits reachable from the starting point but not reachable from
our best candidate tag (so essentially counting "$tag..$start", but
avoiding re-walking over the bits we've seen). If we break
immediately without putting those commits into the list, our depth
computation will be wrong (in the worst case we'll count all the
way down to the root, not realizing those commits are included in
our tag).
But we don't need to find a new candidate for (2). As soon as we finish
the loop iteration where we hit max_candidates, we can then quit on the
next iteration. This should produce the same output as the original code
(which could, after all, find a candidate on the very next commit
anyway) but ends the traversal with less pointless digging.
We still have to set "gave_up_on"; we've popped it off the list and it
has to go back. An alternative would be to re-order the loop so that it
never gets popped, but it's perhaps still useful to show in the --debug
output, so we need to know it anyway. We do have to adjust the --debug
output since it's now just a commit where we stopped traversing, and not
the max+1th candidate.
p6100 shows the speedup using linux.git:
Test HEAD^ HEAD
---------------------------------------------------------------------------------------
6100.2: describe HEAD 0.70(0.63+0.06) 0.71(0.66+0.04) +1.4%
6100.3: describe HEAD with one max candidate 0.70(0.64+0.05) 0.01(0.00+0.00) -98.6%
6100.4: describe HEAD with one tag 0.70(0.67+0.03) 0.70(0.63+0.06) +0.0%
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Helped-by: Rasmus Villemoes <ravi@prevas.dk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 6 Nov 2024 21:17:08 +0000 (16:17 -0500)]
t/perf: add tests for git-describe
We don't have a perf script for git-describe, despite it often being
accused of slowness. Let's add a few simple tests to start with.
Rather than use the existing tags from our test repo, we'll make our own
so that we have a known quantity and position. We'll add a "new" tag
near the tip of HEAD, and an "old" one that is at the very bottom. And
then our tests are:
1. Describing HEAD naively requires walking all the way down to the
old tag as we collect candidates. This gives us a baseline for what
"slow" looks like.
2. Doing the same with --candidates=1 can potentially be fast, because
we can quie after finding "new". But we don't, and it's also slow.
3. Likewise we should be able to quit when there are no more tags to
find. This can happen naturally if a repo has few tags, but also if
you restrict the set of tags with --match.
Here are the results running against linux.git. Note that I have a
commit-graph built for the repo, so "slow" here is ~700ms. Without a
commit graph it's more like 9s!
Test HEAD
--------------------------------------------------------------
6100.2: describe HEAD 0.70(0.66+0.04)
6100.3: describe HEAD with one max candidate 0.70(0.66+0.04)
6100.4: describe HEAD with one tag 0.70(0.64+0.06)
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 6 Nov 2024 21:16:58 +0000 (16:16 -0500)]
t6120: demonstrate weakness in disjoint-root handling
Commit 30b1c7ad9d (describe: don't abort too early when searching tags,
2020-02-26) tried to fix a problem that happens when there are disjoint
histories: to accurately compare the counts for different tags, we need
to keep walking the history longer in order to find a common base.
But its fix misses a case: we may still bail early if we hit the
max_candidates limit, producing suboptimal output. You can see this in
action by adding "--candidates=2" to the tests; we'll stop traversing as
soon as we see the second tag and will produce the wrong answer. I hit
this in practice while trying to teach git-describe not to keep looking
for candidates after we've seen all tags in the repo (effectively adding
--candidates=2, since these toy repos have only two tags each).
This is probably fixable by continuing to walk after hitting the
max-candidates limit, all the way down to a common ancestor of all
candidates. But it's not clear in practice what the preformance
implications would be (it would depend on how long the branches that
hold the candidates are).
So I'm punting on that for now, but I'd like to adjust the tests to be
more resilient, and to document the findings. So this patch:
1. Adds an extra tag at the bottom of history. This shouldn't change
the output, but does mean we are more resilient to low values of
--candidates (e.g., if we start reducing it to the total number of
tags). This is arguably closer to the real world anyway, where
you're not going to have just 2 tags, but an arbitrarily long
history going back in time, possibly with multiple irrelevant tags
in it (I called the new tag "H" here for "history").
2. Run the same tests with --candidates=2, which shows that even with
the current code they can fail if we end the traversal early. That
leaves a trail for anybody interested in trying to improve the
behavior.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
compat/mingw: support POSIX semantics for atomic renames
By default, Windows restricts access to files when those files have been
opened by another process. As explained in the preceding commits, these
restrictions can be loosened such that reads, writes and/or deletes of
files with open handles _are_ allowed.
While we set up those sharing flags in most relevant code paths now, we
still don't properly handle POSIX-style atomic renames in case the
target path is open. This is failure demonstrated by t0610, where one of
our tests spawns concurrent writes in a reftable-enabled repository and
expects all of them to succeed. This test fails most of the time because
the process that has acquired the "tables.list" lock is unable to rename
it into place while other processes are busy reading that file.
Windows 10 has introduced the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag
that allows us to fix this usecase [1]. When set, it is possible to
rename a file over a preexisting file even when the target file still
has handles open. Those handles must have been opened with the
`FILE_SHARE_DELETE` flag, which we have ensured in the preceding
commits.
Careful readers might have noticed that [1] does not mention the above
flag, but instead mentions `FILE_RENAME_POSIX_SEMANTICS`. This flag is
not for use with `SetFileInformationByHandle()` though, which is what we
use. And while the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag exists, it is
not documented on [2] or anywhere else as far as I can tell.
Unfortunately, we still support Windows systems older than Windows 10
that do not yet have this new flag. Our `_WIN32_WINNT` SDK version still
targets 0x0600, which is Windows Vista and later. And even though that
Windows version is out-of-support, bumping the SDK version all the way
to 0x0A00, which is Windows 10 and later, is not an option as it would
make it impossible to compile on Windows 8.1, which is still supported.
Instead, we have to manually declare the relevant infrastructure to make
this feature available and have fallback logic in place in case we run
on a Windows version that does not yet have this flag.
On another note: `mingw_rename()` has a retry loop that is used in case
deleting a file failed because it's still open in another process. One
might be pressed to not use this loop anymore when we can use POSIX
semantics. But unfortunately, we have to keep it around due to our
dependence on the `FILE_SHARE_DELETE` flag. While we know to set that
sharing flag now, other applications may not do so and may thus still
cause sharing violations when we try to rename a file.
This fixes concurrent writes in the reftable backend as demonstrated in
t0610, but may also end up fixing other usecases where Git wants to
perform renames.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Tue, 5 Nov 2024 19:24:19 +0000 (11:24 -0800)]
fetch-pack: die if in commit graph but not obj db
When fetching, there is a step in which sought objects are first checked
against the local repository; only objects that are not in the local
repository are then fetched. This check first looks up the commit graph
file, and returns "present" if the object is in there.
However, the action of first looking up the commit graph file is not
done everywhere in Git, especially if the type of the object at the time
of lookup is not known. This means that in a repo corruption situation,
a user may encounter an "object missing" error, attempt to fetch it, and
still encounter the same error later when they reattempt their original
action, because the object is present in the commit graph file but not in
the object DB.
Therefore, make it a fatal error when this occurs. (Note that we cannot
proceed to include this object in the list of objects to be fetched
without changing at least the fetch negotiation code: what would happen
is that the client will send "want X" and "have X" and when I tested
at $DAYJOB with a work server that uses JGit, the server reasonably
returned an empty packfile. And changing the fetch negotiation code to
only use the object DB when deciding what to report as "have" would be
an unnecessary slowdown, I think.)
This was discovered when a lazy fetch of a missing commit completed with
nothing actually fetched, and the writing of the commit graph file after
every fetch then attempted to read said missing commit, triggering a
lazy fetch of said missing commit, resulting in an infinite loop with no
user-visible indication (until they check the list of processes running
on their computer). With this fix, there is no infinite loop. Note that
although the repo corruption we discovered was caused by a bug in GC in
a partial clone, the behavior that this patch teaches Git to warn about
applies to any repo with commit graph enabled and with a missing commit,
whether it is a partial clone or not.
t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in
lookup_commit_in_graph(), 2022-07-01), tests that an interaction between
fetch and the commit graph does not cause an infinite loop. This patch
changes the exit code in that situation, so that test had to be changed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This revert simplifies the next patch in this patch set.
The commit message of that commit mentions that the new function "will
be used for the bundle-uri client in a subsequent commit", but it seems
that eventually it wasn't used.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Mon, 4 Nov 2024 19:02:44 +0000 (19:02 +0000)]
doc: correct misleading descriptions for --shallow-exclude
The documentation for the --shallow-exclude option to clone/fetch/etc.
claims that the option takes a revision, but it does not. As per
upload-pack.c's process_deepen_not(), it passes the option to
expand_ref() and dies if it does not find exactly one ref matching the
name passed. Further, this has always been the case ever since these
options were introduced by the commits merged in a460ea4a3cb1 (Merge
branch 'nd/shallow-deepen', 2016-10-10). Fix the documentation to
match the implementation.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
list-objects-filter-options: work around reported leak on error
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:
Direct leak of 27 byte(s) in 1 object(s) allocated from:
#0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
#1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
#2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
#3 0x5555558b7550 in strbuf_add strbuf.c:311:2
#4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
#5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
#6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
#7 0x555555884e20 in setup_revisions revision.c:3014:11
#8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
#9 0x5555555ec5e3 in run_builtin git.c:483:11
#10 0x5555555eb1e4 in handle_builtin git.c:749:13
#11 0x5555555ec001 in run_argv git.c:819:4
#12 0x5555555eaf94 in cmd_main git.c:954:19
#13 0x5555556fd569 in main common-main.c:64:11
#14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
#15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
#16 0x5555555ad064 in _start (git+0x59064)
This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.
In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.
It's not pretty, but it manages to make t6112 leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/merge: release output buffer after performing merge
The `obuf` member of `struct merge_options` is used to buffer output in
some cases. In order to not discard its allocated memory we only release
its contents in `merge_finalize()` when we're not currently recursing
into a subtree.
This results in some situations where we seemingly do not release the
buffer reliably. We thus have calls to `strbuf_release()` for this
buffer scattered across the codebase. But we're missing one callsite in
git-merge(1), which causes a memory leak.
We should ideally refactor this interface so that callers don't have to
know about any such internals. But for now, paper over the issue by
adding one more `strbuf_release()` call.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir: fix leak when parsing "status.showUntrackedFiles"
We use `repo_config_get_string()` to read "status.showUntrackedFiles"
from the config subsystem. This function allocates the result, but we
never free the result after parsing it.
The value never leaves the scope of the calling function, so refactor it
to instead use `repo_config_get_string_tmp()`, which does not hand over
ownership to the caller.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/helper: stop re-initialization of `the_repository`
While "common-main.c" already initializes `the_repository` for us, we do
so a second time in the "read-cache" test helper. This causes a memory
leak because the old repository's contents isn't released.
Stop calling `initialize_repository()` to plug this leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we free the `fsmonitor_dirty` member of `struct index_state`, we
do not free the contents of that EWAH. Do so by using `ewah_free()`
instead of `FREE_AND_NULL()`.
This leak is exposed by t7519, but plugging it alone does not make the
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several cases where we invalidate untracked cache directory
entries where we do not free the underlying data, but reset the number
of entries. This causes us to leak memory because `free_untracked()`
will not iterate over any potential entries which we still had in the
array.
Fix this issue by freeing old entries. The leak is exposed by t7519, but
plugging it alone does not make the whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `cnt` variable tracks the number of lines in a patch diff. It can
happen though that there are no newlines, in which case we'd still end
up allocating our array of `sline`s. In fact, we always allocate it with
`cnt + 2` entries: one extra entry for the deletion hunk at the end, and
another entry that we don't seem to ever populate at all but acts as a
kind of sentinel value.
When we loop through the array to clear it at the end of this function
we only loop until `lno < cnt`, and thus we may not end up releasing
whatever the two extra `sline`s contain. While that shouldn't matter for
the sentinel value, it does matter for the extra deletion hunk sline.
Regardless of that, plug this memory leak by releasing both extra
entries, which makes the logic a bit easier to reason about.
While at it, fix the formatting of a local comment, which incidentally
also provides the necessary context for why we overallocate the `sline`
array.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cleanup string set by the config is leaking when it is being
overridden by an option. Fix this by tracking these via two separate
variables such that we can free the old value.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
trailer: fix leaking strbufs when formatting trailers
When formatting trailer lines we iterate through each of the trailers
and munge their respective token/value pairs according to the trailer
options. When formatting a trailer that has its `item->token` pointer
set we perform the munging in two local buffers. In the case where we
figure out that the value is empty and `trim_empty` is set we just skip
over the trailer item. But the buffers are local to the loop and we
don't release their contents, leading to a memory leak.
Plug this leak by lifting the buffers outside of the loop and releasing
them on function return. This fixes the memory leaks, but also optimizes
the loop as we don't have to reallocate the buffers on every single
iteration.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-lib: fix leaking diffopts in `do_diff_cache()`
In `do_diff_cache()` we initialize a new `rev_info` and then overwrite
its `diffopt` with a user-provided set of options. This can leak memory
because `repo_init_revisions()` may end up allocating memory for the
`diffopt` itself depending on the configuration. And since that field is
overwritten we won't ever free it.
Plug the memory leak by releasing the diffopts before we overwrite them.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `grep_splice_or()` we search for the next `TRUE` node in our tree of
grep expressions and replace it with the given new expression. But we
don't free the old node, which causes a memory leak. Plug it.
This leak is exposed by t7810, but plugging it alone isn't sufficient to
make the test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Mon, 4 Nov 2024 19:02:43 +0000 (19:02 +0000)]
upload-pack: fix ambiguous error message
upload-pack.c takes any --shallow-exclude argument(s) from
clone/fetch/etc. and passes them through expand_ref(). If it does not
get back exactly one ref from the call to expand_ref(), it will die with
the following error:
fatal: git upload-pack: ambiguous deepen-not: %s
Given that the documentation suggests to users that --shallow-exclude
accepts a revision rather than a ref (which will be corrected in a
subsequent commit), users may try to pass a revision. In such a case,
expand_ref() will return 0 matches, but the error message we print will
be misleading since "ambiguous" suggests there are multiple matches.
Provide a clearer error message for such a case.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Fri, 1 Nov 2024 20:11:47 +0000 (13:11 -0700)]
t5300: move --window clamp test next to unclamped
A subsequent commit will change the behavior of "git index-pack
--promisor", which is exercised in "build pack index for an existing
pack", causing the unclamped and clamped versions of the --window
test to exhibit different behavior. Move the clamp test closer to the
unclamped test that it references.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Fri, 1 Nov 2024 20:11:46 +0000 (13:11 -0700)]
t0410: use from-scratch server
A subsequent commit will add functionality: when fetching from a
promisor remote, existing non-promisor objects that are ancestors of any
fetched object will be repacked into promisor packs (since if a promisor
remote has an object, it also has all its ancestors).
This means that sometimes, a fetch from a promisor remote results in 2
new promisor packs (instead of the 1 that you would expect). There is a
test that fetches a descendant of a local object from a promisor remote,
but also specifically tests that there is exactly 1 promisor pack as
a result of the fetch. This means that this test will fail when the
subsequent commit is added.
Since the ancestry of the fetched object is not the concern of this
test, make the fetched objects have no ancestry in common with the
objets in the client repo. This is done by making the server from
scratch, instead of using an existing repo that has objects in common
with the client.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Fri, 1 Nov 2024 20:11:45 +0000 (13:11 -0700)]
t0410: make test description clearer
Commit 9a4c507886 (t0410: test fetching from many promisor remotes,
2019-06-25) adds some tests that demonstrate not the automatic fetching
of missing objects, but the direct fetching from another promisor remote
(configured explicitly in one test and implicitly via --filter on the
"git fetch" CLI invocation in the other test) - thus demonstrating
support for multiple promisor remotes, as described in the commit
message.
Change the test descriptions accordingly to make this clearer.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Fri, 1 Nov 2024 16:53:31 +0000 (12:53 -0400)]
Merge branch 'jk/dumb-http-finalize'
The dumb-http code regressed when the result of re-indexing a pack
yielded an *.idx file that differs in content from the *.idx file it
downloaded from the remote. This has been corrected by no longer
relying on the *.idx file we got from the remote.
* jk/dumb-http-finalize:
packfile: use oidread() instead of hashcpy() to fill object_id
packfile: use object_id in find_pack_entry_one()
packfile: convert find_sha1_pack() to use object_id
http-walker: use object_id instead of bare hash
packfile: warn people away from parse_packed_git()
packfile: drop sha1_pack_index_name()
packfile: drop sha1_pack_name()
packfile: drop has_pack_index()
dumb-http: store downloaded pack idx as tempfile
t5550: count fetches in "previously-fetched .idx" test
midx: avoid duplicate packed_git entries
Taylor Blau [Fri, 1 Nov 2024 16:53:26 +0000 (12:53 -0400)]
Merge branch 'ua/atoi'
Replace various calls to atoi() with strtol_i() and strtoul_ui(), and
add improved error handling.
* ua/atoi:
imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing
merge: replace atoi() with strtol_i() for marker size validation
daemon: replace atoi() with strtoul_ui() and strtol_i()
Taylor Blau [Fri, 1 Nov 2024 16:53:19 +0000 (12:53 -0400)]
Merge branch 'rj/cygwin-exit'
Treat ECONNABORTED the same as ECONNRESET in 'git credential-cache' to
work around a possible Cygwin regression. This resolves a race condition
caused by changes in Cygwin's handling of socket closures, allowing the
client to exit cleanly when encountering ECONNABORTED.
* rj/cygwin-exit:
credential-cache: treat ECONNABORTED like ECONNRESET
Taylor Blau [Fri, 1 Nov 2024 16:53:16 +0000 (12:53 -0400)]
Merge branch 'ps/platform-compat-fixes'
Various platform compatibility fixes split out of the larger effort
to use Meson as the primary build tool.
* ps/platform-compat-fixes:
t6006: fix prereq handling with `test_format ()`
http: fix build error on FreeBSD
builtin/credential-cache: fix missing parameter for stub function
t7300: work around platform-specific behaviour with long paths on MinGW
t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin
t3404: work around platform-specific behaviour on macOS 10.15
t1401: make invocation of tar(1) work with Win32-provided one
t/lib-gpg: fix setup of GNUPGHOME in MinGW
t/lib-gitweb: test against the build version of gitweb
t/test-lib: wire up NO_ICONV prerequisite
t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE
will produce output without any left-right markers, since the bitmap
traversal returns only a single set of reachable commits. Instead we
should refuse to use bitmaps here and produce the correct output using a
traditional traversal.
This is probably not the only remaining option that misbehaves with
bitmaps, but it's particularly egregious in that it feels like it
_could_ work. Doing two separate traversals for the left/right sides and
then taking the symmetric set differences should yield the correct
answer, but our traversal code doesn't know how to do that.
It's not clear if naively doing two separate traversals would always be
a performance win. A traditional traversal only needs to walk down to
the merge base, but bitmaps always fill out the full reachability set.
So depending on your bitmap coverage, we could end up walking old bits
of history twice to fill out the same uninteresting bits on both sides.
We'd also of course end up with a very large --boundary set, if the user
asked for that.
So this might or might not be something worth implementing later. But
for now, let's make sure we don't produce the wrong answer if somebody
tries it.
The test covers this, but also the same thing with "--count" (which is
what I originally tried in a real-world case). Ironically the
try_bitmap_count() code already realizes that "--left-right" won't work
there. But that just causes us to fall back to the regular bitmap
traversal code, which itself doesn't handle counting (we produce a list
of objects rather than a count).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
brian m. carlson [Thu, 31 Oct 2024 23:49:34 +0000 (23:49 +0000)]
Add additional CI jobs to avoid accidental breakage
In general, we'd like to make sure Git works on the LTS versions of
major Linux distributions. To do that, let's add CI jobs for the oldest
regular (non-extended) LTS versions of the major distributions: Ubuntu
20.04, Debian 11, and RHEL 8. Because RHEL isn't available to the
public at no charge, use AlmaLinux, which is binary compatible with it.
Note that Debian does not offer the language-pack packages, but suitable
locale support can be installed with the locales-all package.
Otherwise, use the set of installation instructions which exist and are
most similar to the existing supported distros.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
The planet keeps revolving, and CI definitions (even old ones) need to
be kept up to date, even if they worked unchanged before (because now
they don't).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
t7300: work around platform-specific behaviour with long paths on MinGW
Windows by default has a restriction in place to only allow paths up to
260 characters. This restriction can nowadays be lifted by setting a
registry key, but is still active by default.
In t7300 we have one test that exercises the behaviour of git-clean(1)
with such long paths. Interestingly enough, this test fails on my system
that uses Windows 10 with mingw-w64 installed via MSYS2: instead of
observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent
across multiple different environments I have tried.
I cannot say why exactly we observe a different error here, but I would
not be surprised if this was either dependent on the Windows version,
the version of MinGW, the current working directory of Git or any kind
of combination of these.
Work around the issue by handling both errors.
[Backported from 106834e34a2 (t7300: work around platform-specific
behaviour with long paths on MinGW, 2024-10-09).]
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Junio C Hamano [Sun, 12 May 2024 06:25:04 +0000 (23:25 -0700)]
compat/regex: fix argument order to calloc(3)
Windows compiler suddenly started complaining that calloc(3) takes
its arguments in <nmemb, size> order. Indeed, there are many calls
that has their arguments in a _wrong_ order.
mingw: drop bogus (and unneeded) declaration of `_pgmptr`
In 08809c09aa13 (mingw: add a helper function to attach GDB to the
current process, 2020-02-13), I added a declaration that was not needed.
Back then, that did not matter, but now that the declaration of that
symbol was changed in mingw-w64's headers, it causes the following
compile error:
CC compat/mingw.o
compat/mingw.c: In function 'open_in_gdb':
compat/mingw.c:35:9: error: function declaration isn't a prototype [-Werror=strict-prototypes]
35 | extern char *_pgmptr;
| ^~~~~~
In file included from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/mm_malloc.h:27,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/xmmintrin.h:34,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/immintrin.h:31,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/x86intrin.h:32,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winnt.h:1658,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/minwindef.h:163,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windef.h:9,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windows.h:69,
from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winsock2.h:23,
from compat/../git-compat-util.h:215,
from compat/mingw.c:1:
compat/mingw.c:35:22: error: '__p__pgmptr' redeclared without dllimport attribute: previous dllimport ignored [-Werror=attributes]
35 | extern char *_pgmptr;
| ^~~~~~~
Let's just drop the declaration and get rid of this compile error.
[Backported from 3c295c87c25 (mingw: drop bogus (and unneeded)
declaration of `_pgmptr`, 2024-06-19).]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Junio C Hamano [Mon, 9 Sep 2024 23:00:20 +0000 (16:00 -0700)]
ci: remove 'Upload failed tests' directories' step from linux32 jobs
Linux32 jobs seem to be getting:
Error: This request has been automatically failed because it uses a
deprecated version of `actions/upload-artifact: v1`. Learn more:
https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/
before doing anything useful. For now, disable the step.
Ever since actions/upload-artifact@v1 got disabled, mentioning the
offending version of it seems to stop anything from happening. At
least this should run the same build and test.
In df383b5842 (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we
have introduced a new NO_ICONV prerequisite that makes us skip tests in
case Git is not compiled with support for iconv. This change subtly
broke t6006: while the test suite still passes, some of its tests won't
execute because they run into an error.
./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found
The broken tests use `test_format ()`, and the mentioned commit simply
prepended the new prerequisite to its arguments. But that does not work,
as the function is not aware of prereqs at all and will now treat all of
its arguments incorrectly.
Fix this by making the function aware of prereqs by accepting an
optional fourth argument. Adapt the callsites accordingly.
Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
On Windows, we emulate open(3p) via `mingw_open()`. This function
implements handling of some platform-specific quirks that are required
to make it behave as closely as possible like open(3p) would, but for
most cases we just call the Windows-specific `_wopen()` function.
This function has a major downside though: it does not allow us to
specify the sharing mode. While there is `_wsopen()` that allows us to
pass sharing flags, those sharing flags are not the same `FILE_SHARE_*`
flags as `CreateFileW()` accepts. Instead, `_wsopen()` only allows
concurrent read- and write-access, but does not allow for concurrent
deletions. Unfortunately though, we have to allow concurrent deletions
if we want to have POSIX-style atomic renames on top of an existing file
that has open file handles.
Implement a new function that emulates open(3p) for existing files via
`CreateFileW()` such that we can set the required sharing flags.
While we have the same issue when calling open(3p) with `O_CREAT`,
implementing that mode would be more complex due to the required
permission handling. Furthermore, atomic updates via renames typically
write to exclusive lockfile and then perform the rename, and thus we
don't have to handle the case where the locked path has been created
with `O_CREATE`. So while it would be nice to have proper POSIX
semantics in all paths, we instead aim for a minimum viable fix here.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
compat/mingw: share file handles created via `CreateFileW()`
Unless told otherwise, Windows will keep other processes from reading,
writing and deleting files when one has an open handle that was created
via `CreateFileW()`. This behaviour can be altered via `FILE_SHARE_*`
flags:
- `FILE_SHARE_READ` allows a concurrent process to open the file for
reading.
- `FILE_SHARE_WRITE` allows a concurrent process to open the file for
writing.
- `FILE_SHARE_DELETE` allows a concurrent process to delete the file
or to replace it via an atomic rename.
This sharing mechanism is quite important in the context of Git, as we
assume POSIX semantics all over the place. But there are two callsites
where we don't pass all three of these flags:
- We don't set `FILE_SHARE_DELETE` when creating a file for appending
via `mingw_open_append()`. This makes it impossible to delete the
file from another process or to replace it via an atomic rename. The
function was introduced via d641097589 (mingw: enable atomic
O_APPEND, 2018-08-13) and has been using `FILE_SHARE_READ |
FILE_SHARE_WRITE` since the inception. There aren't any indicators
that the omission of `FILE_SHARE_DELETE` was intentional.
- We don't set any sharing flags in `mingw_utime()`, which changes the
access and modification of a file. This makes it impossible to
perform any kind of operation on this file at all from another
process. While we only open the file for a short amount of time to
update its timestamps, this still opens us up for a race condition
with another process.
`mingw_utime()` was originally implemented via `_wopen()`, which
doesn't give you full control over the sharing mode. Instead, it
calls `_wsopen()` with `_SH_DENYNO`, which ultimately translates to
`FILE_SHARE_READ | FILE_SHARE_WRITE`. It was then refactored via 090a3085bc (t/helper/test-chmtime: update mingw to support chmtime
on directories, 2022-03-02) to use `CreateFileW()`, but we stopped
setting any sharing flags at all, which seems like an unintentional
side effect. By restoring `FILE_SHARE_READ | FILE_SHARE_WRITE` we
thus fix this and get back the old behaviour of `_wopen()`.
The fact that we didn't set the equivalent of `FILE_SHARE_DELETE`
can be explained, as well: neither `_wopen()` nor `_wsopen()` allow
you to do so. So overall, it doesn't seem intentional that we didn't
allow deletions here, either.
Adapt both of these callsites to pass all three sharing flags.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:08:10 +0000 (03:08 -0400)]
packfile: use oidread() instead of hashcpy() to fill object_id
When chasing a REF_DELTA, we need to pull the raw hash bytes out of the
mmap'd packfile into an object_id struct. We do that with a raw
hashcpy() of the appropriate length (that happens directly now, though
before the previous commit it happened inside find_pack_entry_one(),
also using a hashcpy).
But I think this creates a potentially dangerous situation due to d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole
hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in
the latter part of the object_id.hash buffer, which could fool oideq(),
etc.
We should use oidread() instead, which correctly zero-pads the extra
bytes, as of c98d762ed9 (global: ensure that object IDs are always
padded, 2024-06-14).
As far as I can see, this has not been a problem in practice because the
object_id we feed to find_pack_entry_one() is never used with oideq(),
etc. It is being compared to the bytes mmap'd from a pack idx file,
which of course do not have the extra padding bytes themselves. So
there's no bug here, but this just puzzled me while looking at the code.
We should do the more obviously safe thing, both for future-proofing and
to avoid confusing readers.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:06:06 +0000 (03:06 -0400)]
packfile: use object_id in find_pack_entry_one()
The main function we use to search a pack index for an object is
find_pack_entry_one(). That function still takes a bare pointer to the
hash, despite the fact that its underlying bsearch_pack() function needs
an object_id struct. And so we end up making an extra copy of the hash
into the struct just to do a lookup.
As it turns out, all callers but one already have such an object_id. So
we can just take a pointer to that struct and use it directly. This
avoids the extra copy and provides a more type-safe interface.
The one exception is get_delta_base() in packfile.c, when we are chasing
a REF_DELTA from inside the pack (and thus we have a pointer directly to
the mmap'd pack memory, not a struct). We can just bump the hashcpy()
from inside find_pack_entry_one() to this one caller that needs it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:05:03 +0000 (03:05 -0400)]
packfile: convert find_sha1_pack() to use object_id
The find_sha1_pack() function has a few problems:
- it's badly named, since it works with any object hash
- it takes the hash as a bare pointer rather than an object_id struct
We can fix both of these easily, as all callers actually have a real
object_id anyway.
I also found the existence of this function somewhat confusing, as it is
about looking in an arbitrary set of linked packed_git structs. It's
good for things like dumb-http which are looking in downloaded remote
packs, and not our local packs. But despite the name, it is not a good
way to find the pack which contains a local object (it skips the use of
the midx, the pack mru list, and so on).
So let's also add an explanatory comment above the declaration that may
point people in the right direction.
I suspect the calls in fast-import.c, which use the packed_git list from
the repository struct, could actually just be using find_pack_entry().
But since we'd need to keep it anyway for dumb-http, I didn't dig
further there. If we eventually drop dumb-http support, then it might be
worth examining them to see if we can get rid of the function entirely.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:03:42 +0000 (03:03 -0400)]
http-walker: use object_id instead of bare hash
We long ago switched most code to using object_id structs instead of
bare "unsigned char *" hashes. This gives us more type safety from the
compiler, and generally makes it easier to understand what we expect in
each parameter.
But the dumb-http code has lagged behind. And indeed, the whole "walker"
subsystem interface has the same problem, though http-walker is the only
user left.
So let's update the walker interface to pass object_id structs (which we
already have anyway at all call sites!), and likewise use those within
the http-walker methods that it calls.
This cleans up the dumb-http code a bit, but will also let us fix a few
more commonly used helper functions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:02:22 +0000 (03:02 -0400)]
packfile: warn people away from parse_packed_git()
With a name like parse_packed_git(), you might think it's the right way
to access a local pack index and its associated objects. But not so!
It's a one-off used by the dumb-http code to access pack idx files we've
downloaded from the remote, but whose packs we might not have.
There's only one caller left for this function, and ideally we'd drop it
completely and just inline it there. But that would require exposing
other internals from packfile.[ch], like alloc_packed_git() and
check_packed_git_idx().
So let's leave it be for now, and just warn people that it's probably
not what they're looking for. Perhaps in the long run if we eventually
drop dumb-http support, we can remove the function entirely then.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:01:25 +0000 (03:01 -0400)]
packfile: drop sha1_pack_index_name()
Like sha1_pack_name() that we dropped in the previous commit, this
function uses an error-prone static strbuf and the somewhat misleading
name "sha1".
The only caller left is in pack-redundant.c. While this command is
marked for potential removal in our BreakingChanges document, we still
have it for now. But it's simple enough to convert it to use its own
strbuf with the underlying odb_pack_name() function, letting us drop the
otherwise obsolete function.
Note that odb_pack_name() does its own strbuf_reset(), so it's safe to
use directly within a loop like this.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:00:55 +0000 (03:00 -0400)]
packfile: drop sha1_pack_name()
The sha1_pack_name() function has a few ugly bits:
- it writes into a static strbuf (and not even a ring buffer of them),
which can lead to subtle invalidation problems
- it uses the term "sha1", but it's really using the_hash_algo, which
could be sha256
There's only one caller of it left. And in fact that caller is better
off using the underlying odb_pack_name() function itself, since it's
just copying the result into its own strbuf anyway.
Converting that caller lets us get rid of this now-obselete function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 07:00:09 +0000 (03:00 -0400)]
packfile: drop has_pack_index()
The has_pack_index() function has several oddities that may make it
surprising if you are trying to find out if we have a pack with some
$hash:
- it is not looking for a valid pack that we found while searching
object directories. It just looks for any pack-$hash.idx file in the
pack directory.
- it only looks in the local directory, not any alternates
- it takes a bare "unsigned char" hash, which we try to avoid these
days
The only caller it has is in the dumb http code; it wants to know if we
already have the pack idx in question. This can happen if we downloaded
the pack (and generated its index) during a previous fetch.
Before the previous patch ("dumb-http: store downloaded pack idx as
tempfile"), it could also happen if we downloaded the .idx from the
remote but didn't get the matching .pack. But since that patch, we don't
hold on to those .idx files. So there's no need to look for the .idx
file in the filesystem; we can just scan through the packed_git list to
see if we have it.
That lets us simplify the dumb http code a bit, as we know that if we
have the .idx we have the matching .pack already. And it lets us get rid
of this odd function that is unlikely to be needed again.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 06:58:06 +0000 (02:58 -0400)]
dumb-http: store downloaded pack idx as tempfile
This patch fixes a regression in b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26) where fetching a v1 pack idx file
over the dumb-http protocol would cause the fetch to fail.
The core of the issue is that dumb-http stores the idx we fetch from the
remote at the same path that will eventually hold the idx we generate
from "index-pack --stdin". The sequence is something like this:
0. We realize we need some object X, which we don't have locally, and
nor does the other side have it as a loose object.
1. We download the list of remote packs from objects/info/packs.
2. For each entry in that file, we download each pack index and store
it locally in .git/objects/pack/pack-$hash.idx (the $hash is not
something we can verify yet and is given to us by the remote).
3. We check each pack index we got to see if it has object X. When we
find a match, we download the matching .pack file from the remote
to a tempfile. We feed that to "index-pack --stdin", which
reindexes the pack, rather than trusting that it has what the other
side claims it does. In most cases, this will end up generating the
exact same (byte-for-byte) pack index which we'll store at the same
pack-$hash.idx path, because the index generation and $hash id are
computed based on what's in the packfile. But:
a. The other side might have used other options to generate the
index. For instance we use index v2 by default, but long ago
it was v1 (and you can still ask for v1 explicitly).
b. The other side might even use a different mechanism to
determine $hash. E.g., long ago it was based on the sorted
list of objects in the packfile, but we switched to using the
pack checksum in 1190a1acf8 (pack-objects: name pack files
after trailer hash, 2013-12-05).
The regression we saw in the real world was (3a). A recent client
fetching from a server with a v1 index downloaded that index, then
complained about trying to overwrite it with its own v2 index. This
collision is otherwise harmless; we know we want to replace the remote
version with our local one, but the collision check doesn't realize
that.
There are a few options to fix it:
- we could teach index-pack a command-line option to ignore only pack
idx collisions, and use it when the dumb-http code invokes
index-pack. This would be an awkward thing to expose users to and
would involve a lot of boilerplate to get the option down to the
collision code.
- we could delete the remote .idx file right before running
index-pack. It should be redundant at that point (since we've just
downloaded the matching pack). But it feels risky to delete
something from our own .git/objects based on what the other side has
said. I'm not entirely positive that a malicious server couldn't lie
about which pack-$hash.idx it has and get us to delete something
precious.
- we can stop co-mingling the downloaded idx files in our local
objects directory. This is a slightly bigger change but I think
fixes the root of the problem more directly.
This patch implements the third option. The big design questions are:
where do we store the downloaded files, and how do we manage their
lifetimes?
There are some additional quirks to the dumb-http system we should
consider. Remember that in step 2 we downloaded every pack index, but in
step 3 we may only download some of the matching packs. What happens to
those other idx files now? They sit in the .git/objects/pack directory,
possibly waiting to be used at a later date. That may save bandwidth for
a subsequent fetch, but it also creates a lot of weird corner cases:
- our local object directory now has semi-untrusted .idx files sitting
around, without their matching .pack
- in case 3b, we noted that we might not generate the same hash as the
other side. In that case even if we download the matching pack,
our index-pack invocation will store it in a different
pack-$hash.idx file. And the unmatched .idx will sit there forever.
- if the server repacks, it may delete the old packs. Now we have
these orphaned .idx files sitting around locally that will never be
used (nor deleted).
- if we repack locally we may delete our local version of the server's
pack index and not realize we have it. So we'll download it again,
even though we have all of the objects it mentions.
I think the right solution here is probably some more complex cache
management system: download the remote .idx files to their own storage
directory, mark them as "seen" when we get their matching pack (to avoid
re-downloading even if we repack), and then delete them when the
server's objects/info/refs no longer mentions them.
But since the dumb http protocol is so ancient and so inferior to the
smart http protocol, I don't think it's worth spending a lot of time
creating such a system. For this patch I'm just downloading the idx
files to .git/objects/tmp_pack_*, and marking them as tempfiles to be
deleted when we exit (and due to the name, any we miss due to a crash,
etc, should eventually be removed by "git gc" runs based on timestamps).
That is slightly worse for one case: if we download an idx but not the
matching pack, we won't retain that idx for subsequent runs. But the
flip side is that we're making other cases better (we never hold on to
useless idx files forever). I suspect that worse case does not even come
up often, since it implies that the packs are generated to match
distinct parts of history (i.e., in practice even in a repo with many
packs you're going to end up grabbing all of those packs to do a clone).
If somebody really cares about that, I think the right path forward is a
managed cache directory as above, and this patch is providing the first
step in that direction anyway (by moving things out of the objects/pack/
directory).
There are two test changes. One demonstrates the broken v1 index case
(it double-checks the resulting clone with fsck to be careful, but prior
to this patch it actually fails at the clone step). The other tweaks the
expectation for a test that covers the "slightly worse" case to
accommodate the extra index download.
The code changes are fairly simple. We stop using finalize_object_file()
to copy the remote's index file into place, and leave it as a tempfile.
We give the tempfile a real ".idx" name, since the packfile code expects
that, and thus we make sure it is out of the usual packs/ directory (so
we'd never mistake it for a real local .idx).
We also have to change parse_pack_index(), which creates a temporary
packed_git to access our index (we need this because all of the pack idx
code assumes we have that struct). It reads the index data from the
tempfile, but prior to this patch would speculatively write the
finalized name into the packed_git struct using the pack-$hash we expect
to use.
I was mildly surprised that this worked at all, since we call
verify_pack_index() on the packed_git which mentions the final name
before moving the file into place! But it works because
parse_pack_index() leaves the mmap-ed data in the struct, so the
lazy-open in verify_pack_index() never triggers, and we read from the
tempfile, ignoring the filename in the struct completely. Hacky, but it
works.
After this patch, parse_pack_index() now uses the index filename we pass
in to derive a matching .pack name. This is OK to change because there
are only two callers, both in the dumb http code (and the other passes
in an existing pack-$hash.idx name, so the derived name is going to be
pack-$hash.pack, which is what we were using anyway).
I'll follow up with some more cleanups in that area, but this patch is
sufficient to fix the regression.
Reported-by: fox <fox.gbr@townlong-yak.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 06:44:17 +0000 (02:44 -0400)]
t5550: count fetches in "previously-fetched .idx" test
We have a test in t5550 that looks at index fetching over dumb http. It
creates two branches, each of which is completely stored in its own
pack, then fetches the branches independently. What should (and does)
happen is that the first fetch grabs both .idx files and one .pack file,
and then the fetch of the second branch re-uses the previously
downloaded .idx files (fetching none) and grabs the now-required .pack
file.
Since the next few patches will be touching this area of the code, let's
beef up the test a little by checking that we're downloading the
expected items at each step.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Jeff King [Fri, 25 Oct 2024 06:43:40 +0000 (02:43 -0400)]
midx: avoid duplicate packed_git entries
When we scan a pack directory to load the idx entries we find into the
packed_git list, we skip any of them that are contained in a midx. We
then load them later lazily if we actually need to access the
corresponding pack, referencing them both from the midx struct and the
packed_git list.
The lazy-load in the midx code checks to see if the midx already
mentions the pack, but doesn't otherwise check the packed_git list. This
makes sense, since we should have added any pack to both lists.
But there's a loophole! If we call close_object_store(), that frees the
midx entirely, but _not_ the packed_git structs, which we must keep
around for Reasons[1]. If we then try to look up more objects, we'll
auto-load the midx again, which won't realize that we've already loaded
those packs, and will create duplicate entries in the packed_git list.
This is possibly inefficient, because it means we may open and map the
pack redundantly. But it can also lead to weird user-visible behavior.
The case I found is in "git repack", which closes and reopens the midx
after repacking and then calls update_server_info(). We end up writing
the duplicate entries into objects/info/packs.
We could obviously de-dup them while writing that file, but it seems
like a violation of more core assumptions that we end up with these
duplicate structs at all.
We can avoid the duplicates reasonably efficiently by checking their
names in the pack_map hash. This annoyingly does require a little more
than a straight hash lookup due to the naming conventions, but it should
only happen when we are about to actually open a pack. I don't think one
extra malloc will be noticeable there.
[1] I'm not entirely sure of all the details, except that we generally
assume the packed_git structs never go away. We noted this
restriction in the comment added by 6f1e9394e2 (object: fix leaking
packfiles when closing object store, 2024-08-08), but it's somewhat
vague. At any rate, if you try freeing the structs in
close_object_store(), you can observe segfaults all over the test
suite. So it might be fixable, but it's not trivial.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Taylor Blau [Fri, 25 Oct 2024 18:02:36 +0000 (14:02 -0400)]
Merge branch 'jc/a-commands-without-the-repo'
Commands that can also work outside Git have learned to take the
repository instance "repo" when we know we are in a repository, and
NULL when we are not, in a parameter. The uses of the_repository
variable in a few of them have been removed using the new calling
convention.
* jc/a-commands-without-the-repo:
archive: remove the_repository global variable
annotate: remove usage of the_repository global
git: pass in repo to builtin based on setup_git_directory_gently
Taylor Blau [Fri, 25 Oct 2024 18:01:21 +0000 (14:01 -0400)]
Merge branch 'ps/ci-gitlab-windows'
Enable Windows-based CI in GitLab.
* ps/ci-gitlab-windows:
gitlab-ci: exercise Git on Windows
gitlab-ci: introduce stages and dependencies
ci: handle Windows-based CI jobs in GitLab CI
ci: create script to set up Git for Windows SDK
t7300: work around platform-specific behaviour with long paths on MinGW
Usman Akinyemi [Thu, 24 Oct 2024 00:24:58 +0000 (00:24 +0000)]
imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing
Replace unsafe uses of atoi() with strtol_i() to improve error handling
when parsing UIDVALIDITY, UIDNEXT, and APPENDUID in IMAP commands.
Invalid values, such as those with letters, now trigger error messages and
prevent malformed status responses.
I did not add any test for this commit as we do not have any test
for git-imap-send(1) at this point.
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Usman Akinyemi [Thu, 24 Oct 2024 00:24:57 +0000 (00:24 +0000)]
merge: replace atoi() with strtol_i() for marker size validation
Replace atoi() with strtol_i() for parsing conflict-marker-size to
improve error handling. Invalid values, such as those containing letters
now trigger a clear error message.
Update the test to verify invalid input handling.
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Usman Akinyemi [Thu, 24 Oct 2024 00:24:56 +0000 (00:24 +0000)]
daemon: replace atoi() with strtoul_ui() and strtol_i()
Replace atoi() with strtoul_ui() for --timeout and --init-timeout
(non-negative integers) and with strtol_i() for --max-connections
(signed integers). This improves error handling and input validation
by detecting invalid values and providing clear error messages.
Update tests to ensure these arguments are properly validated.
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
Karthik Nayak [Thu, 24 Oct 2024 10:53:57 +0000 (12:53 +0200)]
CodingGuidelines: discourage arbitrary suffixes in function names
We often name functions with arbitrary suffixes like `_1` as an
extension of another existing function. This creates confusion and
doesn't provide good clarity into the functions purpose. Let's document
good function naming etiquette in our CodingGuidelines.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
`git mv a/a.txt a b/` is a nonsense instruction. Instead of failing
gracefully the command trips over itself,[1] leaving behind unfinished
work:
1. first it moves `a/a.txt` to `b/a.txt`; then
2. tries to move `a/`, including `a/a.txt`; then
3. figures out that it’s in a bad state (assertion); and finally
4. aborts.
Now you’re left with a partially-updated index.
The command should instead fail gracefully and make no changes to the
index until it knows that it can complete a sensible action.
For now just add a failing test since this has been known about for
a while.[2]
† 1: Caused by a `pos >= 0` assertion
[2]: https://lore.kernel.org/git/d1f739fe-b28e-451f-9e01-3d2e24a0fe0d@app.fastmail.com/
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>