Taylor Blau [Tue, 3 Oct 2023 00:44:32 +0000 (20:44 -0400)]
builtin/repack.c: implement support for `--max-cruft-size`
Cruft packs are an alternative mechanism for storing a collection of
unreachable objects whose mtimes are recent enough to avoid being
pruned out of the repository.
When cruft packs were first introduced back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and a7d493833f (builtin/pack-objects.c: --cruft with expiration,
2022-05-20), the recommended workflow consisted of:
- Repacking periodically, either by packing anything loose in the
repository (via `git repack -d`) or producing a geometric sequence
of packs (via `git repack --geometric=<d> -d`).
- Every so often, splitting the repository into two packs, one cruft
to store the unreachable objects, and another non-cruft pack to
store the reachable objects.
Repositories may (out of band with the above) choose periodically to
prune out some unreachable objects which have aged out of the grace
period by generating a pack with `--cruft-expiration=<approxidate>`.
This allowed repositories to maintain relatively few packs on average,
and quarantine unreachable objects together in a cruft pack, avoiding
the pitfalls of holding unreachable objects as loose while they age out
(for more, see some of the details in 3d89a8c118
(Documentation/technical: add cruft-packs.txt, 2022-05-20)).
This all works, but can be costly from an I/O-perspective when
frequently repacking a repository that has many unreachable objects.
This problem is exacerbated when those unreachable objects are rarely
(if every) pruned.
Since there is at most one cruft pack in the above scheme, each time we
update the cruft pack it must be rewritten from scratch. Because much of
the pack is reused, this is a relatively inexpensive operation from a
CPU-perspective, but is very costly in terms of I/O since we end up
rewriting basically the same pack (plus any new unreachable objects that
have entered the repository since the last time a cruft pack was
generated).
At the time, we decided against implementing more robust support for
multiple cruft packs. This patch implements that support which we were
lacking.
Introduce a new option `--max-cruft-size` which allows repositories to
accumulate cruft packs up to a given size, after which point a new
generation of cruft packs can accumulate until it reaches the maximum
size, and so on. To generate a new cruft pack, the process works like
so:
- Sort a list of any existing cruft packs in ascending order of pack
size.
- Starting from the beginning of the list, group cruft packs together
while the accumulated size is smaller than the maximum specified
pack size.
- Combine the objects in these cruft packs together into a new cruft
pack, along with any other unreachable objects which have since
entered the repository.
Once a cruft pack grows beyond the size specified via `--max-cruft-size`
the pack is effectively frozen. This limits the I/O churn up to a
quadratic function of the value specified by the `--max-cruft-size`
option, instead of behaving quadratically in the number of total
unreachable objects.
When pruning unreachable objects, we bypass the new code paths which
combine small cruft packs together, and instead start from scratch,
passing in the appropriate `--max-pack-size` down to `pack-objects`,
putting it in charge of keeping the resulting set of cruft packs sized
correctly.
This may seem like further I/O churn, but in practice it isn't so bad.
We could prune old cruft packs for whom all or most objects are removed,
and then generate a new cruft pack with just the remaining set of
objects. But this additional complexity buys us relatively little,
because most objects end up being pruned anyway, so the I/O churn is
well contained.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Tue, 3 Oct 2023 00:44:29 +0000 (20:44 -0400)]
builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
The repack builtin takes a `--max-pack-size` command-line argument which
it uses to feed into any of the pack-objects children that it may spawn
when generating a new pack.
This option is parsed with OPT_STRING, meaning that we'll accept
anything as input, punting on more fine-grained validation until we get
down into pack-objects.
This is fine, but it's wasteful to spend an entire sub-process just to
figure out that one of its option is bogus. Instead, parse the value of
`--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the
known-good result down to pack-objects.
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 2 Oct 2023 18:20:00 +0000 (11:20 -0700)]
Merge branch 'jc/unresolve-removal'
"checkout --merge -- path" and "update-index --unresolve path" did
not resurrect conflicted state that was resolved to remove path,
but now they do.
* jc/unresolve-removal:
checkout: allow "checkout -m path" to unmerge removed paths
checkout/restore: add basic tests for --merge
checkout/restore: refuse unmerging paths unless checking out of the index
update-index: remove stale fallback code for "--unresolve"
update-index: use unmerge_index_entry() to support removal
resolve-undo: allow resurrecting conflicted state that resolved to deletion
update-index: do not read HEAD and MERGE_HEAD unconditionally
Junio C Hamano [Fri, 29 Sep 2023 16:04:15 +0000 (09:04 -0700)]
Merge branch 'jc/alias-completion'
The command line completion script (in contrib/) can be told to
complete aliases by including ": git <cmd> ;" in the alias to tell
it that the alias should be completed similar to how "git <cmd>" is
completed. The parsing code for the alias as been loosened to
allow ';' without an extra space before it.
* jc/alias-completion:
completion: loosen and document the requirement around completing alias
Junio C Hamano [Fri, 29 Sep 2023 16:04:14 +0000 (09:04 -0700)]
Merge branch 'jk/fsmonitor-unused-parameter'
Unused parameters in fsmonitor related code paths have been marked
as such.
* jk/fsmonitor-unused-parameter:
run-command: mark unused parameters in start_bg_wait callbacks
fsmonitor: mark unused hashmap callback parameters
fsmonitor/darwin: mark unused parameters in system callback
fsmonitor: mark unused parameters in stub functions
fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
fsmonitor: mark some maybe-unused parameters
fsmonitor/win32: drop unused parameters
fsmonitor: prefer repo_git_path() to git_pathdup()
Junio C Hamano [Sat, 23 Sep 2023 00:01:36 +0000 (17:01 -0700)]
Merge branch 'jk/ort-unused-parameter-cleanups'
Code clean-up.
* jk/ort-unused-parameter-cleanups:
merge-ort: lowercase a few error messages
merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
merge-ort: drop unused parameters from detect_and_process_renames()
merge-ort: stop passing "opt" to read_oid_strbuf()
merge-ort: drop custom err() function
Junio C Hamano [Sat, 23 Sep 2023 00:01:36 +0000 (17:01 -0700)]
Merge branch 'la/trailer-cleanups'
Code clean-up.
Keep only the first three clean-ups, and discard the rest to be replaced later.
cf. <owly1qetjqo1.fsf@fine.c.googlers.com>
cf. <owlyzg1dsswr.fsf@fine.c.googlers.com>
* la/trailer-cleanups:
trailer: split process_command_line_args into separate functions
trailer: split process_input_file into separate pieces
trailer: separate public from internal portion of trailer_iterator
Jeff King [Thu, 21 Sep 2023 04:18:25 +0000 (00:18 -0400)]
test-lib: set UBSAN_OPTIONS to match ASan
For a long time we have used ASAN_OPTIONS to set abort_on_error. This is
important because we want to notice detected problems even in programs
which are expected to fail. But we never did the same for UBSAN_OPTIONS.
This means that our UBSan test suite runs might silently miss some
cases.
It also causes a more visible effect, which is that t4058 complains
about unexpected "fixes" (and this is how I noticed the issue):
$ make SANITIZE=undefined CC=gcc && (cd t && ./t4058-*)
...
ok 8 - git read-tree does not segfault # TODO known breakage vanished
ok 9 - reset --hard does not segfault # TODO known breakage vanished
ok 10 - git diff HEAD does not segfault # TODO known breakage vanished
The tests themselves aren't that interesting. We have a known bug where
these programs segfault, and they do when compiled without sanitizers.
With UBSan, when the test runs:
test_might_fail git read-tree --reset base
it gets:
cache-tree.c:935:9: runtime error: member access within misaligned address 0x5a5a5a5a5a5a5a5a for type 'struct cache_entry', which requires 8 byte alignment
So that's garbage memory which would _usually_ cause us to segfault, but
UBSan catches it and complains first about the alignment. That makes
sense, but the weird thing is that UBSan then exits instead of aborting,
so our test_might_fail call considers that an acceptable outcome and the
test "passes".
Curiously, this historically seems to have aborted, because I've run
"make test" with UBSan many times (and so did our CI) and we never saw
the problem. Even more curiously, I see an abort if I use clang with
ASan and UBSan together, like:
# this aborts!
make SANITIZE=undefined,address CC=clang
But not with just UBSan, and not with both when used with gcc:
# none of these do
make SANITIZE=undefined CC=gcc
make SANITIZE=undefined CC=clang
make SANITIZE=undefined,address CC=gcc
Likewise moving to older versions of gcc (I tried gcc-11 and gcc-12 on
my Debian system) doesn't abort. Nor does moving around in Git's
history. Neither this test nor the relevant code have been touched in a
while, and going back to v2.41.0 produces the same outcome (even though
many UBSan CI runs have passed in the meantime).
So _something_ changed on my system (and likely will soon on other
people's, since this is stock Debian unstable), but I didn't track
it further. I don't know why it ever aborted in the past, but we
definitely should be explicit here and tell UBSan what we want to
happen.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 20 Sep 2023 18:28:22 +0000 (11:28 -0700)]
completion: loosen and document the requirement around completing alias
Recently we started to tell users to spell ": git foo ;" with
space(s) around 'foo' for an alias to be completed similarly
to the 'git foo' command. It however is easy to also allow users to
spell it in a more natural way with the semicolon attached to 'foo',
i.e. ": git foo;". Also, add a comment to note that 'git' is optional
and writing ": foo;" would complete the alias just fine.
Junio C Hamano [Wed, 20 Sep 2023 17:45:12 +0000 (10:45 -0700)]
Merge branch 'jc/update-index-show-index-version'
"git update-index" learns "--show-index-version" to inspect
the index format version used by the on-disk index file.
* jc/update-index-show-index-version:
test-tool: retire "index-version"
update-index: add --show-index-version
update-index doc: v4 is OK with JGit and libgit2
Junio C Hamano [Wed, 20 Sep 2023 17:44:57 +0000 (10:44 -0700)]
Merge branch 'js/diff-cached-fsmonitor-fix'
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.
* js/diff-cached-fsmonitor-fix:
diff-lib: fix check_removed when fsmonitor is on
Han Young [Wed, 20 Sep 2023 13:27:31 +0000 (21:27 +0800)]
show doc: redirect user to git log manual instead of git diff-tree
While git show accepts options that apply to the git diff-tree command,
some options do not make sense in the context of git show.
The options of git show are handled using the machinery of git log.
The git log manual page is a better place to look into than git diff-tree
for options that are not in the git show manual page.
Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, `range-diff` shows the default notes if no notes-related
arguments are given. This is also how `log` behaves. But unlike
`range-diff`, `log` does *not* show the default notes if
`--notes=<custom>` are given. In other words, this:
git log --notes=custom
is equivalent to this:
git log --no-notes --notes=custom
While:
git range-diff --notes=custom
acts like this:
git log --notes --notes-custom
This can’t be how the user expects `range-diff` to behave given that the
man page for `range-diff` under `--[no-]notes[=<ref>]` says:
> This flag is passed to the `git log` program (see git-log(1)) that
> generates the patches.
This behavior also affects `format-patch` since it uses `range-diff` for
the cover letter. Unlike `log`, though, `format-patch` is not supposed
to show the default notes if no notes-related arguments are given.[1]
But this promise is broken when the range-diff happens to have something
to say about the changes to the default notes, since that will be shown
in the cover letter.
Remedy this by introducing `--show-notes-by-default` that `range-diff` can
use to tell the `log` subprocess what to do.
§ Authors
• Fix by Johannes
• Tests by Kristoffer
† 1: See e.g. 66b2ed09c2 (Fix "log" family not to be too agressive about
showing notes, 2010-01-20).
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:33:43 +0000 (18:33 -0400)]
run-command: mark unused parameters in start_bg_wait callbacks
The start_bg_command() function takes a callback to tell when the
background-ed process is "ready". The callback receives the
child_process struct as well as an extra void pointer. But curiously,
neither of the two users of this interface look at either parameter!
This makes some sense. The only non-test user of the API is fsmonitor,
which uses fsmonitor_ipc__get_state() to connect to a single global
fsmonitor daemon (i.e., the one we just started!).
So we could just drop these parameters entirely. But it seems like a
pretty reasonable interface for the "wait" callback to have access to
the details of the spawned process, and to have room for passing extra
data through a void pointer. So let's leave these in place but mark the
unused ones so that -Wunused-parameter does not complain.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:33:14 +0000 (18:33 -0400)]
fsmonitor: mark unused hashmap callback parameters
Like many hashmap comparison functions, our cookies_cmp() does not look
at its extra void data parameter. This should have been annotated in 02c3c59e62 (hashmap: mark unused callback parameters, 2022-08-19), but
this new case was added around the same time (plus fsmonitor is not
built at all on Linux, so it is easy to miss there).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:32:56 +0000 (18:32 -0400)]
fsmonitor/darwin: mark unused parameters in system callback
We pass fsevent_callback() to the system FSEventStreamCreate() function
as a callback. So we must match the expected function signature, even
though we don't care about all of the parameters. Mark the unused ones
to satisfy -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:32:38 +0000 (18:32 -0400)]
fsmonitor: mark unused parameters in stub functions
The fsmonitor code has some platform-specific functions for which one or
more platforms implement noop or stub functions. We can't get rid of
these functions nor change their interface, since the point is to match
their equivalents in other platforms. But let's annotate their
parameters to quiet the compiler's -Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:32:05 +0000 (18:32 -0400)]
fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
We never look at the "ipc" argument we receive. It was added in 8f44976882 (fsmonitor: avoid socket location check if using hook,
2022-10-04) to support the darwin fsmonitor code. The win32 code has to
match the same interface, but we should use an annotation to silence
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:31:48 +0000 (18:31 -0400)]
fsmonitor: mark some maybe-unused parameters
There's a bit of conditionally-compiled code in fsmonitor, so some
function parameters may be unused depending on the build options:
- in fsmonitor--daemon.c's try_to_run_foreground_daemon(), we take a
detach_console argument, but it's only used on Windows. This seems
intentional (and not mistakenly missing other platforms) based on
the discussion in c284e27ba7 (fsmonitor--daemon: implement 'start'
command, 2022-03-25), which introduced it.
- in fsmonitor-setting.c's check_for_incompatible(), we pass the "ipc"
flag down to the system-specific fsm_os__incompatible() helper. But
we can only do so if our platform has such a helper.
In both cases we can mark the argument as MAYBE_UNUSED. That annotates
it enough to suppress the compiler's -Wunused-parameter warning, but
without making it impossible to use the variable, as a regular UNUSED
annotation would.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:30:01 +0000 (18:30 -0400)]
fsmonitor/win32: drop unused parameters
A few helper functions (centered around file-watch events) take extra
fsmonitor state parameters that they don't use. These are static helpers
local to the win32 implementation, and don't need to conform to any
particular interface. We can just drop the extra parameters, which
simplifies the code and silences -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Sep 2023 22:29:40 +0000 (18:29 -0400)]
fsmonitor: prefer repo_git_path() to git_pathdup()
The fsmonitor_ipc__get_path() function ignores its repository argument.
It should use it when constructing repo paths (though in practice, it is
unlikely anything but the_repository is ever passed, so this is cleanup
and future proofing, not a bug fix).
Note that despite the lack of "dup" in the name, repo_git_path() behaves
like git_pathdup() and returns an allocated string.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 18 Sep 2023 20:53:13 +0000 (13:53 -0700)]
Merge branch 'js/complete-checkout-t'
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.
* js/complete-checkout-t:
completion(switch/checkout): treat --track and -t the same
Taylor Blau [Mon, 18 Sep 2023 16:35:53 +0000 (12:35 -0400)]
git-send-email.perl: avoid printing undef when validating addresses
When validating email addresses with `extract_valid_address_or_die()`,
we print out a helpful error message when the given input does not
contain a valid email address.
However, the pre-image of this patch looks something like:
my $address = shift;
$address = extract_valid_address($address):
die sprintf(__("..."), $address) if !$address;
which fails when given a bogus email address by trying to use $address
(which is undef) in a sprintf() expansion, like so:
$ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
Use of uninitialized value $address in sprintf at /home/ttaylorr/src/git/git-send-email line 1175.
error: unable to extract a valid address from:
This regression dates back to e431225569 (git-send-email: remove invalid
addresses earlier, 2012-11-22), but became more noticeable in a8022c5f7b
(send-email: expose header information to git-send-email's
sendemail-validate hook, 2023-04-19), which validates SMTP headers in
the sendemail-validate hook.
Avoid trying to format an undef by storing the given and cleaned address
separately. After applying this fix, the error contains the invalid
email address, and the warning disappears:
$ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
error: unable to extract a valid address from: pi <pi@pi>
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark Levedahl [Sun, 17 Sep 2023 19:24:31 +0000 (15:24 -0400)]
git-gui - use git-hook, honor core.hooksPath
git-gui currently runs some hooks directly using its own code written
before 2010, long predating git v2.9 that added the core.hooksPath
configuration to override the assumed location at $GIT_DIR/hooks. Thus,
git-gui looks for and runs hooks including prepare-commit-msg,
commit-msg, pre-commit, post-commit, and post-checkout from
$GIT_DIR/hooks, regardless of configuration. Commands (e.g., git-merge)
that git-gui invokes directly do honor core.hooksPath, meaning the
overall behaviour is inconsistent.
Furthermore, since v2.36 git exposes its hook execution machinery via
`git-hook run`, eliminating the need for others to maintain code
duplicating that functionality. Using git-hook will both fix git-gui's
current issues on hook configuration and (presumably) reduce the
maintenance burden going forward. So, teach git-gui to use git-hook.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --stat: add config option to limit filename width
Add new configuration option diff.statNameWidth=<width> that is equivalent
to the command-line option --stat-name-width=<width>, but it is ignored
by format-patch. This follows the logic established by the already
existing configuration option diff.statGraphWidth=<width>.
Limiting the widths of names and graphs in the --stat output makes sense
for interactive work on wide terminals with many columns, hence the support
for these configuration options. They don't affect format-patch because
it already adheres to the traditional 80-column standard.
Update the documentation and add more tests to cover new configuration
option diff.statNameWidth=<width>. While there, perform a few minor code
and whitespace cleanups here and there, as spotted.
Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark Levedahl [Sat, 16 Sep 2023 21:01:31 +0000 (17:01 -0400)]
git-gui - re-enable use of hook scripts
Earlier, commit aae9560a introduced search in $PATH to find executables
before running them, avoiding an issue where on Windows a same named
file in the current directory can be executed in preference to anything
in a directory in $PATH. This search is intended to find an absolute
path for a bare executable ( e.g, a function "foo") by finding the first
instance of "foo" in a directory given in $PATH, and this search works
correctly. The search is explicitly avoided for an executable named
with an absolute path (e.g., /bin/sh), and that works as well.
Unfortunately, the search is also applied to commands named with a
relative path. A hook script (or executable) $HOOK is usually located
relative to the project directory as .git/hooks/$HOOK. The search for
this will generally fail as that relative path will (probably) not exist
on any directory in $PATH. This means that git hooks in general now fail
to run. Considerable mayhem could occur should a directory on $PATH be
git controlled. If such a directory includes .git/hooks/$HOOK, that
repository's $HOOK will be substituted for the one in the current
project, with unknown consequences.
This lookup failure also occurs in worktrees linked to a remote .git
directory using git-new-workdir. However, a worktree using a .git file
pointing to a separate git directory apparently avoids this: in that
case the hook command is resolved to an absolute path before being
passed down to the code introduced in aae9560a.
Fix this by replacing the test for an "absolute" pathname to a check for
a command name having more than one pathname component. This limits the
search and absolute pathname resolution to bare commands. The new test
uses tcl's "file split" command. Experiments on Linux and Windows, using
tclsh, show that command names with relative and absolute paths always
give at least two components, while a bare command gives only one.
Jeff King [Sat, 16 Sep 2023 22:11:46 +0000 (18:11 -0400)]
merge-ort: lowercase a few error messages
As noted in CodingGuidelines, error messages should not be capitalized.
Fix up a few of these that were copied verbatim from merge-recursive to
match our modern style.
We'll likewise fix up the matching ones from merge-recursive. We care a
bit less there, since the hope is that it will eventually go away. But
besides being the right thing to do in the meantime, it is necessary for
t6406 to pass both with and without GIT_TEST_MERGE_ALGORITHM set (one of
our CI jobs sets it to "recursive", which will use the merge-recursive.c
code). An alternative would be to use "grep -i" in the test to check
the message, but it's nice for the test suite to be be more exact (we'd
notice if the capitalization fix regressed).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `--type=<type>` was added as a prefered alias for `--<type>` by fb0dc3bac1 (builtin/config.c: support `--type=<type>` as preferred
alias for `--<type>`), the explanation for the path type was
reworded. Whereas the previous explanation said "expand a leading
`~`" this was changed to "adding a leading `~`". Change "adding" to
"expanding" to correctly explain the canonicalization.
Signed-off-by: Evan Gates <evan.gates@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 15 Sep 2023 11:34:43 +0000 (07:34 -0400)]
http: update curl http/2 info matching for curl 8.3.0
To redact header lines in http/2 curl traces, we have to parse past some
prefix bytes that curl sticks in the info lines it passes to us. That
changed once already, and we adapted in db30130165 (http: handle both
"h2" and "h2h3" in curl info lines, 2023-06-17).
Now it has changed again, in curl's fbacb14c4 (http2: cleanup trace
messages, 2023-08-04), which was released in curl 8.3.0. Running a build
of git linked against that version will fail to redact the trace (and as
before, t5559 notices and complains).
The format here is a little more complicated than the other ones, as it
now includes a "stream id". This is not constant but is always numeric,
so we can easily parse past it.
We'll continue to match the old versions, of course, since we want to
work with many different versions of curl. We can't even select one
format at compile time, because the behavior depends on the runtime
version of curl we use, not the version we build against.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 15 Sep 2023 11:33:16 +0000 (07:33 -0400)]
http: factor out matching of curl http/2 trace lines
We have to parse out curl's http/2 trace lines so we can redact their
headers. We already match two different types of lines from various
vintages of curl. In preparation for adding another (which will be
slightly more complex), let's pull the matching into its own function,
rather than doing it in the middle of a conditional.
While we're doing so, let's expand the comment a bit to describe the two
matches. That probably should have been part of db30130165 (http: handle
both "h2" and "h2h3" in curl info lines, 2023-06-17), but will become
even more important as we add new types.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 14 Sep 2023 09:40:04 +0000 (05:40 -0400)]
merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
The merge_options parameter has never been used since the function was
introduced in 64aceb6d73 (merge-ort: add code to check for whether
cached renames can be reused, 2021-05-20). In theory some merge options
might impact our decisions here, but that has never been the case so
far.
Let's drop it to appease -Wunused-parameter; it would be easy to add
back later if we need to (there is only one caller).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 14 Sep 2023 09:39:58 +0000 (05:39 -0400)]
merge-ort: drop unused parameters from detect_and_process_renames()
This function takes three trees representing the merge base and both
sides of the merge, but never looks at any of them. This is due to f78cf97617 (merge-ort: call diffcore_rename() directly, 2021-02-14).
Prior to that commit, we passed pairs of trees to diff_tree_oid(). But
after that commit, we collect a custom diff_queue for each pair in the
merge_options struct, and just run diffcore_rename() on the result. So
the function does not need to know about the original trees at all
anymore.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 14 Sep 2023 09:39:53 +0000 (05:39 -0400)]
merge-ort: stop passing "opt" to read_oid_strbuf()
This function doesn't look at its merge_options parameter. It used to
pass it down to err(), but that function no longer exists (and didn't
look at "opt" anyway). We can drop it here.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 14 Sep 2023 09:39:48 +0000 (05:39 -0400)]
merge-ort: drop custom err() function
The merge-ort code has an err() function, but it's really just error()
in disguise. It differs in two ways:
1. It takes a "struct merge_options" argument. But the function
completely ignores it! We can simply remove it.
2. It formats the error string into a strbuf, prepending "error: ",
and then feeds the result into error(). But this is wrong! The
error() function already adds the prefix, so we end up with:
error: error: Failed to execute internal merge
So let's just drop this function entirely and call error() directly, as
the functions are otherwise identical (note that they both always return
-1).
Presumably nobody noticed the bogus messages because they are quite hard
to trigger (they are mostly internal errors reading and writing
objects). However, one easy trigger is a custom merge driver which dies
by signal; we have a test already here, but we were not checking the
contents of stderr.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 14 Sep 2023 18:17:00 +0000 (11:17 -0700)]
Merge branch 'pw/rebase-i-after-failure'
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.
* pw/rebase-i-after-failure:
rebase -i: fix adding failed command to the todo list
rebase --continue: refuse to commit after failed command
rebase: fix rewritten list for failed pick
sequencer: factor out part of pick_commits()
sequencer: use rebase_path_message()
rebase -i: remove patch file after conflict resolution
rebase -i: move unlink() calls
Junio C Hamano [Thu, 14 Sep 2023 18:16:59 +0000 (11:16 -0700)]
Merge branch 'ak/pretty-decorate-more'
"git log --format" has been taught the %(decorate) placeholder.
* ak/pretty-decorate-more:
decorate: use commit color for HEAD arrow
pretty: add pointer and tag options to %(decorate)
pretty: add %(decorate[:<options>]) format
decorate: color each token separately
decorate: avoid some unnecessary color overhead
decorate: refactor format_decorations()
pretty-formats: enclose options in angle brackets
pretty-formats: define "literal formatting code"
Junio C Hamano [Thu, 14 Sep 2023 18:16:59 +0000 (11:16 -0700)]
Merge branch 'ks/ref-filter-sort-numerically'
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.
* ks/ref-filter-sort-numerically:
ref-filter: sort numerically when ":size" is used
Taylor Blau [Wed, 13 Sep 2023 19:18:03 +0000 (15:18 -0400)]
builtin/repack.c: extract common cruft pack loop
When generating the list of packs to store in a MIDX (when given the
`--write-midx` option), we include any cruft packs both during
--geometric and non-geometric repacks.
But the rules for when we do and don't have to check whether any of
those cruft packs were queued for deletion differ slightly between the
two cases.
But the two can be unified, provided there is a little bit of extra
detail added in the comment to clarify when it is safe to avoid checking
for any pending deletions (and why it is OK to do so even when not
required).
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `->util` field corresponding to each string_list_item is used to
track the existence of some pack at the beginning of a repack operation
was originally intended to be used as a bitfield.
This bitfield tracked:
- (1 << 0): whether or not the pack should be deleted
- (1 << 1): whether or not the pack is cruft
The previous commit removed the use of the second bit, but a future
patch (from a different series than this one) will introduce a new use
of it.
So we could stop treating the util pointer as a bitfield and instead
start treating it as if it were a boolean. But this would require some
backtracking when that later patch is applied.
Instead, let's avoid touching the ->util field directly, and instead
introduce convenience functions like:
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 13 Sep 2023 19:17:57 +0000 (15:17 -0400)]
builtin/repack.c: store existing cruft packs separately
When repacking with the `--write-midx` option, we invoke the function
`midx_included_packs()` in order to produce the list of packs we want to
include in the resulting MIDX.
This list is comprised of:
- existing .keep packs
- any pack(s) which were written earlier in the same process
- any unchanged packs when doing a `--geometric` repack
- any cruft packs
Prior to this patch, we stored pre-existing cruft and non-cruft packs
together (provided those packs are non-kept). This meant we needed an
additional bit to indicate which non-kept pack(s) were cruft versus
those that aren't.
But alternatively we can store cruft packs in a separate list, avoiding
the need for this extra bit, and simplifying the code below.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- at least one pre-existing packfile (which is not marked as kept),
- repacking with the `-d` flag, and
- not doing a cruft repack
, then we pass a handful of additional options to the inner
`pack-objects` process, like `--unpack-unreachable`,
`--keep-unreachable`, and `--pack-loose-unreachable`, in addition to
marking any packs we just wrote for promisor remotes as kept in-core
(with `--keep-pack`, as opposed to the presence of a ".keep" file on
disk).
Because we store both cruft and non-cruft packs together in the same
`existing.non_kept_packs` list, it suffices to check its `nr` member to
see if it is zero or not.
But a following change will store cruft- and non-cruft packs separately,
meaning this check would break as a result. Prepare for this by
extracting this part of the check into a new helper function called
`has_existing_non_kept_packs()`.
This patch does not introduce any functional changes, but prepares us to
make a more isolated change in a subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 13 Sep 2023 19:17:51 +0000 (15:17 -0400)]
builtin/repack.c: extract redundant pack cleanup for existing packs
To remove redundant packs at the end of a repacking operation, Git uses
its `remove_redundant_pack()` function in a loop over the set of
pre-existing, non-kept packs.
In a later commit, we will split this list into two, one for
pre-existing cruft pack(s), and another for non-cruft pack(s). Prepare
for this by factoring out the routine to loop over and delete redundant
packs into its own function.
Instead of calling `remove_redundant_pack()` directly, we now will call
`remove_redundant_existing_packs()`, which itself dispatches a call to
`remove_redundant_packs_1()`. Note that the geometric repacking code
will still call `remove_redundant_pack()` directly, but see the previous
commit for more details.
Having `remove_redundant_packs_1()` exist as a separate function may
seem like overkill in this patch. However, a later patch will call
`remove_redundant_packs_1()` once over two separate lists, so this
refactoring sets us up for that.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 13 Sep 2023 19:17:49 +0000 (15:17 -0400)]
builtin/repack.c: extract redundant pack cleanup for --geometric
To reduce the complexity of the already quite-long `cmd_repack()`
implementation, extract out the parts responsible for deleting redundant
packs from a geometric repack out into its own sub-routine.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 13 Sep 2023 19:17:46 +0000 (15:17 -0400)]
builtin/repack.c: extract marking packs for deletion
At the end of a repack (when given `-d`), Git attempts to remove any
packs which have been made "redundant" as a result of the repacking
operation. For example, an all-into-one (`-A` or `-a`) repack makes
every pre-existing pack which is not marked as kept redundant. Geometric
repacks (with `--geometric=<n>`) make any packs which were rolled up
redundant, and so on.
But before deleting the set of packs we think are redundant, we first
check to see whether or not we just wrote a pack which is identical to
any one of the packs we were going to delete. When this is the case, Git
must avoid deleting that pack, since it matches a pack we just wrote
(so deleting it may cause the repository to become corrupt).
Right now we only process the list of non-kept packs in a single pass.
But a future change will split the existing non-kept packs further into
two lists: one for cruft packs, and another for non-cruft packs.
Factor out this routine to prepare for calling it twice on two separate
lists in a future patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 13 Sep 2023 19:17:35 +0000 (15:17 -0400)]
builtin/repack.c: extract structure to store existing packs
The repack machinery needs to keep track of which packfiles were present
in the repository at the beginning of a repack, segmented by whether or
not each pack is marked as kept.
The names of these packs are stored in two `string_list`s, corresponding
to kept- and non-kept packs, respectively. As a consequence, many
functions within the repack code need to take both `string_list`s as
arguments, leading to code like this:
Wrap up this pair of `string_list`s into a single structure that stores
both. This saves us from having to pass both string lists separately,
and prepares for adding additional fields to this structure.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 13 Sep 2023 17:07:56 +0000 (10:07 -0700)]
Merge branch 'jk/unused-post-2.42-part2'
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.
* jk/unused-post-2.42-part2:
parse-options: mark unused parameters in noop callback
interpret-trailers: mark unused "unset" parameters in option callbacks
parse-options: add more BUG_ON() annotations
merge: do not pass unused opt->value parameter
parse-options: mark unused "opt" parameter in callbacks
parse-options: prefer opt->value to globals in callbacks
checkout-index: delay automatic setting of to_tempfile
format-patch: use OPT_STRING_LIST for to/cc options
merge: simplify parsing of "-n" option
merge: make xopts a strvec
Philippe Blain [Tue, 12 Sep 2023 17:02:15 +0000 (17:02 +0000)]
completion: improve doc for complex aliases
The completion code can be told to use a particular completion for
aliases that shell out by using ': git <cmd> ;' as the first command of
the alias. This only works if <cmd> and the semicolon are separated by a
space, since if the space is missing __git_aliased_command returns (for
example) 'checkout;' instead of just 'checkout', and then
__git_complete_command fails to find a completion for 'checkout;'.
The examples have that space but it's not clear if it's just for
style or if it's mandatory. Explicitly mention it.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 12 Sep 2023 17:30:27 +0000 (17:30 +0000)]
completion: commit: complete trailers tokens more robustly
In the previous commit, we added support for completing configured
trailer tokens in 'git commit --trailer'.
Make the implementation more robust by:
- using '__git' instead of plain 'git', as the rest of the completion
script does
- using a stricter pattern for --get-regexp to avoid false hits
- using 'cut' and 'rev' instead of 'awk' to account for tokens including
dots.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sequencer: remove unreachable exit condition in pick_commits()
This was introduced by 56dc3ab04 ("sequencer (rebase -i): implement the
'edit' command", 2017-01-02), and was pointless from the get-go: all
early exits from the loop above are returns, so todo_list->current ==
todo_list->nr is an invariant after the loop.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3404-rebase-interactive.sh: fix typos in title of a rewording test
This test was introduced by commit 0c164ae7a ("rebase -i: add another
reword test", 2021-08-20). I didn't quite get what it was meant to do,
so here's an explanation from Phillip:
The purpose of the test is to ensure that
(i) There are no uncommitted changes when the editor runs. i.e., we
commit without running the editor and then reword by amending
that commit. This ensures that we have the same user experience
whether or not the commit was fast-forwarded [1].
(ii) That the todo list is re-read after the commit has been reworded.
This is to allow the user to update the todo list while the rebase
is paused for editing the commit message.
Junio C Hamano [Tue, 12 Sep 2023 19:32:35 +0000 (12:32 -0700)]
test-tool: retire "index-version"
As "git update-index --show-index-version" can do the same thing,
the 'index-version' subcommand in the test-tool lost its reason to
exist. Remove it and replace its use with the end-user facing
'git update-index --show-index-version'.
Helped-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 12 Sep 2023 19:32:34 +0000 (12:32 -0700)]
update-index: add --show-index-version
"git update-index --index-version N" is used to set the index format
version to a specific version, but there was no way to query the
current version used in the on-disk index file.
Teach the command a new "--show-index-version" option, and also
teach the "--index-version N" option to report what the version was
when run with the "--verbose" option.
Helped-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 12 Sep 2023 19:32:33 +0000 (12:32 -0700)]
update-index doc: v4 is OK with JGit and libgit2
Being invented in late 2012 no longer makes the index v4 format
"relatively young".
The support for the index version 4 was added to libgit2 with their 5625d86b (index: support index v4, 2016-05-17) and to JGit with
their e9cb0a8e (DirCache: support index V4, 2020-08-10).
Let's update the paragraph that discouraged its use for folks overly
cautious about cross-tool compatibility.
Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Helped-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josip Sokcevic [Mon, 11 Sep 2023 17:09:02 +0000 (10:09 -0700)]
diff-lib: fix check_removed when fsmonitor is on
`git diff-index` may return incorrect deleted entries when fsmonitor
is used in a repository with git submodules. This can be observed on
Mac machines, but it can affect all other supported platforms too.
If fsmonitor is used, `stat *st` is not initialized if cache_entry has
CE_FSMONITOR_VALID set. But, there are three call sites that rely on stat
afterwards, which can result in incorrect results.
This change partially reverts commit 4f3d6d02 (fsmonitor: skip lstat
deletion check during git diff-index, 2021-03-17).
Signed-off-by: Josip Sokcevic <sokcevic@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
maintenance(systemd): support the Windows Subsystem for Linux
When running in the Windows Subsystem for Linux (WSL), it is usually
necessary to use the Git Credential Manager for authentication when
performing the background fetches.
This requires interoperability between the Windows Subsystem for Linux
and the Windows host to work, which uses so-called vsocks, i.e. sockets
intended for communcations between virtual machines and the host they
are running on.
However, when Git is configured to run background maintenance via
`systemd`, the address families available to those maintenance processes
are restricted, and did not include `AF_VSOCK`. This leads to problems
e.g. when a background fetch tries to access github.com:
systemd[437]: Starting Optimize Git repositories data...
git[747387]: WSL (747387) ERROR: UtilBindVsockAnyPort:285: socket failed 97
git[747381]: fatal: could not read Username for 'https://github.com': No such device or address
git[747381]: error: failed to prefetch remotes
git[747381]: error: task 'prefetch' failed
systemd[437]: git-maintenance@hourly.service: Main process exited, code=exited, status=1/FAILURE
systemd[437]: git-maintenance@hourly.service: Failed with result 'exit-code'.
systemd[437]: Failed to start Optimize Git repositories data.
Address this (pun intended) by adding the `AF_VSOCK` address family to
the allow list.
This fixes https://github.com/microsoft/git/issues/604.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 9 Sep 2023 22:12:52 +0000 (00:12 +0200)]
diff --no-index: fix -R with stdin
When -R is given, queue_diff() swaps the mode and name variables of the
two files to produce a reverse diff. 1e3f26542a (diff --no-index:
support reading from named pipes, 2023-07-05) added variables that
indicate whether files are special, i.e named pipes or - for stdin.
These new variables were not swapped, though, which broke the handling
of stdin with with -R. Swap them like the other metadata variables.
Reported-by: Martin Storsjö <martin@martin.st> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
trailer: split process_input_file into separate pieces
Currently, process_input_file does three things:
(1) parse the input string for trailers,
(2) print text before the trailers, and
(3) calculate the position of the input where the trailers end.
Rename this function to parse_trailers(), and make it only do
(1). The caller of this function, process_trailers, becomes responsible
for (2) and (3). These items belong inside process_trailers because they
are both concerned with printing the surrounding text around
trailers (which is already one of the immediate concerns of
process_trailers).
Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
trailer: separate public from internal portion of trailer_iterator
The fields here are not meant to be used by downstream callers, so put
them behind an anonymous struct named as "internal" to warn against
their use. This follows the pattern in 576de3d956 (unpack_trees: start
splitting internal fields from public API, 2023-02-27).
Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion(switch/checkout): treat --track and -t the same
When `git switch --track ` is to be completed, only remote refs are
eligible because that is what the `--track` option targets.
And when the short-hand `-t` is used instead, the same _should_ happen.
Let's make it so.
Note that the bug exists both in the completions of `switch` and
`completion`, even if it manifests in slightly different ways: While
the completion of `git switch -t ` will not even look at remote refs,
the completion of `git checkout -t ` will look at both remote _and_
local refs. Both should look only at remote refs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>