promisor-remote: allow a server to advertise more fields
For now the "promisor-remote" protocol capability can only pass "name"
and "url" information from a server to a client in the form
"name=<remote_name>,url=<remote_url>".
To allow clients to make more informed decisions about which promisor
remotes they accept, let's make it possible to pass more information
by introducing a new "promisor.sendFields" configuration variable.
On the server side, information about a remote `foo` is stored in
configuration variables named `remote.foo.<variable-name>`. To make
it clearer and simpler, we use `field` and `field name` like this:
* `field name` refers to the <variable-name> part of such a
configuration variable, and
* `field` refers to both the `field name` and the value of such a
configuration variable.
The "promisor.sendFields" configuration variable should contain a
comma or space separated list of field names that will be looked up
in the configuration of the remote on the server to find the values
that will be passed to the client.
Only a set of predefined field names are allowed. The only field
names in this set are "partialCloneFilter" and "token". The
"partialCloneFilter" field name specifies the filter definition used
by the promisor remote, and the "token" field name can provide an
authentication credential for accessing it.
For example, if "promisor.sendFields" is set to "partialCloneFilter",
and the server has the "remote.foo.partialCloneFilter" config
variable set to a value, then that value will be passed in the
"partialCloneFilter" field in the form "partialCloneFilter=<value>"
after the "name" and "url" fields.
A following commit will allow the client to use the information to
decide if it accepts the remote or not. For now the client doesn't do
anything with the additional information it receives.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: refactor to get rid of 'struct strvec'
In a following commit, we will use the new 'promisor-remote' protocol
capability introduced by d460267613 (Add 'promisor-remote' capability
to protocol v2, 2025-02-18) to pass and process more information
about promisor remotes than just their name and url.
For that purpose, we will need to store information about other
fields, especially information that might or might not be available
for different promisor remotes. Unfortunately using 'struct strvec',
as we currently do, to store information about the promisor remotes
with one 'struct strvec' for each field like "name" or "url" does not
scale easily in that case. We would need one 'struct strvec' for each
new field, and then we would have to pass all these 'struct strvec'
around.
Let's refactor this and introduce a new 'struct promisor_info'.
It will only store promisor remote information in its members. For now
it has only a 'name' member for the promisor remote name and an 'url'
member for its URL. We will use a 'struct string_list' to store the
instances of 'struct promisor_info'. For each 'item' in the
string_list, 'item->string' will point to the promisor remote name and
'item->util' will point to the corresponding 'struct promisor_info'
instance.
Explicit members are used within 'struct promisor_info' for type
safety and clarity regarding the specific information being handled,
rather than a generic key-value store. We want to specify and document
each field and its content, so adding new members to the struct as
more fields are supported is fine.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adam Dinwoodie [Sat, 15 Feb 2025 21:19:03 +0000 (21:19 +0000)]
git-gui: sync Makefiles with git.git
In git.git, commit 5309c1e9fb39 (Makefile: set default goals in
makefiles, 2025-02-15) touched two Makefiles in the git-git/ directory.
Import these changes, so that the trees can converge again with the
next merge of this repository into git.git.
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Johannes Sixt [Sat, 6 Sep 2025 09:59:09 +0000 (11:59 +0200)]
Merge branch 'js/ask-yesno'
* js/ask-yesno:
git-gui--askyesno (mingw): use Git for Windows' icon, if available
git-gui--askyesno: allow overriding the window title
git gui: set GIT_ASKPASS=git-gui--askpass if not set yet
git-gui: provide question helper for retry fallback on Windows
upload-pack: don't ACK non-commits repeatedly in protocol v2
When a client performs a fetch or clone they can optionally send "have"
lines to tell the server which objects they already have available
locally. These object IDs are stored by the server in an object array so
that it can remember any objects it doesn't have to include in the pack
sent to the client.
While there isn't any reason to do so, clients are free to send the same
"have" line repeatedly. git-upload-pack(1) already knows to handle this
well: every commit it has seen via a "have" line gets marked with the
`THEY_HAVE` flag, and if such a commit is seen repeatedly we know to not
process it another time. This also has the effect that we only store the
object ID once, only, in the `have_obj` array.
There is an edge case though: if the client sends an object ID that does
not refer to a commit we neither store nor check the `THEY_HAVE` flag.
This means that we repeatedly store the same object ID in our `have_obj`
array, with two consequences:
- In protocol v2 we deduplicate ACKs for commits, but not for any
other objects as we send ACKs for every object ID in the `have_obj`
array.
- The `have_obj` array can grow in size indefinitely with both
protocols.
The potentially-more-serious issue is the second one, as we basically
have a way for an adversary to allocate arbitrarily large buffers now.
Ultimately, this doesn't seem to be all that serious though: on my
machine, the growth of that array is at around 4MB/s, and after roughly
five minutes I was only at 1GB RSS. So this is concerning, but only
mildly so.
Fix this bug by storing the `THEY_HAVE` flag independent of the object
type so that we don't store duplicate object IDs in `have_obj` anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The write_midx_internal() method uses gotos to jump to a cleanup section to
clear memory before returning 'result'. Since these jumps are more common
for error conditions, initialize 'result' to -1 and then only set it to 0
before returning with success. There are a couple places where we return
with success via a jump.
This has the added benefit that the method now returns -1 on error instead
of an inconsistent 1 or -1.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the remaining signed comparison warnings in midx-write.c so that
they can be enforced as errors in the future. After the previous change,
the remaining errors are due to iterator variables named 'i'.
The strategy here involves defining the variable within the for loop
syntax to make sure we use the appropriate bitness for the loop
sentinel. This matters in at least one method where the variable was
compared to uint32_t in some loops and size_t in others.
While adjusting these loops, there were some where the loop boundary was
checking against a uint32_t value _plus one_. These were replaced with
non-strict comparisons, but also the value is checked to not be
UINT32_MAX. Since the value is the number of incremental multi-pack-
indexes, this is not a meaningful restriction. The new die() is about
defensive programming more than it being realistically possible.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
midx-write.c has the DISABLE_SIGN_COMPARE_WARNINGS macro defined for a
few reasons, but the biggest one is the use of a signed
preferred_pack_idx member inside the write_midx_context struct. The code
currently uses -1 to indicate an unset preferred pack but pack int ids
are normally handled as uint32_t. There are also a few loops that search
for the preferred pack by name and those iterators will need updates to
uint32_t in the next change.
For now, replace the use of -1 with a 'NO_PREFERRED_PACK' macro and an
equality check. The macro stores the max value of a uint32_t, so we
cannot store a preferred pack that appears last in a list of 2^32 total
packs, but that's expected to be unreasonable already. Furthermore, with
this change we end up extending the range from 2^31 possible packs to
2^32-1.
There are some careful things to worry about with initializing the
preferred pack in the struct and using that value when searching for a
preferred pack that was already incorrect but accidentally working when
the index was initialized to zero.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
midx-write: use cleanup when incremental midx fails
The incremental mode of writing a multi-pack-index has a few extra
conditions that could lead to failure, but these are currently
short-ciruiting with 'return -1' instead of setting the method's
'result' variable and going to the cleanup tag.
Replace these returns with gotos to avoid memory issues when exiting
early due to error conditions.
Unfortunately, these error conditions are difficult to reproduce with
test cases, which is perhaps one reason why the memory loss was not
caught by existing test cases in memory tracking modes.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This instance of setting the result to 1 before going to cleanup was
accidentally removed in fcb2205b77 (midx: implement support for writing
incremental MIDX chains, 2024-08-06). Build upon a test that already deletes
a packfile to verify that this error propagates to full command failure.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fill_packs_from_midx() method was refactored in fcb2205b77 (midx:
implement support for writing incremental MIDX chains, 2024-08-06) to
allow for preferred packfiles and incremental multi-pack-indexes.
However, this led to some conditions that can cause improperly
initialized memory in the context's list of packfiles.
The conditions caring about the preferred pack name or the incremental
flag are currently necessary to load a packfile. But the context is
still being populated with pack_info structs based on the packfile array
for the existing multi-pack-index even if prepare_midx_pack() isn't
called.
Add a new test that breaks under --stress when compiled with
SANITIZE=address. The chosen number of 100 packfiles was selected to get
the --stress output to fail about 50% of the time, while 50 packfiles
could not get a failure in most --stress runs.
The test case is marked as EXPENSIVE not only because of the number of
packfiles it creates, but because some CI environments were reporting
errors during the test that I could not reproduce, specifically around
being unable to open the packfiles or their pack-indexes.
When it fails under SANITIZE=address, it provides the following error:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==3263517==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000027
==3263517==The signal is caused by a READ memory access.
==3263517==Hint: address points to the zero page.
#0 0x562d5d82d1fb in close_pack_windows packfile.c:299
#1 0x562d5d82d3ab in close_pack packfile.c:354
#2 0x562d5d7bfdb4 in write_midx_internal midx-write.c:1490
#3 0x562d5d7c7aec in midx_repack midx-write.c:1795
#4 0x562d5d46fff6 in cmd_multi_pack_index builtin/multi-pack-index.c:305
...
This failure stack trace is disconnected from the real fix because the bad
pointers are accessed later when closing the packfiles from the context.
There are a few different aspects to this fix that are worth noting:
1. We return to the previous behavior of fill_packs_from_midx to not
rely on the incremental flag or existence of a preferred pack.
2. The behavior to scan all layers of an incremental midx is kept, so
this is not a full revert of the change.
3. We skip allocating more room in the pack_info array if the pack
fails prepare_midx_pack().
4. The method has always returned 0 for success and 1 for failure, but
the condition checking for error added a check for a negative result
for failure, so that is now updated.
5. The call to open_pack_index() is removed, but this is needed later
in the case of a preferred pack. That call is moved to immediately
before its result is needed (checking for the object count).
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: pass graphs that are to be merged as parameter
When determining whether or not we want to merge a commit graph chain we
retrieve the graph that is to be merged via the context's repository.
With an upcoming change though it will become a bit more complex to
figure out the commit graph, which would lead to code duplication.
Prepare for this change by passing the graph that is to be merged as a
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: return commit graph from `repo_find_commit_pos_in_graph()`
The function `repo_find_commit_pos_in_graph()` takes a commit as input
and tries to figure out whether the given repository has a commit graph
that contains that specific commit. If so, it returns the corresponding
position of that commit inside the graph.
Right now though we only return the position, but not the actual graph
that the commit has been found in. This is sensible as repositories
always have the graph in `struct repository::objects::commit_graph`.
Consequently, the caller always knows where to find it.
But in a subsequent change we're going to move the graph into the object
sources. This would require callers of the function to loop through all
sources to find the relevant commit graph.
Refactor the code so that we instead return the commit-graph that the
commit has been found with.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: return the prepared commit graph from `prepare_commit_graph()`
When making use of commit graphs, one needs to first prepare them by
calling `prepare_commit_graph()`. Once that function was called and the
commit graph was prepared successfully, the caller is now expected to
access the graph directly via `struct object_database::commit_graph`.
In a subsequent change, we're going to move the commit graph pointer
from `struct object_database` into `struct odb_source`. With this
change, semantics will change so that we use the commit graph of the
first source that has one. Consequently, all callers that currently
deference the `commit_graph` pointer would now have to loop around the
list of sources to find the commit graph.
This would become quite unwieldy. So instead of shifting the burden onto
such callers, adapt `prepare_commit_graph()` to return the prepared
commit graph, if any. Like this, callers are expected to call that
function and then use the returned commit graph.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When filtering down revisions by paths we know to use bloom filters from
the commit graph, if we have any. The entry point for this is in
`check_maybe_different_in_bloom_filter()`, where we first verify that:
- We do have a commit graph.
- That the commit is contained therein by checking that we have a
proper generation number.
- And that the graph contains a bloom filter.
The first check is somewhat redundant though: if we don't have a commit
graph, then the second check would already tell us that we don't have a
generation number for the specific commit.
In theory this could be seen as a performance optimization to
short-circuit for scenarios where there is no commit graph. But in
practice this shouldn't matter: if there is no commit graph, then the
commit graph data slab would also be unpopulated and thus a lookup of
the commit should happen in constant time.
Drop the unnecessary check.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our blaming subsystem knows to use bloom filters from commit graphs to
speed up the whole computation. The setup of this happens in
`setup_blame_bloom_data()`, where we first verify that we even have a
commit graph in the first place. This check is redundant though, as we
call `get_bloom_filter_settings()` immediately afterwards which, which
already knows to return a `NULL` pointer in case we don't have a commit
graph.
Drop the redundant check.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc: fix dangling pointer in alloc_state cleanup
All callers of clear_alloc_state() immediately free what they
cleared, so currently it does not hurt anybody that the
alloc_state is left in an unreusable state, but it is an
error-prone API. Replace it with a new function that clears but
in addition frees the structure, as well as NULLing the pointer
that points at it and adjust existing callers.
As it is a moral equivalent of FREE_AND_NULL(), except that what it
frees has internal structure that needs to be cleaned, allow the
helper to be called twice in a row, by making a call with a pointer
to a pointer variable that already is NULLed.
While at it, rename allocate_alloc_state() and name the new
function alloc_state_free_and_null(), to follow more closely the
function naming convention specified in the CodingGuidelines
(namely, functions about S are named with S_ prefix and then
verb).
Signed-off-by: ノウラ | Flare <nouraellm@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Thu, 4 Sep 2025 17:58:25 +0000 (19:58 +0200)]
object-name: declare pointer type of extend_abbrev_len()'s 2nd parameter
Expose the expected type of the second parameter of extend_abbrev_len()
instead of casting a void pointer internally. Just a single caller
passes in a void pointer, the rest pass the correct type. Let the
compiler help keeping it that way.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flag `--show-object-format` from git-rev-parse is used for
retrieving the object storage format. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.
Add a new field `objects.format` to the git-repo-info subcommand
containing that information.
Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
repo: add the flag -z as an alias for --format=nul
Other Git commands that have nul-terminated output (e.g. git-config,
git-status, git-ls-files) have a flag `-z` for using the null character
as the record separator.
Add the `-z` flag to git-repo-info as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.
Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Fri, 8 Aug 2025 09:59:43 +0000 (11:59 +0200)]
t0450: add allowlist for builtins with missing .adoc
Before we were silently skipping all builtins that don't have a matching
.adoc file. This is overly loose and might skip documentation files
when it shouldn't, for example when there was a typo in the filename.
To ensure no new builtins are added without documentation, add an
allowlist: t0450/adoc-missing. In this file only builtin commands that
do *not* have a corresponding .adoc file shall be listed. If there is a
mismatch, fail the test. This should force future contributions to
either add an .adoc, or add the builtin name to the allowlist file.
Signed-off-by: Toon Claes <toon@iotcl.com>
[jc: squashed Patrick's "missing file fix" in] Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 2 Sep 2025 18:24:52 +0000 (20:24 +0200)]
describe: use oidset in finish_depth_computation()
Depth computation can end early if all remaining commits are flagged.
The current code determines if that's the case by checking all queue
items each time it dequeues a flagged commit. This can cause
quadratic complexity.
We could simply count the flagged items in the queue and then update
that number as we add and remove items. That would provide a general
speedup, but leave one case where we have to scan the whole queue: When
we flag a previously seen, but unflagged commit. It could be on the
queue and then we'd have to decrease our count.
We could dedicate an object flag to track queue membership, but that
would leave less for candidate tags, affecting the results. So use a
hash table, specifically an oidset of commit hashes, to track that.
This avoids quadratic behaviour in all cases and provides a nice
performance boost over the previous commit, 08bb69d70f (describe: use
prio_queue_replace(), 2025-08-03):
Benchmark 1: ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)
Time (mean ± σ): 855.3 ms ± 1.3 ms [User: 790.8 ms, System: 49.9 ms]
Range (min … max): 853.7 ms … 857.8 ms 10 runs
Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0)
Time (mean ± σ): 610.8 ms ± 1.7 ms [User: 546.9 ms, System: 49.3 ms]
Range (min … max): 608.9 ms … 613.3 ms 10 runs
Summary
./git describe $(git rev-list v2.41.0..v2.47.0) ran
1.40 ± 0.00 times faster than ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)
Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meet Soni [Tue, 26 Aug 2025 06:41:10 +0000 (12:11 +0530)]
t: add test for git refs exists subcommand
Add a test script, `t/t1462-refs-exists.sh`, for the `git refs exists`
command.
This script acts as a simple driver, leveraging the shared test library
created in the preceding commit. It works by overriding the
`$git_show_ref_exists` variable to "git refs exists" and then sourcing the
shared library (`t/show-ref-exists-tests.sh`).
This approach ensures that `git refs exists` is tested against the
entire comprehensive test suite of `git show-ref --exists`, verifying
that it acts as a compatible drop-in replacement.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meet Soni [Tue, 26 Aug 2025 06:41:09 +0000 (12:11 +0530)]
t1422: refactor tests to be shareable
In preparation for adding tests for the `git refs exists` command,
refactor the existing t1422 test suite to make its logic shareable.
Move the core test logic from `t1422-show-ref-exists.sh` to
`show-ref-exists-tests.sh` file. Inside this script, replace hardcoded
calls to "git show-ref --exists" with the `$git_show_ref_exists`
variable.
The original `t1422-show-ref-exists.sh` script now becomes a simple
"driver". It is responsible for setting the default value of the
variable and then sourcing the test library.
This structure follows an established pattern for sharing tests and
prepares the test suite for the `refs exists` tests to be added in a
subsequent commit.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meet Soni [Tue, 26 Aug 2025 06:41:08 +0000 (12:11 +0530)]
t1403: split 'show-ref --exists' tests into a separate file
The test file for git-show-ref(1), `t1403-show-ref.sh`, contains a group
of tests for the '--exists' flag. To improve organization and to prepare
for refactoring these tests to be shareable, move the '--exists' tests
and their corresponding setup logic into a self-contained test suite,
`t1422-show-ref-exists.sh`.
This is a pure code-movement refactoring with no change in test coverage
or behavior.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meet Soni [Tue, 26 Aug 2025 06:41:07 +0000 (12:11 +0530)]
builtin/refs: add 'exists' subcommand
As part of the ongoing effort to consolidate reference handling,
introduce a new `exists` subcommand. This command provides the same
functionality and exit-code behavior as `git show-ref --exists`, serving
as its modern replacement.
The logic for `show-ref --exists` is minimal. Rather than creating a
shared helper function which would be overkill for ~20 lines of code,
its implementation is intentionally duplicated here. This contrasts with
`git refs list`, where sharing the larger implementation of
`for-each-ref` was necessary.
Documentation for the new subcommand is also added to the `git-refs(1)`
man page.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 2 Sep 2025 16:38:03 +0000 (09:38 -0700)]
Merge branch 'ps/object-store-midx-dedup-info' into ps/packfile-store
* ps/object-store-midx-dedup-info:
midx: compute paths via their source
midx: stop duplicating info redundant with its owning source
midx: write multi-pack indices via their source
midx: load multi-pack indices via their source
midx: drop redundant `struct repository` parameter
odb: simplify calling `link_alt_odb_entry()`
odb: return newly created in-memory sources
odb: consistently use "dir" to refer to alternate's directory
odb: allow `odb_find_source()` to fail
odb: store locality in object database sources
gitlab-ci: disable realtime monitoring to unbreak Windows jobs
The GitLab CI runners using Windows machines have realtime monitoring
via Windows Defender enabled by default. This has just now started to
cause issues in our CI jobs using Microsoft Visual Studio:
Program 'meson.exe' failed to run: Operation did not complete successfully because the file contains a virus or
potentially unwanted softwareAt line:356 char:1
+ meson setup build --vsenv -Dperl=disabled -Dbackend_max_links=1 -Dcre ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.
At line:356 char:1
+ meson setup build --vsenv -Dperl=disabled -Dbackend_max_links=1 -Dcre ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ResourceUnavailable: (:) [], ApplicationFailedException
+ FullyQualifiedErrorId : NativeCommandFailed
The detected issue is more likely than not completely bogus, but it
breaks the jobs.
Fix the issue by disabling realtime monitoring. Besides unbreaking CI,
it also improves our build times a bit:
- Building Git goes from 26 to 22 minutes.
- Executing tests goes from ~1h for one slice of tests to ~30 minutes.
This is still painfully slow, but the issue here is that the Windows
runners on GitLab CI are quite underwhelming overall.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 2 Sep 2025 15:21:27 +0000 (08:21 -0700)]
Merge branch 'ds/doc-ggg-pr-fork-clarify'
Update the instruction to use of GGG in the MyFirstContribution
document to say that a GitHub PR could be made against `git/git`
instead of `gitgitgadget/git`.
* ds/doc-ggg-pr-fork-clarify:
doc: clarify which remotes can be used with GitGitGadget
Johannes Sixt [Mon, 1 Sep 2025 18:20:08 +0000 (20:20 +0200)]
git-gui: fix error handling of Revert Changes command
The command Revert Changes has two different erroneous behaviors
depending on the Tcl version used.
The command uses a "chord" facility where different "notes" are
evaluated asynchronously and any error is reported after all of them
have finished. The intent is that a private namespace is used where
the notes can store the error state. Tcl 9 changed namespace handling
in a subtle way, as https://www.tcl-lang.org/software/tcltk/9.0.html
summarizes under "Notable incompatibilities":
Unqualified varnames resolved in current namespace, not global.
Note that in almost all cases where this causes a change, the
change is actually the removal of a latent bug.
And that's exactly what happens here.
- Under Tcl 9:
- When the command operates without any errors, the variable `err`
is never set. When the error handler wants to inspect `err` (in
the correct private namespace), it does not find it and a Tcl
error about an unset variable occurs. Incidentally, this is also
the case when the user cancels the operation with the option
"Do Nothing"!
On the other hand, when an error occurs during the operation, `err`
is set and found as intended.
Check for the existence of the variable `err` before the attempt to
read it.
- Under Tcl 8.6:
The error handler looks up `err` in the global namespace, which is
bogus and unintended. The variable is set due to the many
`catch ... err` that occur during startup in the global namespace.
- When the command operates without any errors, the error handler
finds the global `err`, which happens to be the empty string at
this point, and no error is reported.
On the other hand, when an error occurs during the operation, the
global `err` is set and found, so that an error is reported as
desired.
However, the value of `err` persists in the global namespace. When
the command is repeated, an error is reported again, even if there
was actually no error, and even "Do Nothing" was used to cancel
the operation.
Clear the global `err` before the operation begins.
The lingering error message is not a problem under Tcl 9, because a
prestine namespace is established every time the command is used.
This fixes https://github.com/j6t/git-gui/issues/21.
Helped-by: Igor Stepushchik Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Julia Evans [Fri, 29 Aug 2025 11:55:02 +0000 (11:55 +0000)]
doc: rephrase the purpose of the staging area
Git does not really "store the contents of the next commit"
anywhere; rather, you the user use the index to prepare it.
Signed-off-by: Julia Evans <julia@jvns.ca>
[jc; made the change relative to what is already in 'next'] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Paulo Casaretto [Fri, 29 Aug 2025 16:02:54 +0000 (16:02 +0000)]
range-diff: add configurable memory limit for cost matrix
When comparing large commit ranges (e.g., 250,000+ commits), range-diff
attempts to allocate an n×n cost matrix that can exhaust available
memory. For example, with 256,784 commits (n = 513,568), the matrix
would require approximately 256GB of memory (513,568² × 4 bytes),
causing either immediate segmentation faults due to integer overflow or
system hangs.
Add a memory limit check in get_correspondences() before allocating the
cost matrix. This check uses the total size in bytes (n² × sizeof(int))
and compares it against a configurable maximum, preventing both
excessive memory usage and integer overflow issues.
The limit is configurable via a new --max-memory option that accepts
human-readable sizes (e.g., "1G", "500M"). The default is 4GB for 64 bit
systems and 2GB for 32 bit systems. This allows comparing ranges of
approximately 32,000 (16,000) commits - generous for real-world use cases
while preventing impractical operations.
When the limit is exceeded, range-diff now displays a clear error
message showing both the requested memory size and the maximum allowed,
formatted in human-readable units for better user experience.
This approach was chosen over alternatives:
- Pre-counting commits: Would require spawning additional git processes
and reading all commits twice
- Limiting by commit count: Less precise than actual memory usage
- Streaming approach: Would require significant refactoring of the
current algorithm
This issue was previously discussed in:
https://lore.kernel.org/git/RFC-cover-v2-0.5-00000000000-20211210T122901Z-avarab@gmail.com/
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 29 Aug 2025 16:44:37 +0000 (09:44 -0700)]
Merge branch 'jk/describe-blob'
"git describe <blob>" misbehaves and/or crashes in some corner
cases, which has been taught to exit with failure gracefully.
* jk/describe-blob:
describe: pass commit to describe_commit()
describe: handle blob traversal with no commits
describe: catch unborn branch in describe_blob()
describe: error if blob not found
describe: pass oid struct by const pointer
"git fetch" can clobber a symref that is dangling when the
remote-tracking HEAD is set to auto update, which has been
corrected.
* jk/no-clobber-dangling-symref-with-fetch:
refs: do not clobber dangling symrefs
t5510: prefer "git -C" to subshell for followRemoteHEAD tests
t5510: stop changing top-level working directory
t5510: make confusing config cleanup more explicit
Junio C Hamano [Fri, 29 Aug 2025 16:44:36 +0000 (09:44 -0700)]
Merge branch 'ps/reftable-libgit2-cleanup'
Code clean-ups.
* ps/reftable-libgit2-cleanup:
refs/reftable: always reload stacks when creating lock
reftable: don't second-guess errors from flock interface
reftable/stack: handle outdated stacks when compacting
reftable/stack: allow passing flags to `reftable_stack_add()`
reftable/stack: fix compiler warning due to missing braces
reftable/stack: reorder code to avoid forward declarations
reftable/writer: drop Git-specific `QSORT()` macro
reftable/writer: fix type used for number of records
Toon Claes [Tue, 5 Aug 2025 09:33:58 +0000 (11:33 +0200)]
last-modified: use Bloom filters when available
Our 'git last-modified' performs a revision walk, and computes a diff at
each point in the walk to figure out whether a given revision changed
any of the paths it considers interesting.
When changed-path Bloom filters are available, we can avoid computing
many such diffs. Before computing a diff, we first check if any of the
remaining paths of interest were possibly changed at a given commit by
consulting its Bloom filter. If any of them are, we are resigned to
compute the diff.
If none of those queries returned "maybe", we know that the given commit
doesn't contain any changed paths which are interesting to us. So, we
can avoid computing it in this case.
Toon Claes [Tue, 5 Aug 2025 09:33:57 +0000 (11:33 +0200)]
t/perf: add last-modified perf script
This just runs some simple last-modified commands. We already test
correctness in the regular suite, so this is just about finding
performance regressions from one version to another.
Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Tue, 5 Aug 2025 09:33:56 +0000 (11:33 +0200)]
last-modified: new subcommand to show when files were last modified
Similar to git-blame(1), introduce a new subcommand
git-last-modified(1). This command shows the most recent modification to
paths in a tree. It does so by expanding the tree at a given commit,
taking note of the current state of each path, and then walking
backwards through history looking for commits where each path changed
into its final commit ID.
Based-on-patch-by: Jeff King <peff@peff.net> Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui--askyesno: allow overriding the window title
"Question?" is maybe not the most informative thing to ask. In the
absence of better information, it is the best we can do, of course.
However, Git for Windows' auto updater just learned the trick to use
git-gui--askyesno to ask the user whether to update now or not. And in
this scripted scenario, we can easily pass a command-line option to
change the window title.
So let's support that with the new `--title <title>` option.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Heiko Voigt [Thu, 28 Aug 2025 08:58:47 +0000 (08:58 +0000)]
git-gui: provide question helper for retry fallback on Windows
Make use of the new environment variable GIT_ASK_YESNO to support the
recently implemented fallback in case unlink, rename or rmdir fail for
files in use on Windows. The added dialog will present a yes/no question
to the the user which will currently be used by the windows compat layer
to let the user retry a failed file operation.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
The compatObjectFormat extension is used to hide an incomplete
feature that is not yet usable for any purpose other than
developing the feature further. Document it as such to discourage
its use by mere mortals.
* bc/doc-compat-object-format-not-working:
docs: note that extensions.compatobjectformat is incomplete
Junio C Hamano [Thu, 28 Aug 2025 18:28:58 +0000 (11:28 -0700)]
Merge branch 'jk/fetch-check-graph-objects-fix'
Under a race against another process that is repacking the
repository, especially a partially cloned one, "git fetch" may
mistakenly think some objects we do have are missing, which has
been corrected.
* jk/fetch-check-graph-objects-fix:
fetch-pack: re-scan when double-checking graph objects
Junio C Hamano [Thu, 28 Aug 2025 18:28:57 +0000 (11:28 -0700)]
Merge branch 'sg/line-log-merge-optim'
"git log -L..." compared trees of multiple parents with the tree of the
merge result in an unnecessarily inefficient way.
* sg/line-log-merge-optim:
line-log: simplify condition checking for merge commits
line-log: initialize diff queue in process_ranges_ordinary_commit()
line-log: get rid of the parents array in process_ranges_merge_commit()
line-log: avoid unnecessary tree diffs when processing merge commits
Junio C Hamano [Thu, 28 Aug 2025 18:28:57 +0000 (11:28 -0700)]
Merge branch 'js/progress-delay-fix'
The start_delayed_progress() function in the progress eye-candy API
did not clear its internal state, making an initial delay value
larger than 1 second ineffective, which has been corrected.
* js/progress-delay-fix:
progress: pay attention to (customized) delay time
Derrick Stolee [Fri, 15 Aug 2025 16:12:53 +0000 (16:12 +0000)]
ls-files: conditionally leave index sparse
When running 'git ls-files' with a pathspec, the index entries get
filtered according to that pathspec before iterating over them in
show_files(). In 78087097b8 (ls-files: add --sparse option,
2021-12-22), this iteration was prefixed with a check for the '--sparse'
option which allows the command to output directory entries; this
created a pre-loop call to ensure_full_index().
However, when a user runs 'git ls-files' where the pathspec matches
directories that are recursively matched in the sparse-checkout, there
are not any sparse directories that match the pathspec so they would not
be written to the output. The expansion in this case is just a
performance drop for no behavior difference.
Replace this global check to expand the index with a check inside the
loop for a matched sparse directory. If we see one, then expand the
index and continue from the current location. This is safe since the
previous entries in the index did not have any sparse directories and
thus would remain stable in this expansion.
A test in t1092 confirms that this changes the behavior.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 27 Aug 2025 08:07:02 +0000 (04:07 -0400)]
curl: add support for curl_global_trace() components
In addition to the regular trace information produced by
CURLOPT_VERBOSE, recent curl versions can enable or disable tracing of
specific subsystems using a call to curl_global_trace().
This level of detail may or may not be useful for us in Git as mere
users of libcurl, but there's one case where we need it for a test. In
t5564, we set up a socks proxy, access it with GIT_TRACE_CURL set, and
expect to find socks-related messages in the output. This test is broken
in the release candidates for libcurl 8.16, as those socks messages are
no longer produced in the trace.
The problem bisects to curl's commit ab5e0bfddc (pytest: add SOCKS tests
and scoring, 2025-07-21). There the socks messages were moved from
generic infof() messages to the component-specific CURL_TRC_CF() system.
And so we do not see them by default, but only if "socks" is enabled as
a logging component.
Teach Git's http code to accept a component list from the
environment and pass it into curl_global_trace(). We can then use
that in the test to enable the correct component.
It should be safe to do so unconditionally. In older versions of curl
which don't support this call, setting the environment variable is a
noop. Likewise, any versions of curl which don't recognize the "socks"
component should silently ignore it. The manpage for curl_global_trace()
says this:
The config string is a list of comma-separated component names. Names
are case-insensitive and unknown names are ignored. The special name
"all" applies to all components. Names may be prefixed with '+' or '-'
to enable or disable detailed logging for a component.
The list of component names is not part of curl's public API. Names may
be added or disappear in future versions of libcurl. Since unknown
names are silently ignored, outdated log configurations does not cause
errors when upgrading libcurl. Given that, some names can be expected
to be fairly stable and are listed below for easy reference.
So this should let us make the test work on all versions without
worrying about confusing older (or newer) versions. For the same reason,
I've opted not to document this interface. This is deep internal voodoo
for which we can make no promises to users. In fact, I was tempted to
simply hard-code "socks" to let our test pass and not expose anything.
But I suspect a little run-time flexibility may come in handy in the
future when debugging or dealing with similar logging issues.
I also considered just putting "all" into such a hard-coded default. But
if you try it, you will see that many of the components are quite
verbose and likely not interesting. They would clutter up our trace
output if we enabled them by default.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ruoyu Zhong [Wed, 27 Aug 2025 02:12:19 +0000 (10:12 +0800)]
gitk: fix trackpad scrolling for Tcl/Tk 8.7+
TIP 684 [1] introduced TouchpadScroll events in Tcl/Tk 8.7, separating
trackpad gestures from traditional MouseWheel events. This broke
trackpad scrolling in gitk where trackpads generate TouchpadScroll
events instead of MouseWheel events.
Fix that by adding TouchpadScroll event bindings for all scrollable
widgets following the TIP 684 specification. Implement a new
precisescrollval proc to handle the smaller delta values from
TouchpadScroll events, using appropriate scaling factors that seem
sensible on my MacBook.
David Aguilar [Tue, 26 Aug 2025 23:35:25 +0000 (16:35 -0700)]
Makefile: build libgit-rs and libgit-sys serially
"make -JN" with INCLUDE_LIBGIT_RS enabled causes cargo lock warnings
and can trigger ld errors during the build.
The build errors are caused by two inner "make" invocations getting
triggered concurrently: once inside of libgit-sys and another inside of
libgit-rs.
Make libgit-rs depend on libgit-sys so that "make" prevents them
from running concurrently. Apply the same logic to the test invocations.
Use cargo's "--manifest-path" option instead of "cd" in the recipes.
Signed-off-by: David Aguilar <davvid@gmail.com> Acked-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Tue, 26 Aug 2025 12:19:28 +0000 (14:19 +0200)]
Documentation: note styling for bit fields
Our codebase uses a lot of bit field variables, generally to mark
boolean type variables. While there is a formatting rule in the
'.clang-format', there is no guideline specified in the
'CodingGuidelines'.
Since the '.clang-format' is not yet enforced, let's also add a
guideline with the same rule as mentioned in the '.clang-format', which
is to not use any spaces around the colon, like so:
Aditya Garg [Tue, 26 Aug 2025 15:09:22 +0000 (15:09 +0000)]
docs: update sendmail docs to use more secure SMTP server for Gmail
Earlier recommendation by IETF with RFC 2595 was to deprecate
implicit TLS in preference for upgrade an initially unencrypted
connection with STARTTLS command. These days, however, IETF
recommends that connections be made using "Implicit TLS", in
preference to STARTTLS and the like, completely reversing their
earlier position, in RFC8314.
Update the GMail example to use the implicit TLS to match the
current recommendation at port 465.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Tue, 26 Aug 2025 13:35:28 +0000 (14:35 +0100)]
commit: print advice when core.commentString=auto
Add some advice on how to change the config settings when
"core.commentString=auto" or "core.commentChar=auto". The advice
includes instructions for clearing the config setting or setting a
fixed comment string. To try and be as specific as possible, the advice
is customized based on the user's config. If "core.commentString=auto"
is set in the system config and the user does not have write
access then the advice omits the instructions to clear the config
and recommends changing the global config instead. An alternative
approach would be to advise the user to run "git config --show-origin"
and leave them to figure out how to fix it themselves but that seems
rather unfriendly. As we're forcing them to update their config we
should try and make that as easy as possible.
In order to generate this advice we need to record each file where
either of the config keys is set and whether a key occurs more that
once in a given file. This lets us generate the list of commands to
remove all the keys and also tells us which key the "auto" setting
comes from.
As we want the user to update their config we do not provide a way
for this advice to be disabled other than changing the value of
"core.commentChar" or "core.commentString".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Tue, 26 Aug 2025 13:35:27 +0000 (14:35 +0100)]
config: warn on core.commentString=auto
As support for this setting was deprecated in the last commit print a
warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set.
Avoid bombarding the user with warnings by only printing it (a) when
running commands that call "git commit" and (b) only once per command.
Some scaffolding is added to repo_read_config() to allow it to
detect deprecated config settings and warn about them. As both
"core.commentChar" and "core.commentString" set the comment
character we record which one of them is used and tailor the
warning message appropriately.
Note the odd combination of die_message() followed by die(NULL)
is to allow the next commit to insert a call to advise() in the middle.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Tue, 26 Aug 2025 13:35:26 +0000 (14:35 +0100)]
breaking-changes: deprecate support for core.commentString=auto
When "core.commentString" is set to "auto" then "git commit" will
automatically select the comment character ensuring that it is not the
first character on any of the lines in the commit message. This was
introduced by commit 84c9dc2c5a2 (commit: allow core.commentChar=auto
for character auto selection, 2014-05-17). The motivation seems to be
to avoid commenting out lines from the existing message when amending
a commit that was created with a message from a file.
Unfortunately this feature does not work with:
* commit message templates that contain comments.
* prepare-commit-msg hooks that introduce comments.
* "git commit --cleanup=strip --edit -F <file>" which means that it
is incompatible with
- the "fixup" and "squash" commands of "git rebase -i" as the
comments added by those commands are then treated as part of
the commit message.
- the conflict comments added to the commit message by "git
cherry-pick", "git rebase" etc. as these comments are then
treated as part of the commit message.
It is also ignored by "git notes" when amending a note.
The issues with comments coming from a template, hook or file are a
consequence of the design of this feature and are therefore hard to
fix.
As the costs of this feature outweigh the benefits, deprecate it and
remove it in Git 3.0. If someone comes up with some patches that fix
all the issues in a maintainable way then I'd be happy to see this
change reverted.
The next commits will add a warning and some advice for users on how
they can update their config settings.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
brian m. carlson [Mon, 25 Aug 2025 22:11:01 +0000 (22:11 +0000)]
docs: note that extensions.compatobjectformat is incomplete
The compatibility object format is only implemented for loose objects,
not packed objects, so anyone attempting to push or fetch data into a
repository with this option will likely not see it work as expected. In
addition, the underlying storage of loose object mapping is likely to
change because the current format is inefficient and does not handle
important mapping information such as that of submodules.
It would have been preferable to initially document that this was not
yet ready for prime time, but we did not do so. We hinted at the fact
that this functionality is incomplete in the description, but did not
say so explicitly. Let's do so now: indicate that this feature is
incomplete and subject to change and that the option is not designed to
be used by end users.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Mon, 25 Aug 2025 19:16:12 +0000 (21:16 +0200)]
progress: pay attention to (customized) delay time
Using one of the start_delayed_*() functions, clients of the progress
API can request that a progress meter is only shown after some time.
To do that, the implementation intends to count down the number of
seconds stored in struct progress by observing flag progress_update,
which the timer interrupt handler sets when a second has elapsed. This
works during the first second of the delay. But the code forgets to
reset the flag to zero, so that subsequent calls of display_progress()
think that another second has elapsed and decrease the count again
until zero is reached. Due to the frequency of the calls, this happens
without an observable delay in practice, so that the effective delay is
always just one second.
This bug has been with us since the inception of the feature. Despite
having been touched on various occasions, such as 8aade107dd84
(progress: simplify "delayed" progress API), 9c5951cacf5c (progress:
drop delay-threshold code), and 44a4693bfcec (progress: create
GIT_PROGRESS_DELAY), the short delay went unnoticed.
Copy the flag state into a local variable and reset the global flag
right away so that we can detect the next clock tick correctly.
Since we have not had any complaints that the delay of one second is
too short nor that GIT_PROGRESS_DELAY is ignored, people seem to be
comfortable with the status quo. Therefore, set the default to 1 to
keep the current behavior.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 25 Aug 2025 21:22:03 +0000 (14:22 -0700)]
Merge branch 'lo/repo-info'
A new subcommand "git repo" gives users a way to grab various
repository characteristics.
* lo/repo-info:
repo: add the --format flag
repo: add the field layout.shallow
repo: add the field layout.bare
repo: add the field references.format
repo: declare the repo command
Junio C Hamano [Mon, 25 Aug 2025 21:22:03 +0000 (14:22 -0700)]
Merge branch 'ps/commit-graph-wo-globals'
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.
* ps/commit-graph-wo-globals:
commit-graph: stop passing in redundant repository
commit-graph: stop using `the_repository`
commit-graph: stop using `the_hash_algo`
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: store the hash algorithm instead of its length
commit-graph: stop using `the_hash_algo` via macros
Junio C Hamano [Mon, 25 Aug 2025 21:22:01 +0000 (14:22 -0700)]
Merge branch 'ja/doc-lint-sections-and-synopsis'
Doc lint updates to encourage the newer and easier-to-use
`synopsis` format, with fixes to a handful of existing uses.
* ja/doc-lint-sections-and-synopsis:
doc lint: check that synopsis manpages have synopsis inlines
doc:git-for-each-ref: fix styling and typos
doc: check for absence of the form --[no-]parameter
doc: check for absence of multiple terms in each entry of desc list
doc: check well-formedness of delimited sections
doc: test linkgit macros for well-formedness
Junio C Hamano [Mon, 25 Aug 2025 21:22:00 +0000 (14:22 -0700)]
Merge branch 'tc/diff-tree-max-depth'
"git diff-tree" learned "--max-depth" option.
* tc/diff-tree-max-depth:
diff: teach tree-diff a max-depth parameter
within_depth: fix return for empty path
combine-diff: zero memory used for callback filepairs
doc: config: replace backtick with apostrophe for possessive
Revert back to “Git's” which was used before d30c5cc4592 (doc: convert
git-mergetool options to new synopsis style, 2025-05-25) accidentally
changed it.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sun, 24 Aug 2025 05:00:40 +0000 (01:00 -0400)]
fetch-pack: re-scan when double-checking graph objects
The fetch code tries to avoid asking the remote side for an object we
already have. It does this by traversing recent commits reachable from
our refs looking for matches. Commit 5d4cc78f72 (fetch-pack: die if in
commit graph but not obj db, 2024-11-05) introduced an extra check
there: if we think we have an object because it's in the commit graph,
we double-check that we actually have it in our object database with a
call to odb_has_object().
But that call does not pass any flags, and so the function won't call
reprepared_packed_git() if it does not find the object. That opens us up
to the usual race against some other process repacking the odb:
1. We scan the list of packs in objects/pack but haven't yet opened them.
2. Somebody else packs the object into a new pack (which we don't know
about), and deletes the old pack it was in.
3. Our odb_has_object() calls tries to open that old pack, but finds it
is gone. We declare that we don't have the object.
And this causes us to erroneously complain and abort the fetch, thinking
our commit-graph and object database are out of sync. Instead, we should
pass HAS_OBJECT_RECHECK_PACKED, which will add a new step:
4. We re-scan the pack directory again, find the new pack, and locate
the object.
Often the fetch code tries to avoid these kinds of re-scans if it's
likely that we won't have the object. If the other side has told us
about object X and we want to know if we have it, we'll skip the re-scan
(to avoid spending a lot of effort when there are many such objects). We
can accept the racy false negative in that case because the worst case
is that we ask the other side to send us the object.
But this is not one of those cases. These are objects which are
accessible from _our_ refs, and which we already found in the commit
graph file. We should have them, and if we don't, we'll die()
immediately. So the performance impact is negligible, and getting the
right answer is important.
There's no test here because it's inherently racy. In fact, I had
trouble even developing a minimal test. The problem seen in the wild can
be produced like this:
# Any git.git mirror which supports partial clones; I think this
# should work with any repo that contains submodules, but note that
# $obj below is specific to this repo
url=https://github.com/git/git.git
# This is a commit that is not at the tip of any branches (so after
# we have it, we'll still have some commits to fetch).
obj=cf6f63ea6bf35173e02e18bdc6a4ba41288acff9
What happens here is that the initial fetch grabs that older commit (and
its ancestors) but no trees or blobs, and the subsequent checkout grabs
the necessary trees and blobs just for that commit. The final fetch
spawns a long sequence of child fetches due to fetch_submodules(), which
wants to check whether there have been any gitlink modifications which
should trigger a fetch of the related submodule (we'll leave aside the
irony that we did not even check out any submodules yet).
That series of fetches causes us to accumulate packs, which eventually
triggers background maintenance to run. That repacks all-into-one, and
the pack containing $obj goes away in favor of a new pack. And then the
fetch eventually fails with:
In the scenario above, the race becomes likely because of the long
series of quick fetches. But I _think_ the bug is independent of partial
clones entirely, and you could run into the same thing with a single
fetch, some other process running "git repack" simultaneously, and a bit
of bad luck. I haven't been able to reproduce, though. I'm not sure if
that's because there's some mis-analysis above, or if the race window is
just small enough that it's hard to trigger.
At any rate, re-scanning here seems like an obviously correct thing to
do with no downside, and it does fix the partial-clone case shown above.
Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Fri, 22 Aug 2025 18:52:24 +0000 (20:52 +0200)]
doc/format-patch: adjust Thunderbird MUA hint to new add-on
There are three tips how to compose a non-line-wrapped patch with
Thunderbird. The first one suggests use of an add-on. The one
referenced has long been superseded by a different one. Update the
link to the new one. Mention that additional configuration is
required to make the add-on work.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Fri, 22 Aug 2025 21:35:00 +0000 (16:35 -0500)]
bulk-checkin: use repository variable from transaction
The bulk-checkin subsystem depends on `the_repository`. Adapt functions
and call sites to access the repository through `struct odb_transaction`
instead. The `USE_THE_REPOSITORY_VARIBALE` is still required as the
`pack_compression_level` and `pack_size_limit_cfg` globals are still
used.
Also adapt functions using packfile state to instead access it through
the transaction. This makes some function parameters redundant and go
away.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Fri, 22 Aug 2025 21:34:59 +0000 (16:34 -0500)]
bulk-checkin: require transaction for index_blob_bulk_checkin()
The bulk-checkin subsystem provides a mechanism to write blobs directly
to a packfile via `index_blob_bulk_checkin()`. If there is an ongoing
transaction when invoked, objects written via this function are stored
in the same packfile. The packfile is not flushed until the transaction
itself is flushed. If there is no transaction, the single object is
written to a packfile and immediately flushed. This complicates
`index_blob_bulk_checkin()` as it cannot reliably use the provided
transaction to get the associated repository.
Update `index_blob_bulk_checkin()` to assume that a valid transaction is
always provided. Callers are now expected to ensure a transaction is set
up beforehand. With this simplification, `deflate_blob_bulk_checkin()`
is no longer needed as a standalone internal function and is combined
with `index_blob_bulk_checkin()`. The single call site in
`object-file.c:index_fd()` is updated accordingly. Due to how
`{begin,end}_odb_transaction()` handles nested transactions, a new
transaction is only created and committed if there is not already an
ongoing transaction.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Fri, 22 Aug 2025 21:34:58 +0000 (16:34 -0500)]
bulk-checkin: remove global transaction state
Object database transactions in the bulk-checkin subsystem rely on
global state to track transaction status. Stop relying on global state
and instead store the transaction in the `struct object_database`.
Functions that operate on transactions are updated to now wire
transaction state.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Object database transaction state is stored across several global
variables in the bulk-checkin subsystem. Consolidate this state into a
single `struct odb_transaction` global. In a subsequent commit, the
transactional interfaces will be updated to wire this structure instead
of relying on a global variable.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Daniele Sassoli [Sat, 23 Aug 2025 09:12:11 +0000 (09:12 +0000)]
doc: clarify which remotes can be used with GitGitGadget
The docs mostly point to using git/git as one's remote, however, when it
comes to Sending a PR to GitGitGadget section, the reader is told to use
gitgitgadget/git, with no mention of git/git, potentially leading to
some confusion.
Clarify that both gitgitgadget/git and git/git can be used, albeit with
some differences.
Signed-off-by: Daniele Sassoli <danielesassoli@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 25 Aug 2025 12:49:57 +0000 (12:49 +0000)]
path-walk: create initializer for path lists
The previous change fixed a bug in 'git repack -adf --path-walk' that
was due to an update to how path lists are initialized and missing some
important cases when processing the pending objects.
This change takes the three critical places where path lists are
initialized and combines them into a static method. This simplifies the
callers somewhat while also helping to avoid a missed update in the
future.
The other places where a path list (struct type_and_oid_list) is
initialized is for the following "fixed" lists:
These lists are created and consumed in different ways, with only the
root trees being passed into the logic that cares about the
"maybe_interesting" bit. It is appropriate to keep these uses separate.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 25 Aug 2025 12:49:56 +0000 (12:49 +0000)]
path-walk: fix setup of pending objects
Users reported an issue where objects were missing from their local
repositories after a full repack using 'git repack -adf --path-walk'.
This was alarming and took a while to create a reproducer. Here, we fix
the bug and include a test case that would fail without this fix.
The root cause is that certain objects existed in the index and had no
second versions. These objects are usually blobs, though trees can be
included if a cache-tree exists. The issue is that the revision walk
adds these objects to the "pending" list and the path-walk API forgets
to mark the lists it creates at this point as "maybe_interesting". If
these paths only ever have a single version in the history of the repo
(including the current staged version) then the parent directory never
tries to add a new object to the list and mark the list as
"maybe_interesting". Thus, when walking the list later, the group is
skipped as it is expected that no objects are interesting. This happens
even when there are actually no UNINTERESTING objects at all! This is
based on the optimization enabled by the pack.useSparse=true config
option, which is the default.
Thus, we create a test case that demonstrates the many cases of this
issue for reproducibility:
1. File a/b/c has only one committed version.
2. Files a/i and x/y only exist as staged changes.
3. Tree x/ only exists in the cache-tree.
After performing a non-path-walk repack to force all loose objects into
packfiles, run a --path-walk repack followed by 'git fsck'. This fsck is
what fails with the following errors:
error: invalid object 100644 f2e41136... for 'a/b/c'
This is the dropped instance of the single-versioned a/b/c file.
Note that since the staged root tree is missing, the fsck output cannot
even report that the staged x/ tree is missing as well.
The core problem here is that the "maybe_interesting" member of 'struct
type_and_oid_list' is not initialized to '1'. This member was added in 6333e7ae0b (path-walk: mark trees and blobs as UNINTERESTING,
2024-12-20) in a way to help when creating packfiles for a small commit
range using the sparse path algorithm (enabled by pack.useSparse=true).
The idea here is that the list is marked as "maybe_interesting" if an
object is added that does not have the UNINTERESTING flag on it. Later,
this is checked again in case all objects in the list were marked
UNINTERESTING after that point in time. In this case, the algorithm
skips the list as there is no reason to visit it.
This leads to the problem where the "maybe_interesting" member was not
appropriately initialized when the list is created from pending objects.
Initializing this in the correct places fixes the bug.
To reduce risk of similar bugs around initializing this structure, a
follow-up change will make initializing lists use a shared method.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:44 +0000 (21:06 +0200)]
line-log: simplify condition checking for merge commits
In process_ranges_arbitrary_commit() the condition deciding whether
the given commit is not a merge, i.e. that it doesn't have more than
one parent, is head-scratchingly backwards, flip it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:43 +0000 (21:06 +0200)]
line-log: initialize diff queue in process_ranges_ordinary_commit()
process_ranges_ordinary_commit() uses a local diff queue variable,
which it leaves uninitialized before passing its address to
queue_diffs(). This is not an issue, because at the end of that
function the contents of an other diff queue is moved into it by
simply overwriting whatever is in there, i.e. without reading any
uninitialized memory.
Still, seeing the uninitialized diff queue being passed around scared
me more than once, so out of caution let's make sure that it's
initialized.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:42 +0000 (21:06 +0200)]
line-log: get rid of the parents array in process_ranges_merge_commit()
We can easily iterate through the parents of a merge commit without
turning the list of parents into a dynamically allocated array of
parents, so let's do so. This way we can avoid a memory allocation
for each processed merge commit, though its effect on runtime seems to
be unmeasurable.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:41 +0000 (21:06 +0200)]
line-log: avoid unnecessary tree diffs when processing merge commits
In process_ranges_merge_commit(), the line-level log first creates an
array of diff queues by iterating over all parents of a merge commit
and computing a tree diff for each. Then in a second loop it iterates
over those diff queues, and if it finds that none of the interesting
paths were modified in one of them, then it will return early. This
means that when none of the interesting paths were modified between a
merge and its first parent, then the tree diff between the merge and
its second (Nth...) parent was computed in vain.
Unify these two loops, so when it iterates over all parents of a merge
commit, then it first computes the tree diff between the merge and
that particular parent and then processes the resulting diff queue
right away. This way we can spare some tree diff computing, thereby
speeding up line-level log in repositories with mergy history:
# git.git, 25.8% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 1.001 s ± 0.009 s [User: 0.906 s, System: 0.095 s]
Range (min … max): 0.991 s … 1.023 s 10 runs
Benchmark 2: ./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 445.5 ms ± 3.4 ms [User: 358.8 ms, System: 84.3 ms]
Range (min … max): 440.1 ms … 450.3 ms 10 runs
Summary
'./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0' ran
2.25 ± 0.03 times faster than './git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0'
# linux.git, 7.5% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 3.246 s ± 0.007 s [User: 2.835 s, System: 0.409 s]
Range (min … max): 3.232 s … 3.255 s 10 runs
Benchmark 2: ./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 2.467 s ± 0.014 s [User: 2.113 s, System: 0.353 s]
Range (min … max): 2.455 s … 2.505 s 10 runs
Summary
'./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16' ran
1.32 ± 0.01 times faster than './git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16'
And since now each iteration computes a tree diff and processes its
result, there is no reason to store the diff queues for each merge
parent anymore, so replace that diff queue array with a loop-local
diff queue variable. With this change the static free_diffqueues()
helper function in 'line-log.c' has no more callers left, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ruoyu Zhong [Sun, 24 Aug 2025 10:07:58 +0000 (18:07 +0800)]
gitk: use <Button-3> for ctx menus on macOS with Tcl 8.7+
Commit d277e89f87fda01daa1e1a35fc1f7118678faa1f added special handling
on macOS (OS X) that makes button 2 the right mouse button. As per TIP
474 [1], Tcl 8.7 has swapped buttons 2 and 3 such that button 3 is made
the right mouse button as in other platforms. Therefore, the logic
should be updated to use button 3 on macOS with Tcl 8.7+.