Junio C Hamano [Fri, 29 Aug 2025 17:23:53 +0000 (10:23 -0700)]
Merge branch 'am/xdiff-hash-tweak' into next
Inspired by Ezekiel's recent effort to showcase Rust interface, the
hash function implementation used to hash lines have been updated
to the one used for ELF symbol lookup by Glibc.
The compatObjectFormat extension is used to hide an incomplete
feature that is not yet usable for any purpose other than
developing the feature further. Document it as such to discourage
its use by mere mortals.
* bc/doc-compat-object-format-not-working:
docs: note that extensions.compatobjectformat is incomplete
Junio C Hamano [Thu, 28 Aug 2025 18:28:58 +0000 (11:28 -0700)]
Merge branch 'jk/fetch-check-graph-objects-fix'
Under a race against another process that is repacking the
repository, especially a partially cloned one, "git fetch" may
mistakenly think some objects we do have are missing, which has
been corrected.
* jk/fetch-check-graph-objects-fix:
fetch-pack: re-scan when double-checking graph objects
Junio C Hamano [Thu, 28 Aug 2025 18:28:57 +0000 (11:28 -0700)]
Merge branch 'sg/line-log-merge-optim'
"git log -L..." compared trees of multiple parents with the tree of the
merge result in an unnecessarily inefficient way.
* sg/line-log-merge-optim:
line-log: simplify condition checking for merge commits
line-log: initialize diff queue in process_ranges_ordinary_commit()
line-log: get rid of the parents array in process_ranges_merge_commit()
line-log: avoid unnecessary tree diffs when processing merge commits
Junio C Hamano [Thu, 28 Aug 2025 18:28:57 +0000 (11:28 -0700)]
Merge branch 'js/progress-delay-fix'
The start_delayed_progress() function in the progress eye-candy API
did not clear its internal state, making an initial delay value
larger than 1 second ineffective, which has been corrected.
* js/progress-delay-fix:
progress: pay attention to (customized) delay time
Junio C Hamano [Wed, 27 Aug 2025 01:18:17 +0000 (18:18 -0700)]
Merge branch 'bc/doc-compat-object-format-not-working' into next
The compatObjectFormat extension is used to hide an incomplete
feature that is not yet usable for any purpose other than
developing the feature further. Document it as such to discourage
its use by mere mortals.
* bc/doc-compat-object-format-not-working:
docs: note that extensions.compatobjectformat is incomplete
Junio C Hamano [Wed, 27 Aug 2025 01:18:17 +0000 (18:18 -0700)]
Merge branch 'jk/fetch-check-graph-objects-fix' into next
Under a race against another process that is repacking the
repository, especially a partially cloned one, "git fetch" may have
mistakenly think some objects we do have are missing, which has
been corrected.
* jk/fetch-check-graph-objects-fix:
fetch-pack: re-scan when double-checking graph objects
Junio C Hamano [Wed, 27 Aug 2025 01:18:17 +0000 (18:18 -0700)]
Merge branch 'sg/line-log-merge-optim' into next
"git log -L..." compared trees of multiple parents with the tree of the
merge result in an unnecessarily inefficient way.
* sg/line-log-merge-optim:
line-log: simplify condition checking for merge commits
line-log: initialize diff queue in process_ranges_ordinary_commit()
line-log: get rid of the parents array in process_ranges_merge_commit()
line-log: avoid unnecessary tree diffs when processing merge commits
Junio C Hamano [Wed, 27 Aug 2025 01:18:16 +0000 (18:18 -0700)]
Merge branch 'js/progress-delay-fix' into next
The start_delayed-Progress() function in the progress eye-candy API
did not clear its internal state, making an initial delay value
larger than 1 second ineffective, which has been corrected.
* js/progress-delay-fix:
progress: pay attention to (customized) delay time
David Aguilar [Tue, 26 Aug 2025 23:35:25 +0000 (16:35 -0700)]
Makefile: build libgit-rs and libgit-sys serially
"make -JN" with INCLUDE_LIBGIT_RS enabled causes cargo lock warnings
and can trigger ld errors during the build.
The build errors are caused by two inner "make" invocations getting
triggered concurrently: once inside of libgit-sys and another inside of
libgit-rs.
Make libgit-rs depend on libgit-sys so that "make" prevents them
from running concurrently. Apply the same logic to the test invocations.
Use cargo's "--manifest-path" option instead of "cd" in the recipes.
Signed-off-by: David Aguilar <davvid@gmail.com> Acked-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
brian m. carlson [Mon, 25 Aug 2025 22:11:01 +0000 (22:11 +0000)]
docs: note that extensions.compatobjectformat is incomplete
The compatibility object format is only implemented for loose objects,
not packed objects, so anyone attempting to push or fetch data into a
repository with this option will likely not see it work as expected. In
addition, the underlying storage of loose object mapping is likely to
change because the current format is inefficient and does not handle
important mapping information such as that of submodules.
It would have been preferable to initially document that this was not
yet ready for prime time, but we did not do so. We hinted at the fact
that this functionality is incomplete in the description, but did not
say so explicitly. Let's do so now: indicate that this feature is
incomplete and subject to change and that the option is not designed to
be used by end users.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Mon, 25 Aug 2025 19:16:12 +0000 (21:16 +0200)]
progress: pay attention to (customized) delay time
Using one of the start_delayed_*() functions, clients of the progress
API can request that a progress meter is only shown after some time.
To do that, the implementation intends to count down the number of
seconds stored in struct progress by observing flag progress_update,
which the timer interrupt handler sets when a second has elapsed. This
works during the first second of the delay. But the code forgets to
reset the flag to zero, so that subsequent calls of display_progress()
think that another second has elapsed and decrease the count again
until zero is reached. Due to the frequency of the calls, this happens
without an observable delay in practice, so that the effective delay is
always just one second.
This bug has been with us since the inception of the feature. Despite
having been touched on various occasions, such as 8aade107dd84
(progress: simplify "delayed" progress API), 9c5951cacf5c (progress:
drop delay-threshold code), and 44a4693bfcec (progress: create
GIT_PROGRESS_DELAY), the short delay went unnoticed.
Copy the flag state into a local variable and reset the global flag
right away so that we can detect the next clock tick correctly.
Since we have not had any complaints that the delay of one second is
too short nor that GIT_PROGRESS_DELAY is ignored, people seem to be
comfortable with the status quo. Therefore, set the default to 1 to
keep the current behavior.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 25 Aug 2025 21:22:03 +0000 (14:22 -0700)]
Merge branch 'lo/repo-info'
A new subcommand "git repo" gives users a way to grab various
repository characteristics.
* lo/repo-info:
repo: add the --format flag
repo: add the field layout.shallow
repo: add the field layout.bare
repo: add the field references.format
repo: declare the repo command
Junio C Hamano [Mon, 25 Aug 2025 21:22:03 +0000 (14:22 -0700)]
Merge branch 'ps/commit-graph-wo-globals'
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.
* ps/commit-graph-wo-globals:
commit-graph: stop passing in redundant repository
commit-graph: stop using `the_repository`
commit-graph: stop using `the_hash_algo`
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: store the hash algorithm instead of its length
commit-graph: stop using `the_hash_algo` via macros
Junio C Hamano [Mon, 25 Aug 2025 21:22:01 +0000 (14:22 -0700)]
Merge branch 'ja/doc-lint-sections-and-synopsis'
Doc lint updates to encourage the newer and easier-to-use
`synopsis` format, with fixes to a handful of existing uses.
* ja/doc-lint-sections-and-synopsis:
doc lint: check that synopsis manpages have synopsis inlines
doc:git-for-each-ref: fix styling and typos
doc: check for absence of the form --[no-]parameter
doc: check for absence of multiple terms in each entry of desc list
doc: check well-formedness of delimited sections
doc: test linkgit macros for well-formedness
Junio C Hamano [Mon, 25 Aug 2025 21:22:00 +0000 (14:22 -0700)]
Merge branch 'tc/diff-tree-max-depth'
"git diff-tree" learned "--max-depth" option.
* tc/diff-tree-max-depth:
diff: teach tree-diff a max-depth parameter
within_depth: fix return for empty path
combine-diff: zero memory used for callback filepairs
Junio C Hamano [Mon, 25 Aug 2025 21:15:50 +0000 (14:15 -0700)]
Merge branch 'ds/doc-ggg-pr-fork-clarify' into next
Update the instruction to use of GGG in the MyFirstContribution
document to say that a GitHub PR could be made against `git/git`
instead of `gitgitgadget/git`.
* ds/doc-ggg-pr-fork-clarify:
doc: clarify which remotes can be used with GitGitGadget
doc: config: replace backtick with apostrophe for possessive
Revert back to “Git's” which was used before d30c5cc4592 (doc: convert
git-mergetool options to new synopsis style, 2025-05-25) accidentally
changed it.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sun, 24 Aug 2025 05:00:40 +0000 (01:00 -0400)]
fetch-pack: re-scan when double-checking graph objects
The fetch code tries to avoid asking the remote side for an object we
already have. It does this by traversing recent commits reachable from
our refs looking for matches. Commit 5d4cc78f72 (fetch-pack: die if in
commit graph but not obj db, 2024-11-05) introduced an extra check
there: if we think we have an object because it's in the commit graph,
we double-check that we actually have it in our object database with a
call to odb_has_object().
But that call does not pass any flags, and so the function won't call
reprepared_packed_git() if it does not find the object. That opens us up
to the usual race against some other process repacking the odb:
1. We scan the list of packs in objects/pack but haven't yet opened them.
2. Somebody else packs the object into a new pack (which we don't know
about), and deletes the old pack it was in.
3. Our odb_has_object() calls tries to open that old pack, but finds it
is gone. We declare that we don't have the object.
And this causes us to erroneously complain and abort the fetch, thinking
our commit-graph and object database are out of sync. Instead, we should
pass HAS_OBJECT_RECHECK_PACKED, which will add a new step:
4. We re-scan the pack directory again, find the new pack, and locate
the object.
Often the fetch code tries to avoid these kinds of re-scans if it's
likely that we won't have the object. If the other side has told us
about object X and we want to know if we have it, we'll skip the re-scan
(to avoid spending a lot of effort when there are many such objects). We
can accept the racy false negative in that case because the worst case
is that we ask the other side to send us the object.
But this is not one of those cases. These are objects which are
accessible from _our_ refs, and which we already found in the commit
graph file. We should have them, and if we don't, we'll die()
immediately. So the performance impact is negligible, and getting the
right answer is important.
There's no test here because it's inherently racy. In fact, I had
trouble even developing a minimal test. The problem seen in the wild can
be produced like this:
# Any git.git mirror which supports partial clones; I think this
# should work with any repo that contains submodules, but note that
# $obj below is specific to this repo
url=https://github.com/git/git.git
# This is a commit that is not at the tip of any branches (so after
# we have it, we'll still have some commits to fetch).
obj=cf6f63ea6bf35173e02e18bdc6a4ba41288acff9
What happens here is that the initial fetch grabs that older commit (and
its ancestors) but no trees or blobs, and the subsequent checkout grabs
the necessary trees and blobs just for that commit. The final fetch
spawns a long sequence of child fetches due to fetch_submodules(), which
wants to check whether there have been any gitlink modifications which
should trigger a fetch of the related submodule (we'll leave aside the
irony that we did not even check out any submodules yet).
That series of fetches causes us to accumulate packs, which eventually
triggers background maintenance to run. That repacks all-into-one, and
the pack containing $obj goes away in favor of a new pack. And then the
fetch eventually fails with:
In the scenario above, the race becomes likely because of the long
series of quick fetches. But I _think_ the bug is independent of partial
clones entirely, and you could run into the same thing with a single
fetch, some other process running "git repack" simultaneously, and a bit
of bad luck. I haven't been able to reproduce, though. I'm not sure if
that's because there's some mis-analysis above, or if the race window is
just small enough that it's hard to trigger.
At any rate, re-scanning here seems like an obviously correct thing to
do with no downside, and it does fix the partial-clone case shown above.
Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Fri, 22 Aug 2025 18:52:24 +0000 (20:52 +0200)]
doc/format-patch: adjust Thunderbird MUA hint to new add-on
There are three tips how to compose a non-line-wrapped patch with
Thunderbird. The first one suggests use of an add-on. The one
referenced has long been superseded by a different one. Update the
link to the new one. Mention that additional configuration is
required to make the add-on work.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Daniele Sassoli [Sat, 23 Aug 2025 09:12:11 +0000 (09:12 +0000)]
doc: clarify which remotes can be used with GitGitGadget
The docs mostly point to using git/git as one's remote, however, when it
comes to Sending a PR to GitGitGadget section, the reader is told to use
gitgitgadget/git, with no mention of git/git, potentially leading to
some confusion.
Clarify that both gitgitgadget/git and git/git can be used, albeit with
some differences.
Signed-off-by: Daniele Sassoli <danielesassoli@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 25 Aug 2025 12:49:57 +0000 (12:49 +0000)]
path-walk: create initializer for path lists
The previous change fixed a bug in 'git repack -adf --path-walk' that
was due to an update to how path lists are initialized and missing some
important cases when processing the pending objects.
This change takes the three critical places where path lists are
initialized and combines them into a static method. This simplifies the
callers somewhat while also helping to avoid a missed update in the
future.
The other places where a path list (struct type_and_oid_list) is
initialized is for the following "fixed" lists:
These lists are created and consumed in different ways, with only the
root trees being passed into the logic that cares about the
"maybe_interesting" bit. It is appropriate to keep these uses separate.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 25 Aug 2025 12:49:56 +0000 (12:49 +0000)]
path-walk: fix setup of pending objects
Users reported an issue where objects were missing from their local
repositories after a full repack using 'git repack -adf --path-walk'.
This was alarming and took a while to create a reproducer. Here, we fix
the bug and include a test case that would fail without this fix.
The root cause is that certain objects existed in the index and had no
second versions. These objects are usually blobs, though trees can be
included if a cache-tree exists. The issue is that the revision walk
adds these objects to the "pending" list and the path-walk API forgets
to mark the lists it creates at this point as "maybe_interesting". If
these paths only ever have a single version in the history of the repo
(including the current staged version) then the parent directory never
tries to add a new object to the list and mark the list as
"maybe_interesting". Thus, when walking the list later, the group is
skipped as it is expected that no objects are interesting. This happens
even when there are actually no UNINTERESTING objects at all! This is
based on the optimization enabled by the pack.useSparse=true config
option, which is the default.
Thus, we create a test case that demonstrates the many cases of this
issue for reproducibility:
1. File a/b/c has only one committed version.
2. Files a/i and x/y only exist as staged changes.
3. Tree x/ only exists in the cache-tree.
After performing a non-path-walk repack to force all loose objects into
packfiles, run a --path-walk repack followed by 'git fsck'. This fsck is
what fails with the following errors:
error: invalid object 100644 f2e41136... for 'a/b/c'
This is the dropped instance of the single-versioned a/b/c file.
Note that since the staged root tree is missing, the fsck output cannot
even report that the staged x/ tree is missing as well.
The core problem here is that the "maybe_interesting" member of 'struct
type_and_oid_list' is not initialized to '1'. This member was added in 6333e7ae0b (path-walk: mark trees and blobs as UNINTERESTING,
2024-12-20) in a way to help when creating packfiles for a small commit
range using the sparse path algorithm (enabled by pack.useSparse=true).
The idea here is that the list is marked as "maybe_interesting" if an
object is added that does not have the UNINTERESTING flag on it. Later,
this is checked again in case all objects in the list were marked
UNINTERESTING after that point in time. In this case, the algorithm
skips the list as there is no reason to visit it.
This leads to the problem where the "maybe_interesting" member was not
appropriately initialized when the list is created from pending objects.
Initializing this in the correct places fixes the bug.
To reduce risk of similar bugs around initializing this structure, a
follow-up change will make initializing lists use a shared method.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:44 +0000 (21:06 +0200)]
line-log: simplify condition checking for merge commits
In process_ranges_arbitrary_commit() the condition deciding whether
the given commit is not a merge, i.e. that it doesn't have more than
one parent, is head-scratchingly backwards, flip it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:43 +0000 (21:06 +0200)]
line-log: initialize diff queue in process_ranges_ordinary_commit()
process_ranges_ordinary_commit() uses a local diff queue variable,
which it leaves uninitialized before passing its address to
queue_diffs(). This is not an issue, because at the end of that
function the contents of an other diff queue is moved into it by
simply overwriting whatever is in there, i.e. without reading any
uninitialized memory.
Still, seeing the uninitialized diff queue being passed around scared
me more than once, so out of caution let's make sure that it's
initialized.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:42 +0000 (21:06 +0200)]
line-log: get rid of the parents array in process_ranges_merge_commit()
We can easily iterate through the parents of a merge commit without
turning the list of parents into a dynamically allocated array of
parents, so let's do so. This way we can avoid a memory allocation
for each processed merge commit, though its effect on runtime seems to
be unmeasurable.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Sun, 24 Aug 2025 19:06:41 +0000 (21:06 +0200)]
line-log: avoid unnecessary tree diffs when processing merge commits
In process_ranges_merge_commit(), the line-level log first creates an
array of diff queues by iterating over all parents of a merge commit
and computing a tree diff for each. Then in a second loop it iterates
over those diff queues, and if it finds that none of the interesting
paths were modified in one of them, then it will return early. This
means that when none of the interesting paths were modified between a
merge and its first parent, then the tree diff between the merge and
its second (Nth...) parent was computed in vain.
Unify these two loops, so when it iterates over all parents of a merge
commit, then it first computes the tree diff between the merge and
that particular parent and then processes the resulting diff queue
right away. This way we can spare some tree diff computing, thereby
speeding up line-level log in repositories with mergy history:
# git.git, 25.8% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 1.001 s ± 0.009 s [User: 0.906 s, System: 0.095 s]
Range (min … max): 0.991 s … 1.023 s 10 runs
Benchmark 2: ./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 445.5 ms ± 3.4 ms [User: 358.8 ms, System: 84.3 ms]
Range (min … max): 440.1 ms … 450.3 ms 10 runs
Summary
'./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0' ran
2.25 ± 0.03 times faster than './git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0'
# linux.git, 7.5% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 3.246 s ± 0.007 s [User: 2.835 s, System: 0.409 s]
Range (min … max): 3.232 s … 3.255 s 10 runs
Benchmark 2: ./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 2.467 s ± 0.014 s [User: 2.113 s, System: 0.353 s]
Range (min … max): 2.455 s … 2.505 s 10 runs
Summary
'./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16' ran
1.32 ± 0.01 times faster than './git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16'
And since now each iteration computes a tree diff and processes its
result, there is no reason to store the diff queues for each merge
parent anymore, so replace that diff queue array with a loop-local
diff queue variable. With this change the static free_diffqueues()
helper function in 'line-log.c' has no more callers left, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Sat, 23 Aug 2025 00:43:02 +0000 (00:43 +0000)]
doc: git-rebase: update discussion of internals
- make it clearer that we're talking about a multistep process
- give a more technically accurate description how rebase works with the
merge backend.
- condense the explanation of how git rebase skips commits with the same
textual changes into a single bullet point and remove the explanatory
diagram. Lots of things which are more complicated are already being
explained without a diagram.
- remove the explanation of how exactly `--fork-point` and `--root`
work since that information is in the OPTIONS section
- put all discussion of `ORIG_HEAD` inside the note
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Sat, 23 Aug 2025 00:43:01 +0000 (00:43 +0000)]
doc: git-rebase: move --onto explanation down
There's a very clear explanation with examples of using --onto which is
currently buried in the very long DESCRIPTION section. This moves it to
its own section, so that we can reference the explanation from the
`--onto` option by name.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Sat, 23 Aug 2025 00:42:58 +0000 (00:42 +0000)]
doc: git-rebase: start with an example
- Start with an example that mirrors the example in the `git-merge` man
page, to make it easier for folks to understand the difference between
a rebase and a merge.
- Mention that rebase can combine or reorder commits
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various options to "git diff" that makes comparison ignore certain
aspects of the differences (like "space changes are ignored",
"differences in lines that match these regular expressions are
ignored") did not work well with "--name-only" and friends.
* ly/diff-name-only-with-diff-from-content:
diff: ensure consistent diff behavior with ignore options
Junio C Hamano [Fri, 22 Aug 2025 20:13:21 +0000 (13:13 -0700)]
Merge branch 'ac/deglobal-fmt-merge-log-config'
Code clean-up.
* ac/deglobal-fmt-merge-log-config:
builtin/fmt-merge-msg: stop depending on 'the_repository'
environment: remove the global variable 'merge_log_config'
Junio C Hamano [Fri, 22 Aug 2025 20:13:20 +0000 (13:13 -0700)]
Merge branch 'jc/diff-no-index-in-subdir'
"git diff --no-index" run inside a subdirectory under control of a
Git repository operated at the top of the working tree and stripped
the prefix from the output, and oddballs like "-" (stdin) did not
work correctly because of it. Correct the set-up by undoing what
the set-up sequence did to cwd and prefix.
* jc/diff-no-index-in-subdir:
diff: --no-index should ignore the worktree
Junio C Hamano [Fri, 22 Aug 2025 20:13:20 +0000 (13:13 -0700)]
Merge branch 'ms/refs-list'
The "list" subcommand of "git refs" acts as a front-end for
"git for-each-ref".
* ms/refs-list:
t: add test for git refs list subcommand
t6300: refactor tests to be shareable
builtin/refs: add list subcommand
builtin/for-each-ref: factor out core logic into a helper
builtin/for-each-ref: align usage string with the man page
doc: factor out common option
Junio C Hamano [Thu, 21 Aug 2025 21:35:32 +0000 (14:35 -0700)]
Merge branch 'jk/describe-blob' into next
"git describe <blob>" misbehaves and/or crashes in some corner
cases, which has been taught to exit with failure gracefully.
* jk/describe-blob:
describe: pass commit to describe_commit()
describe: handle blob traversal with no commits
describe: catch unborn branch in describe_blob()
describe: error if blob not found
describe: pass oid struct by const pointer
Junio C Hamano [Thu, 21 Aug 2025 21:35:31 +0000 (14:35 -0700)]
Merge branch 'jk/no-clobber-dangling-symref-with-fetch' into next
"git fetch" can clobber a symref that is dangling when the
remote-tracking HEAD is set to auto update, which has been
corrected.
* jk/no-clobber-dangling-symref-with-fetch:
refs: do not clobber dangling symrefs
t5510: prefer "git -C" to subshell for followRemoteHEAD tests
t5510: stop changing top-level working directory
t5510: make confusing config cleanup more explicit
Junio C Hamano [Thu, 21 Aug 2025 21:35:30 +0000 (14:35 -0700)]
Merge branch 'ps/reftable-libgit2-cleanup' into next
Code clean-ups.
* ps/reftable-libgit2-cleanup:
refs/reftable: always reload stacks when creating lock
reftable: don't second-guess errors from flock interface
reftable/stack: handle outdated stacks when compacting
reftable/stack: allow passing flags to `reftable_stack_add()`
reftable/stack: fix compiler warning due to missing braces
reftable/stack: reorder code to avoid forward declarations
reftable/writer: drop Git-specific `QSORT()` macro
reftable/writer: fix type used for number of records
Revision traversal limited with pathspec, like "git log dir/*",
used to ignore changed-paths Bloom filter when the pathspec
contained wildcards; now they take advantage of the filter when
they can.
* ly/changed-path-traversal-with-magic-pathspec:
bloom: enable bloom filter with wildcard pathspec in revision traversal
Junio C Hamano [Thu, 21 Aug 2025 20:47:02 +0000 (13:47 -0700)]
Merge branch 'en/ort-rename-fixes'
Various bugs about rename handling in "ort" merge strategy have
been fixed.
* en/ort-rename-fixes:
merge-ort: fix directory rename on top of source of other rename/delete
merge-ort: fix incorrect file handling
merge-ort: clarify the interning of strings in opt->priv->path
t6423: fix missed staging of file in testcases 12i,12j,12k
t6423: document two bugs with rename-to-self testcases
merge-ort: drop unnecessary temporary in check_for_directory_rename()
merge-ort: update comments to modern testfile location
Junio C Hamano [Thu, 21 Aug 2025 20:47:01 +0000 (13:47 -0700)]
Merge branch 'ua/t1517-short-help-tests'
Test shuffling.
* ua/t1517-short-help-tests:
t5304: move `prune -h` test from t1517
t5200: move `update-server-info -h` test from t1517
t/t1517: automate `git subcmd -h` tests outside a repository
Junio C Hamano [Thu, 21 Aug 2025 20:47:00 +0000 (13:47 -0700)]
Merge branch 'dl/push-missing-object-error'
"git push" had a code path that led to BUG() but it should have
been a die(), as it is a response to a usual but invalid end-user
action to attempt pushing an object that does not exist.
* dl/push-missing-object-error:
remote.c: convert if-else ladder to switch
remote.c: remove BUG in show_push_unqualified_ref_name_error()
t5516: remove surrounding empty lines in test bodies
Junio C Hamano [Thu, 21 Aug 2025 20:47:00 +0000 (13:47 -0700)]
Merge branch 'jc/strbuf-split'
Arrays of strbuf is often a wrong data structure to use, and
strbuf_split*() family of functions that create them often have
better alternatives.
Update several code paths and replace strbuf_split*().
* jc/strbuf-split:
trace2: do not use strbuf_split*()
trace2: trim_trailing_newline followed by trim is a no-op
sub-process: do not use strbuf_split*()
environment: do not use strbuf_split*()
config: do not use strbuf_split()
notes: do not use strbuf_split*()
merge-tree: do not use strbuf_split*()
clean: do not use strbuf_split*() [part 2]
clean: do not pass the whole structure when it is not necessary
clean: do not use strbuf_split*() [part 1]
clean: do not pass strbuf by value
wt-status: avoid strbuf_split*()
Junio C Hamano [Thu, 21 Aug 2025 20:46:59 +0000 (13:46 -0700)]
Merge branch 'jc/string-list-split'
string_list_split*() family of functions have been extended to
simplify common use cases.
* jc/string-list-split:
string-list: split-then-remove-empty can be done while splitting
string-list: optionally omit empty string pieces in string_list_split*()
diff: simplify parsing of diff.colormovedws
string-list: optionally trim string pieces split by string_list_split*()
string-list: unify string_list_split* functions
string-list: align string_list_split() with its _in_place() counterpart
string-list: report programming error with BUG
Junio C Hamano [Thu, 21 Aug 2025 20:46:58 +0000 (13:46 -0700)]
Merge branch 'ps/remote-rename-fix'
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.
* ps/remote-rename-fix:
builtin/remote: only iterate through refs that are to be renamed
builtin/remote: rework how remote refs get renamed
builtin/remote: determine whether refs need renaming early on
builtin/remote: fix sign comparison warnings
refs: simplify logic when migrating reflog entries
refs: pass refname when invoking reflog entry callback
Junio C Hamano [Thu, 21 Aug 2025 20:46:57 +0000 (13:46 -0700)]
Merge branch 'ps/reflog-migrate-fixes'
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
Jean-Noël Avila [Wed, 20 Aug 2025 21:23:19 +0000 (23:23 +0200)]
doc: fix asciidoc format compatibility in pretty-formats.adoc
Asciidoc.py and Asciidoctor do not process the '+' verbatim the same way. A
span is detected when the format sign (here '+')is preceded by a non-word
character. It seems that '{nbsp}' is considered a non-word sign by
Asciidoc.py, but not by Asciidoctor.
Using a double format-sign opens 'unconstrained' span, independent on the
preceding character in both engines.
The '+' sign is used instead of the backtick '`' because it is not processed
as synopsis in asciidoc.py. Unfortunately, the post-processing of verbatim
synopsis in asciidoctor cannot be bypassed and formatting of the parentheses
is forced in syntax sign instead of keywords, unless a proper grammar
analyzer is used.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Aug 2025 21:04:17 +0000 (17:04 -0400)]
describe: pass commit to describe_commit()
There's a call in describe_commit() to lookup_commit_reference(), but we
don't check the return value. If it returns NULL, we'll segfault as we
immediately dereference the result.
In practice this can never happen, since all callers pass an oid which
came from a "struct commit" already. So we can make this more obvious
by just taking that commit struct in the first place.
Reported-by: Cheng <prophecheng@stu.pku.edu.cn> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 20 Aug 2025 06:30:34 +0000 (02:30 -0400)]
describe: handle blob traversal with no commits
When describing a blob, we traverse from HEAD, remembering each commit
we saw, and then checking each blob to report the containing commit.
But if we haven't seen any commits at all, we'll segfault (we store the
"current" commit as an oid initialized to the null oid, causing
lookup_commit_reference() to return NULL).
This shouldn't be able to happen normally. We always start our traversal
at HEAD, which must be a commit (a property which is enforced by the
refs code). But you can trigger the segfault like this:
We can instead catch this case and return an empty result, which hits
the usual "we didn't find $blob while traversing HEAD" error.
This is a minor lie in that we did "find" the blob. And this even hints
at a bigger problem in this code: what if the traversal pointed to the
blob as _not_ part of a commit at all, but we had previously filled in
the recorded "current commit"? One could imagine this happening due to a
tag pointing directly to the blob in question.
But that can't happen, because we only traverse from HEAD, never from
any other refs. And the intent of the blob-describing code is to find
blobs within commits.
So I think this matches the original intent as closely as we can (and
again, this segfault cannot be triggered without corrupting your
repository!).
The test here does not use the formula above, which works only for the
files backend (and not reftables). Instead we use another loophole to
create the bogus state using only Git commands. See the comment in the
test for details.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Wed, 20 Aug 2025 06:16:05 +0000 (08:16 +0200)]
doc/gitk: update reference to the external project
Gitk is now maintained by Johannes Sixt and the repository can be
cloned from a new URL. b59358100c20 (Update the official repo of
gitk, 2024-12-24) could have updated this instance in the manual,
too, but the opportunity was missed. Update it now. Do give credit
to Paul Mackerras as the inventor of the program.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 19 Aug 2025 23:38:23 +0000 (16:38 -0700)]
Merge branch 'lo/repo-info' into next
A new subcommand "git repo" gives users a way to grab various
repository characteristics.
* lo/repo-info:
repo: add the --format flag
repo: add the field layout.shallow
repo: add the field layout.bare
repo: add the field references.format
repo: declare the repo command
Jeff King [Tue, 19 Aug 2025 19:29:34 +0000 (15:29 -0400)]
refs: do not clobber dangling symrefs
When given an expected "before" state, the ref-writing code will avoid
overwriting any ref that does not match that expected state. We use the
null oid as a sentinel value for "nothing should exist", and likewise
that is the sentinel value we get when trying to read a ref that does
not exist.
But there's one corner case where this is ambiguous: dangling symrefs.
Trying to read them will yield the null oid, but there is potentially
something of value there: the dangling symref itself.
For a normal recursive write, this is OK. Imagine we have a symref
"FOO_HEAD" that points to a ref "refs/heads/bar" that does not exist,
and we try to write to it with a create operation like:
The attempt to resolve FOO_HEAD will actually resolve "bar", yielding
the null oid. That matches our expectation, and the write proceeds. This
is correct, because we are not writing FOO_HEAD at all, but writing its
destination "bar", which in fact does not exist.
But what if the operation asked not to dereference symrefs? Like this:
Resolving FOO_HEAD would still result in a null oid, and the write will
proceed. But it will overwrite FOO_HEAD itself, removing the fact that
it ever pointed to "bar".
This case is a little esoteric; we are clobbering a symref with a
no-deref write of a regular ref value. But the same problem occurs when
writing symrefs. For example:
The "create" operation asked us to create FOO_HEAD only if it did not
exist. But we silently overwrite the existing value.
You can trigger this without using update-ref via the fetch
followRemoteHEAD code. In "create" mode, it should not overwrite an
existing value. But if you manually create a symref pointing to a value
that does not yet exist (either via symbolic-ref or with "remote add
-m"), create mode will happily overwrite it.
Instead, we should detect this case and refuse to write. The correct
specification to overwrite FOO_HEAD in this case is to provide an
expected target ref value, like:
Note that the non-symref "update" directive does not allow you to do
this (you can only specify an oid). This is a weakness in the update-ref
interface, and you'd have to overwrite unconditionally, like:
Likewise other symref operations like symref-delete do not accept the
"ref" keyword. You should be able to do:
echo "symref-delete FOO_HEAD ref refs/heads/bar"
but cannot (and can only delete unconditionally). This patch doesn't
address those gaps. We may want to do so in a future patch for
completeness, but it's not clear if anybody actually wants to perform
those operations. The symref update case (specifically, via
followRemoteHEAD) is what I ran into in the wild.
The code for the fix is relatively straight-forward given the discussion
above. But note that we have to implement it independently for the files
and reftable backends. The "old oid" checks happen as part of the
locking process, which is implemented separately for each system. We may
want to factor this out somehow, but it's beyond the scope of this
patch. (Another curiosity is that the messages in the reftable code are
marked for translation, but the ones in the files backend are not. I
followed local convention in each case, but we may want to harmonize
this at some point).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 19 Aug 2025 19:27:16 +0000 (15:27 -0400)]
t5510: prefer "git -C" to subshell for followRemoteHEAD tests
These tests set config within a sub-repo using (cd two && git config),
and then a separate test_when_finished outside the subshell to clean it
up. We can't use test_config to do this, because the cleanup command it
registers inside the subshell would be lost. Nor can we do it before
entering the subshell, because the config has to be set after some other
commands are run.
Let's switch these tests to use "git -C" for each command instead of a
subshell. That lets us use test_config (with -C also) at the appropriate
part of the test. And we no longer need the manual cleanup command.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 19 Aug 2025 19:26:06 +0000 (15:26 -0400)]
t5510: stop changing top-level working directory
Several tests in t5510 do a bare "cd subrepo", not in a subshell. This
changes the working directory for subsequent tests. As a result, almost
every test has to start with "cd $D" to go back to the top-level.
Our usual style is to do per-test environment changes like this in a
subshell, so that tests can assume they are starting at the top-level
$TRASH_DIRECTORY.
Let's switch to that style, which lets us drop all of that extra
path-handling.
Most cases can switch to using a subshell, but in a few spots we can
simplify by doing "git init foo && git -C foo ...". We do have to make
sure that we weren't intentionally touching the environment in any code
which was moved into a subshell (e.g., with a test_when_finished), but
that isn't the case for any of these tests.
All of the references to the $D variable can go away, replaced generally
with $PWD or $TRASH_DIRECTORY (if we use it inside a chdir'd subshell).
Note in one test, "fetch --prune prints the remotes url", we make sure
to use $(pwd) to get the Windows-style path on that platform (for the
other tests, the exact form doesn't matter).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 19 Aug 2025 19:24:55 +0000 (15:24 -0400)]
t5510: make confusing config cleanup more explicit
Several tests set a config variable in a sub-repo we chdir into via a
subshell, like this:
(
cd "$D" &&
cd two &&
git config foo.bar baz
)
But they also clean up the variable with a when_finished directive
outside of the subshell, like this:
test_when_finished "git config unset foo.bar"
At first glance, this shouldn't work! The cleanup clause cannot be run
from the subshell (since environment changes there are lost by the time
the test snippet finishes). But since the cleanup command runs outside
the subshell, our working directory will not have been switched into
"two".
But it does work. Why?
The answer is that an earlier test does a "cd two" that moves the whole
test's working directory out of $TRASH_DIRECTORY and into "two". So the
subshell is a bit of a red herring; we are already in the right
directory! That's why we need the "cd $D" at the top of the shell, to
put us back to a known spot.
Let's make this cleanup code more explicitly specify where we expect the
config command to run. That makes the script more robust against running
a subset of the tests, and ultimately will make it easier to refactor
the script to avoid these top-level chdirs.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Tue, 19 Aug 2025 20:46:10 +0000 (20:46 +0000)]
doc: git-add: simplify discussion of ignored files
- Mention the --force option earlier
- Remove the explanation of shell globbing vs git's internal glob
system, since users are confused by it and there's a clearer
discussion in the EXAMPLES section.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Tue, 19 Aug 2025 20:46:09 +0000 (20:46 +0000)]
doc: git-add: clarify intro & add an example
- Add a basic example of how "git add" is normally used
- It's not technically true that you *must* use the `add` command to
add changes before running `git commit`, because `git commit -a`
exists. Instead say that you *can* use the `add` command.
- Mention early on that "index" is another word for "staging area",
since Git very rarely uses the word "index" in its output
(`git status`) uses the term "staged", and many Git users are
unfamiliar with the term "index"
- Remove "It typically adds" (it's not clear what "typically" means),
and instead mention that `git add -p` can be used to add
partial contents
- Currently the introduction is somewhat repetitive ("to prepare the
content staged for the next commit" ... "this snapshot that is taken
as the contents of the next commit."), replace with a single sentence
("The "index" [...] is where Git stores the contents of the next
commit.")
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adam Dinwoodie [Tue, 19 Aug 2025 07:43:29 +0000 (08:43 +0100)]
t/t1517: mark tests that fail with GIT_TEST_INSTALLED
The changes added by 39fc408562 (t/t1517: automate `git subcmd -h` tests
outside a repository, 2025-08-08) to automatically loop over all "main"
Git commands will, when run against an installed build using
GIT_TEST_INSTALLED rather than the build in the build directory, include
some extra git-gui commands that are installed by `make install`, or
credential helpers that might be installed manually from the contrib
directories. These fail the test, so record them as such.
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Aug 2025 21:01:54 +0000 (17:01 -0400)]
describe: catch unborn branch in describe_blob()
When describing a blob, we search for it by traversing from HEAD. We do
this by feeding the name HEAD to setup_revisions(). But if we are on an
unborn branch, this will fail with a confusing message:
$ git describe $blob
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
It is OK for this to be an error (we cannot find $blob in an empty
traversal, so we'd eventually complain about that). But the error
message could be more helpful.
Let's resolve HEAD ourselves and pass the resolved object id to
setup_revisions(). If resolving fails, then we can print a more useful
message.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Aug 2025 21:01:25 +0000 (17:01 -0400)]
describe: error if blob not found
If describe_blob() does not find the blob in question, it returns an
empty strbuf, and we print an empty line. This differs from
describe_commit(), which always either returns an answer or calls die()
itself. As the blob function was bolted onto the command afterwards, I
think its behavior is not intentional, and it is just a bug that it does
not report an error.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 18 Aug 2025 20:59:29 +0000 (16:59 -0400)]
describe: pass oid struct by const pointer
We pass a "struct object_id" to describe_blob() by value. This isn't
wrong, as an oid is composed only of copy-able values. But it's unusual;
typically we pass structs by const pointer, including object_ids. Let's
do so.
It similarly makes sense for us to hold that pointer in the callback
data (rather than yet another copy of the oid).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdl_hash_record_verbatim uses modified djb2 hash with XOR instead of ADD
for combining. The ADD-based variant is used as the basis of the modern
("GNU") symbol lookup scheme in ELF. Glibc dynamic loader received an
optimized version of this hash function thanks to Noah Goldstein [1].
Switch xdl_hash_record_verbatim to additive hashing and implement
an optimized loop following the scheme suggested by Noah.
Timing 'git log --oneline --shortstat v2.0.0..v2.5.0' under perf, I got
version | cycles, bn | instructions, bn
---------------------------------------
A 6.38 11.3
B 6.21 10.89
C 5.80 9.95
D 5.83 8.74
---------------------------------------
A: baseline (git master at e4ef0485fd78)
B: plus 'xdiff: refactor xdl_hash_record()'
C: and plus this patch
D: with 'xdiff: use xxhash' by Phillip Wood
The resulting speedup for xdl_hash_record_verbatim itself is about 1.5x.
Junio C Hamano [Sun, 17 Aug 2025 16:27:08 +0000 (09:27 -0700)]
Merge branch 'ps/commit-graph-wo-globals' into next
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.
* ps/commit-graph-wo-globals:
commit-graph: stop passing in redundant repository
commit-graph: stop using `the_repository`
commit-graph: stop using `the_hash_algo`
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: store the hash algorithm instead of its length
commit-graph: stop using `the_hash_algo` via macros