Phillip Wood [Mon, 26 Jan 2026 10:48:52 +0000 (10:48 +0000)]
xdiff: remove unused data from xdlclass_t
Prior to commit 6d507bd41a (xdiff: delete fields ha, line, size
in xdlclass_t in favor of an xrecord_t, 2025-09-26) xdlclass_t
carried a copy of all the fields in xrecord_t. That commit embedded
xrecord_t in xdlclass_t to make it easier to change the types of
the fields in xrecord_t. However commit 6a26019c81 (xdiff: split
xrecord_t.ha into line_hash and minimal_perfect_hash, 2025-11-18)
added the "minimal_perfect_hash" field to xrecord_t which is not
used by xdlclass_t. To avoid wasting space stop copying the whole
of xrecord_t and just copy the pointer and length that we need to
intern the line. Together with the previous commit this effectively
reverts 6d507bd41a.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Mon, 26 Jan 2026 10:48:51 +0000 (10:48 +0000)]
xdiff: remove "line_hash" field from xrecord_t
Prior to commit 6a26019c81 (xdiff: split xrecord_t.ha into line_hash
and minimal_perfect_hash, 2025-11-18) the "ha" field of xrecord_t
initially held the "line_hash" value and once the line had been
interned that field was updated to hold the "minimal_perfect_hash". The
"line_hash" is only used to intern the line so there is no point in
storing it after all the input lines have been interned.
Removing the "line_hash" field from xrecord_t and storing it in
xdlclass_t where it is actually used makes it clearer that it is a
temporary value and it should not be used once we're calculated the
"minimal_perfect_hash". This also reduces the size of xrecord_t by 25%
on 64-bit platforms and 40% on 32-bit platforms. While the struct is
small we create one instance per input line so any saving is welcome.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: drop unused `for_each_{loose,packed}_object()` functions
We have converted all callers of `for_each_loose_object()` and
`for_each_packed_object()` to use their new replacement functions
instead. We can thus remove them now.
Do so and inline `packfile_store_for_each_object_internal()` now that it
only has a single callsite again. This makes it a bit easier to follow
the callback indirection that is happening there.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
To figure out which objects expired objects we enumerate all loose and
packed objects individually so that we can figure out their respective
mtimes. Refactor the code to instead use `odb_for_each_object()` with a
request that ask for the object mtime instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/pack-objects: use `packfile_store_for_each_object()`
When enumerating objects that are supposed to be stored in a new cruft
pack we use `for_each_packed_object()` and then derive each object's
mtime individually. Refactor this logic to instead use the new
`packfile_store_for_each_object()` function with an object info request
that asks for the respective mtimes.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: introduce mtime fields for object info requests
There are some use cases where we need to figure out the mtime for
objects. Most importantly, this is the case when we want to prune
unreachable objects. But getting at that data requires users to manually
derive the info either via the loose object's mtime, the packfiles'
mtime or via the ".mtimes" file.
Introduce a new `struct object_info::mtimep` pointer that allows callers
to request an object's mtime. This new field will be used in a
subsequent commit.
Note that the concept of "mtime" is ambiguous: given an object, it may
be stored multiple times in the object database, and each of these
instances may have a different mtime. Disambiguating these mtimes is
nothing that can happen on the generic ODB layer: the caller may search
for the oldest object, the newest object, or even the relation of object
mtimes depending on the specific source they are located in. As such, it
is the responsibility of the caller to disambiguate mtimes.
A consequence of this is that it's most likely incorrect to look up the
mtime via `odb_read_object_info()`, as this interface does not give us
enough information to disambiguate the mtime. Document this accordingly
and tell users to use `odb_for_each_object()` instead.
Even with this gotcha though it's sensible to have this request as part
of the object info, as the mtime is a property of the object storage
format. If we for example had a "black-box" storage backend, we'd still
need to be able to query it for the mtime info in a generic way.
We could introduce a safety mechanism that for example calls `BUG()` in
case we look up the mtime outside of `odb_for_each_object()`. But that
feels somewhat heavy-handed.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
treewide: drop uses of `for_each_{loose,packed}_object()`
We're using `for_each_loose_object()` and `for_each_packed_object()` at
a couple of callsites to enumerate all loose and packed objects,
respectively. These functions will be removed in a subsequent commit in
favor of the newly introduced `odb_source_loose_for_each_object()` and
`packfile_store_for_each_object()` replacements.
Prepare for this by refactoring the sites accordingly.
Note that ideally, we'd convert all callsites to use the generic
`odb_for_each_object()` function already. But for some callers this is
not possible (yet), and it would require some significant refactorings
to make this work. Converting these site will thus be deferred to a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
treewide: enumerate promisor objects via `odb_for_each_object()`
We have multiple callsites where we enumerate all promisor objects in
the object database via `for_each_packed_object()`. This is done by
passing the `ODB_FOR_EACH_OBJECT_PROMISOR_ONLY` flag, which causes us to
skip over all non-promisor objects.
These callsites can be trivially converted to `odb_for_each_object()` as
we know to skip enumeration of loose objects in case the `PROMISOR_ONLY`
flag was passed by the caller.
Refactor the sites accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/fsck: refactor to use `odb_for_each_object()`
In git-fsck(1) we have two callsites where we iterate over all objects
via `for_each_loose_object()` and `for_each_packed_object()`. Both of
these are trivially convertible with `odb_for_each_object()`.
Refactor these callsites accordingly.
Note that `odb_for_each_object()` may iterate over the same object
multiple times, for example when it exists both in packed and loose
format. But this has already been the case beforehand, so this does not
result in a change in behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new function `odb_for_each_object()` that knows to iterate
through all objects part of a given object database. This function is
essentially a simple wrapper around the object database sources.
Subsequent commits will adapt callers to use this new function.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: introduce function to iterate through objects
Introduce a new function `packfile_store_for_each_object()`. This
function is equivalent to `odb_source_loose_for_each_object()`, except
that it:
- Works on a single packfile store instead of working on the object
database level. Consequently, it will only yield packed objects of a
single object database source.
- Passes a `struct object_info` to the callback function.
As such, it provides the same callback interface as we already provide
for loose objects now. These functions will be used in a subsequent step
to implement `odb_for_each_object()`.
The `for_each_packed_object()` function continues to exist for now, but
it will be removed at the end of this patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: extract function to iterate through objects of a store
In the next commit we're about to introduce a new function that knows to
iterate through objects of a given packfile store. Same as with the
equivalent function for loose objects, this new function will also be
agnostic of backends by using a `struct object_info`.
Prepare for this by extracting a new shared function to iterate through
a single packfile store.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: introduce function to iterate through objects
We have multiple divergent interfaces to iterate through objects of a
specific backend:
- `for_each_loose_object()` yields all loose objects.
- `for_each_packed_object()` (somewhat obviously) yields all packed
objects.
These functions have different function signatures, which makes it hard
to create a common abstraction layer that covers both of these.
Introduce a new function `odb_source_loose_for_each_object()` to plug
this gap. This function doesn't take any data specific to loose objects,
but instead it accepts a `struct object_info` that will be populated the
exact same as if `odb_source_loose_read_object()` was called.
The benefit of this new interface is that we can continue to pass
backend-specific data, as `struct object_info` contains a union for
these exact use cases. This will allow us to unify how we iterate
through objects across both loose and packed objects in a subsequent
commit.
The `for_each_loose_object()` function continues to exist for now, but
it will be removed at the end of this patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: extract function to read object info from path
Extract a new function that allows us to read object info for a specific
loose object via a user-supplied path. This function will be used in a
subsequent commit.
Note that this also allows us to drop `stat_loose_object()`, which is
a simple wrapper around `odb_loose_path()` plus lstat(3p).
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `flags` parameter accepted by various `for_each_object()` functions
is a bitfield of multiple flags. Such parameters are typically unsigned
in the Git codebase, but we use `enum odb_for_each_object_flags` in
some places.
Adapt these function signatures to use the correct type.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the `FOR_EACH_OBJECT_*` flags to have an `ODB_` prefix. This
prepares us for a new upcoming `odb_for_each_object()` function and
ensures that both the function and its flags have the same prefix.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:41 +0000 (23:52 +0100)]
fetch: delay user information post committing of transaction
In Git 2.50 and earlier, we would display failure codes and error
message as part of the status display:
$ git fetch . v1.0.0:refs/heads/foo
error: cannot update ref 'refs/heads/foo': trying to write non-commit object f665776185ad074b236c00751d666da7d1977dbe to branch 'refs/heads/foo'
From .
! [new tag] v1.0.0 -> foo (unable to update local ref)
With the addition of batched updates, this information is no longer
shown to the user:
$ git fetch . v1.0.0:refs/heads/foo
From .
* [new tag] v1.0.0 -> foo
error: cannot update ref 'refs/heads/foo': trying to write non-commit object f665776185ad074b236c00751d666da7d1977dbe to branch 'refs/heads/foo'
Since reference updates are batched and processed together at the end,
information around the outcome is not available during individual
reference parsing.
To overcome this, collate and delay the output to the end. Introduce
`ref_update_display_info` which will hold individual update's
information and also whether the update failed or succeeded. This
finally allows us to iterate over all such updates and print them to the
user.
Using an dynamic array and strmap does add some overhead to
'git-fetch(1)', but from benchmarking this seems to be not too bad:
Benchmark 1: fetch: many refs (refformat = files, refcount = 1000, revision = master)
Time (mean ± σ): 42.6 ms ± 1.2 ms [User: 13.1 ms, System: 29.8 ms]
Range (min … max): 40.1 ms … 45.8 ms 47 runs
Benchmark 2: fetch: many refs (refformat = files, refcount = 1000, revision = HEAD)
Time (mean ± σ): 43.1 ms ± 1.2 ms [User: 12.7 ms, System: 30.7 ms]
Range (min … max): 40.5 ms … 45.8 ms 48 runs
Summary
fetch: many refs (refformat = files, refcount = 1000, revision = master) ran
1.01 ± 0.04 times faster than fetch: many refs (refformat = files, refcount = 1000, revision = HEAD)
Another approach would be to move the status printing logic to be
handled post the transaction being committed. That however would require
adding an iterator to the ref transaction that tracks both the outcome
(success/failure) and the original refspec information for each update,
which is more involved infrastructure work compared to the strmap
approach here.
Helped-by: Phillip Wood <phillip.wood123@gmail.com> Reported-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:40 +0000 (23:52 +0100)]
receive-pack: utilize rejected ref error details
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19),
git-receive-pack(1) switched to using batched reference updates. This also
introduced a regression wherein instead of providing detailed error
messages for failed referenced updates, the users were provided generic
error messages based on the error type.
Now that the updates also contain detailed error message, propagate
those to the client via 'rp_error'. The detailed error messages can be
very verbose, for e.g. in the files backend, when trying to write a
non-commit object to a branch, you would see:
Here the refname is repeated multiple times due to how error messages
are propagated and filled over the code stack. This potentially can be
cleaned up in a future commit.
Reported-by: Elijah Newren <newren@gmail.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:39 +0000 (23:52 +0100)]
fetch: utilize rejected ref error details
In 0e358de64a (fetch: use batched reference updates, 2025-05-19),
git-fetch(1) switched to using batched reference updates. This also
introduced a regression wherein instead of providing detailed error
messages for failed referenced updates, the users were provided generic
error messages based on the error type.
Similar to the previous commit, switch to using detailed error messages
if present for failed reference updates to fix this regression.
Reported-by: Elijah Newren <newren@gmail.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:38 +0000 (23:52 +0100)]
update-ref: utilize rejected error details if available
When git-update-ref(1) received the '--update-ref' flag, the error
details generated in the refs namespace wasn't propagated with failed
updates. Instead only an error code pertaining to the type of rejection
was noted.
This missed detailed error message which the user can act upon. The
previous commits added the required code to propagate these detailed
error messages from the refs namespace. Now that additional details are
available, let's output this additional details to stderr. This allows
users to have additional information over the already present machine
parsable output.
While we're here, improve the existing tests for the machine parsable
output by checking for the entire output string and not just the
rejection reason.
Reported-by: Elijah Newren <newren@gmail.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:37 +0000 (23:52 +0100)]
refs: add rejection detail to the callback function
The previous commit started storing the rejection details alongside the
error code for rejected updates. Pass this along to the callback
function `ref_transaction_for_each_rejected_update()`. Currently the
field is unused, but will be integrated in the upcoming commits.
Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Sun, 25 Jan 2026 22:52:36 +0000 (23:52 +0100)]
refs: skip to next ref when current ref is rejected
In `refs_verify_refnames_available()` we have two nested loops: the
outer loop iterates over all references to check, while the inner loop
checks for filesystem conflicts for a given ref by breaking down its
path.
With batched updates, when we detect a filesystem conflict, we mark the
update as rejected and execute 'continue'. However, this only skips to
the next iteration of the inner loop, not the outer loop as intended.
This causes the same reference to be repeatedly rejected. Fix this by
using a goto statement to skip to the next reference in the outer loop.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sun, 25 Jan 2026 17:08:06 +0000 (09:08 -0800)]
Merge branch 'master' of https://github.com/j6t/git-gui
* 'master' of https://github.com/j6t/git-gui:
git-gui: mark *.po files at any directory level as UTF-8
git-gui i18n: Update Bulgarian translation (558t)
git-gui i18n: Update Bulgarian translation (557t)
Johannes Sixt [Sun, 25 Jan 2026 09:46:23 +0000 (10:46 +0100)]
git-gui: mark *.po files at any directory level as UTF-8
When a commit is viewed in Gitk that changes a file in po/glossary, the
patch text shows mojibake instead of correctly decoded UTF-8 text.
Gitk retrieves the encoding attribute to decide how to treat the bytes
that make up the patch text. There is an attribute definition that all
files are US-ASCII, and a later attribute definition overrides this.
But the override, which specifies UTF-8, applies only to *.po files in
directory po/ and does not apply to subdirectories.
Widen the pattern to apply to all directory levels.
* dk/replay-doc-omit-irrelevant-rev-list-options:
lint-gitlink: preemptively ignore all /ifn?def|endif/ macros
replay: drop rev-list formatting options from manual
Junio C Hamano [Fri, 23 Jan 2026 21:34:36 +0000 (13:34 -0800)]
Merge branch 'js/symlink-windows'
Upstream symbolic link support on Windows from Git-for-Windows.
* js/symlink-windows:
mingw: special-case index entries for symlinks with buggy size
mingw: emulate `stat()` a little more faithfully
mingw: try to create symlinks without elevated permissions
mingw: add support for symlinks to directories
mingw: implement basic `symlink()` functionality (file symlinks only)
mingw: implement `readlink()`
mingw: allow `mingw_chdir()` to change to symlink-resolved directories
mingw: support renaming symlinks
mingw: handle symlinks to directories in `mingw_unlink()`
mingw: add symlink-specific error codes
mingw: change default of `core.symlinks` to false
mingw: factor out the retry logic
mingw: compute the correct size for symlinks in `mingw_lstat()`
mingw: teach dirent about symlinks
mingw: let `mingw_lstat()` error early upon problems with reparse points
mingw: drop the separate `do_lstat()` function
mingw: implement `stat()` with symlink support
mingw: don't call `GetFileAttributes()` twice in `mingw_lstat()`
Junio C Hamano [Fri, 23 Jan 2026 21:34:36 +0000 (13:34 -0800)]
Merge branch 'js/ci-leak-skip-svn'
Dscho observed that SVN tests are taking too much time in CI leak
checking tasks, but most time is spent not in our code but in libsvn
code (which happen to be written in Perl), whose leaks have little
value to discover for us. Skip SVN, P4, and CVS tests in the leak
checking tasks.
* js/ci-leak-skip-svn:
ci: skip CVS and P4 tests in leaks job, too
ci(*-leaks): skip the git-svn tests to save time
Paulo Casaretto [Thu, 22 Jan 2026 19:23:35 +0000 (19:23 +0000)]
lockfile: add PID file for debugging stale locks
When a lock file is held, it can be helpful to know which process owns
it, especially when debugging stale locks left behind by crashed
processes. Add an optional feature that creates a companion PID file
alongside each lock file, containing the PID of the lock holder.
For a lock file "foo.lock", the PID file is named "foo~pid.lock". The
tilde character is forbidden in refnames and allowed in Windows
filenames, which guarantees no collision with the refs namespace
(e.g., refs "foo" and "foo~pid" cannot both exist). The file contains
a single line in the format "pid <value>" followed by a newline.
The PID file is created when a lock is acquired (if enabled), and
automatically cleaned up when the lock is released (via commit or
rollback). The file is registered as a tempfile so it gets cleaned up
by signal and atexit handlers if the process terminates abnormally.
When a lock conflict occurs, the code checks for an existing PID file
and, if found, uses kill(pid, 0) to determine if the process is still
running. This allows providing context-aware error messages:
Lock is held by process 12345. Wait for it to finish, or remove
the lock file to continue.
Or for a stale lock:
Lock was held by process 12345, which is no longer running.
Remove the stale lock file to continue.
The feature is controlled via core.lockfilePid configuration (boolean).
Defaults to false. When enabled, PID files are created for all lock
operations.
Existing PID files are always read when displaying lock errors,
regardless of the core.lockfilePid setting. This ensures helpful
diagnostics even when the feature was previously enabled and later
disabled.
Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 22 Jan 2026 16:05:58 +0000 (16:05 +0000)]
revision: add --maximal-only option
When inspecting a range of commits from some set of starting references, it
is sometimes useful to learn which commits are not reachable from any other
commits in the selected range.
One such application is in the creation of a sequence of bundles for the
bundle URI feature. Creating a stack of bundles representing different
slices of time includes defining which references to include. If all
references are used, then this may be overwhelming or redundant. Instead,
selecting commits that are maximal to the range could help defining a
smaller reference set to use in the bundle header.
Add a new '--maximal-only' option to restrict the output of a revision range
to be only the commits that are not reachable from any other commit in the
range, based on the reachability definition of the walk.
This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
set as we walk. This does extend the bit range in object.h, but using an
earlier bit may collide with another feature.
The tests demonstrate the behavior of the feature with a positive-only
range, ranges with negative references, and walk-modifying flags like
--first-parent and --exclude-first-parent-only.
Since the --boundary option would not increase any results when used with
the --maximal-only option, mark them as incompatible.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 22 Jan 2026 00:16:28 +0000 (16:16 -0800)]
Merge branch 'rs/tree-wo-the-repository'
Remove implicit reliance on the_repository global in the APIs
around tree objects and make it explicit which repository to work
in.
* rs/tree-wo-the-repository:
cocci: remove obsolete the_repository rules
cocci: convert parse_tree functions to repo_ variants
tree: stop using the_repository
tree: use repo_parse_tree()
path-walk: use repo_parse_tree_gently()
pack-bitmap-write: use repo_parse_tree()
delta-islands: use repo_parse_tree()
bloom: use repo_parse_tree()
add-interactive: use repo_parse_tree_indirect()
tree: add repo_parse_tree*()
environment: move access to core.maxTreeDepth into repo settings
Junio C Hamano [Thu, 22 Jan 2026 00:16:27 +0000 (16:16 -0800)]
Merge branch 'tb/midx-write-corrupt-checksum-fix'
The logic that avoids reusing MIDX files with a wrong checksum was
broken, which has been corrected.
* tb/midx-write-corrupt-checksum-fix:
midx-write.c: assume checksum-invalid MIDXs require an update
t/t5319-multi-pack-index.sh: drop early 'test_done'
"git repack --geometric" did not work with promisor packs, which
has been corrected.
* ps/geometric-repacking-with-promisor-remotes:
builtin/repack: handle promisor packs with geometric repacking
repack-promisor: extract function to remove redundant packs
repack-promisor: extract function to finalize repacking
repack-geometry: extract function to compute repacking split
builtin/pack-objects: exclude promisor objects with "--stdin-packs"
t5500: simplify test implementation and fix git exit code suppression
The 'shallow since with commit graph and already-seen commit”
test uses a convoluted here-doc that combines manual input
construction with packetize, echo and embedded Git commands.
This structure hides failures from the git commands,
as their exit codes are suppressed inside echo command
substitution and being on the upstream side of pipes.
Instead of using here-doc to construct the pack
protocol that is directly sent to the
'git upload-pack' command being tested,
capture the outputs of the git commands upfront
and use the 'test-tool pkt-line pack'
tool to construct the input in a temporary file,
and then feed it to the command.
This has a few advantages:
* Executing the git commands outside the here-doc
avoids suppressing their exit codes and makes
debugging easier.
* It removes the need to manually count and
manage pkt-line lengths to keep in line with
the v2 protocol, as the tool handles this internally.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amisha Chhajed [Wed, 21 Jan 2026 13:00:05 +0000 (18:30 +0530)]
sparse-checkout: optimize string_list construction and add tests to verify deduplication.
Improve O(n^2) complexity to O(n log n) while building a sorted
'string_list' by constructing it unsorted then sorting it
followed by removing duplicates.
sparse-checkout deduplicates repeated cone-mode patterns,
but this behaviour was previously untested, add tests that
verify that sparse-checkout file contain each cone
pattern only once and sparse-checkout list reports each pattern
only once.
Signed-off-by: Amisha Chhajed <amishhhaaaa@gmail.com> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 21 Jan 2026 16:29:00 +0000 (08:29 -0800)]
Merge branch 'js/prep-symlink-windows'
Further preparation to upstream symbolic link support on Windows.
* js/prep-symlink-windows:
trim_last_path_component(): avoid hard-coding the directory separator
strbuf_readlink(): support link targets that exceed 2*PATH_MAX
strbuf_readlink(): avoid calling `readlink()` twice in corner-cases
init: do parse _all_ core.* settings early
mingw: do resolve symlinks in `getcwd()`
Junio C Hamano [Wed, 21 Jan 2026 16:29:00 +0000 (08:29 -0800)]
Merge branch 'ps/read-object-info-improvements'
The object-info API has been cleaned up.
* ps/read-object-info-improvements:
packfile: drop repository parameter from `packed_object_info()`
packfile: skip unpacking object header for disk size requests
packfile: disentangle return value of `packed_object_info()`
packfile: always populate pack-specific info when reading object info
packfile: extend `is_delta` field to allow for "unknown" state
packfile: always declare object info to be OI_PACKED
object-file: always set OI_LOOSE when reading object info
Junio C Hamano [Wed, 21 Jan 2026 16:28:58 +0000 (08:28 -0800)]
Merge branch 'ps/packfile-store-in-odb-source'
The packfile_store data structure is moved from object store to odb
source.
* ps/packfile-store-in-odb-source:
packfile: move MIDX into packfile store
packfile: refactor `find_pack_entry()` to work on the packfile store
packfile: inline `find_kept_pack_entry()`
packfile: only prepare owning store in `packfile_store_prepare()`
packfile: only prepare owning store in `packfile_store_get_packs()`
packfile: move packfile store into object source
packfile: refactor misleading code when unusing pack windows
packfile: refactor kept-pack cache to work with packfile stores
packfile: pass source to `prepare_pack()`
packfile: create store via its owning source
Junio C Hamano [Wed, 21 Jan 2026 16:28:58 +0000 (08:28 -0800)]
Merge branch 'ps/ref-consistency-checks'
Update code paths that check data integrity around refs subsystem.
cf. <CAOLa=ZShPP3BPXa=YnC-vuX4zF=pUTFdUidZwOdna8bfVTNM9w@mail.gmail.com>
* ps/ref-consistency-checks:
builtin/fsck: drop `fsck_head_link()`
builtin/fsck: move generic HEAD check into `refs_fsck()`
builtin/fsck: move generic object ID checks into `refs_fsck()`
refs/reftable: introduce generic checks for refs
refs/reftable: fix consistency checks with worktrees
refs/reftable: extract function to retrieve backend for worktree
refs/reftable: adapt includes to become consistent
refs/files: introduce function to perform normal ref checks
refs/files: extract generic symref target checks
fsck: drop unused fields from `struct fsck_ref_report`
refs/files: perform consistency checks for root refs
refs/files: improve error handling when verifying symrefs
refs/files: extract function to check single ref
refs/files: remove useless indirection
refs/files: remove `refs_check_dir` parameter
refs/files: move fsck functions into global scope
refs/files: simplify iterating through root refs
Junio C Hamano [Wed, 21 Jan 2026 16:28:57 +0000 (08:28 -0800)]
Merge branch 'tb/macos-iconv-workarounds'
The iconv library on macOS fails to correctly handle stateful
ISO/IEC 2022 encoded strings. Work it around instead of replacing
it wholesale from homebrew.
* tb/macos-iconv-workarounds:
utf8.c: enable workaround for iconv under macOS 14/15
utf8.c: prepare workaround for iconv under macOS 14/15
Jean-Noël Avila [Wed, 21 Jan 2026 13:27:05 +0000 (14:27 +0100)]
lint-gitlink: preemptively ignore all /ifn?def|endif/ macros
Instead of testing if the macro name is ifn?def:: as if it were a inline
macro, it is faster and safer to just ignore such block macro lines before
hand.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Tue, 20 Jan 2026 21:47:11 +0000 (22:47 +0100)]
last-modified: change default max-depth to 0
By default git-last-modified(1) doesn't recurse into subtrees. So when
the pathspec contained a path in a subtree, the command would only print
the commit information about the parent tree of the path, like:
Toon Claes [Tue, 20 Jan 2026 21:47:10 +0000 (22:47 +0100)]
last-modified: document option '--max-depth'
Option --max-depth is supported by git-last-modified(1), because it was
added to the diff machinery in a1dfa5448d (diff: teach tree-diff a
max-depth parameter, 2025-08-07).
This option is useful for everyday use of the git-last-modified(1)
command, so document it's existence in the man page.
To have it also appear in the help output of `git last-modified -h`,
move the handling of '--max-depth' to parse_options() in
builtin/last-modified.c itself. This prepares for the change in default
behavior in the next commit.
Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Tue, 20 Jan 2026 21:47:09 +0000 (22:47 +0100)]
last-modified: document option '-z'
The command git-last-modified(1) already recognizes the option '-z', and
similar to many other commands this will make the output NUL-terminated
instead of using newlines. Although, this option is missing from the
documentation, so add it.
In addition to that, to have '-z' also appear in the help output of `git
last-modified -h`, move the handling of '-z' to parse_options() in
builtin/last-modified.c itself.
Before, the parsing of option '-z' was done by diff_opt_parse(), which
is called by setup_revisions(). That would fill in `struct
diff_options::line_termination`, but that field was not used by the diff
machinery itself. Thus it makes more sense to have the handling of that
option completely in builtin/last-modified.c.
Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Tue, 20 Jan 2026 21:47:08 +0000 (22:47 +0100)]
last-modified: clarify in the docs the command takes a pathspec
The documentation mentions git-last-modified(1) takes `<path>...`, but
that argument actually accepts a pathspec. Reword the documentation to
reflect that.
Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D. Ben Knoble [Tue, 20 Jan 2026 14:05:57 +0000 (09:05 -0500)]
replay: drop rev-list formatting options from manual
The rev-list options in our manuals are quite long; git-replay's manual
is no exception. Since replay doesn't use the formatting options at all
(it has its own output format), drop them.
This is the first time we have needed compound tests [1] for if[n]def in
our documentation:
For both ifdef and ifndef, the "," takes on the intuitive meaning:
- ifdef: if any of the listed attributes are set…
- ifndef: unless any of the listed attributes are set
(Use "+" for "all".)
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Wed, 14 Jan 2026 18:27:53 +0000 (19:27 +0100)]
gitk: fix highlighted remote prefix of branches with directories
The decoration of a remote ref is colored in two parts: (1) the prefix
that mentions the remove (including "remote/"); and (2) the branch name.
To extract the prefix from the ref name, a regular expression is used.
However, the expression is not restrictive enough: it picks everything
before the last slash character as prefix, so that, for example, the
ref name "remotes/orgin/ml/themes" is split into "remotes/origin/ml" and
"themes".
Tighten the regular expression so that only the name of the remote is
pulled into the prefix, but no part of the branch name. This gives the
desired result in the example: "remotes/origin" and "ml/themes".
Jeff King [Mon, 19 Jan 2026 05:23:20 +0000 (00:23 -0500)]
remote: always allocate branch.push_tracking_ref
In branch_get_push(), we usually allocate a new string for the @{push}
ref, but will not do so in push.default=upstream mode, where we just
pass back the result of branch_get_upstream() directly.
This led to a hacky memory management scheme in e291c75a95 (remote.c:
add branch_get_push, 2015-05-21): we store the result in the
push_tracking_ref field of a "struct branch", under the assumption that
the branch struct will last until the end of the program. So even though
the struct doesn't know if it has an allocated string or not, it doesn't
matter because we hold on to it either way.
But that assumption was violated by f5ccb535cc (remote: fix leaking
config strings, 2024-08-22), which added a function to free branch
structs. Any struct which is fed to branch_release() is at risk of
leaking its push_tracking_ref member.
I don't think this can actually be triggered in practice. We rarely
actually free the branch structs, and we only fill in the
push_tracking_ref string lazily when it is needed. So triggering the
leak would require a code path that does both, and I couldn't find one.
Still, this is an ugly trap that may eventually spring on us. Since
there is only one code path in branch_get_push() that doesn't allocate,
let's just have it copy the string. And then we know that
push_tracking_ref is always allocated, and we can free it in
branch_release().
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 19 Jan 2026 05:22:08 +0000 (00:22 -0500)]
remote: fix leak in branch_get_push_1() with invalid "simple" config
Most of the code paths in branch_get_push_1() allocate a string for the
@{push} value. We then return the result, which is stored in a "struct
branch", so the value is not leaked.
But there's one path that does leak: when we are in the "simple" push
mode, we have to check that the @{push} value matches what we'd get for
@{upstream}. If it doesn't, we return an error, but forget to free the
@{push} value we computed.
Curiously, the existing tests don't trigger this with LSan, even though
they do exercise the code path. As far as I can tell, it should be
triggered via:
which will complain that the upstream ("not-foo") does not match the
push destination ("foo"). We do die() shortly after this, but not until
after returning from branch_get_push_1(), which is where the leak
happens.
So it seems like a false negative in LSan. However, I can trigger it
reliably by printing the @{push} value using for-each-ref. This takes a
little more setup (because we need "foo" to actually exist to iterate
over it with for-each-ref), but we can piggy-back on the existing repo
config in t6300.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 19 Jan 2026 05:20:26 +0000 (00:20 -0500)]
remote: drop const return of tracking_for_push_dest()
The string returned from tracking_for_push_dest() comes from
apply_refspec(), and thus is always an allocated string (or NULL). We
should return a non-const pointer so that the caller knows that
ownership of the string is being transferred.
This goes back to the function's origin in e291c75a95 (remote.c: add
branch_get_push, 2015-05-21). It never really mattered because our
return is just forwarded through branch_get_push_1(), which returns a
const string as part of an intentionally hacky memory management scheme
(see that commit for details).
As the first step of untangling that hackery, let's drop the extra const
from this helper function (and from the variables that store its
result). There should be no functional change (yet).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 19 Jan 2026 05:19:45 +0000 (00:19 -0500)]
remote: return non-const pointer from error_buf()
We have an error_buf() helper that functions a bit like our error()
helper, but returns NULL instead of -1. Its return type is "const char
*", but this is overly restrictive. If we use the helper in a function
that returns non-const "char *", the compiler will complain about
the implicit cast from const to non-const.
Meanwhile, the const in the helper is doing nothing useful, as it only
ever returns NULL. Let's drop the const, which will let us use it in
both types of function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 17 Jan 2026 18:34:17 +0000 (10:34 -0800)]
ci: skip CVS and P4 tests in leaks job, too
Looking at the CI logs, the p4 and cvs tests account for another 24
minutes of test time and they offer minimal value for quite a
similar reason as the previous step.
Let's introduce and use a mechanism to skip these tests to save
some resources.
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
I noticed recently that the leak-checking jobs still take a lot of time,
and upon analysis, the git-svn tests contribute significantly to this.
Analyzing a recent CI run, I saw that the Git test suite contains
1,017 tests, running for approximately 5¼ hours total. Of these, 65
git-svn-related tests (~6% of test count) took 42.24 minutes combined,
accounting for ~13.% of the total runtime. This implies that the git-svn
tests are roughly twice as expernsive compared to the other tests.
However, testing git-svn in the leak-checking jobs provides minimal
value: git-svn is implemented as a Perl script, and leak checking only
handles C code. While git-svn does call into Git's built-in commands
that are implemented in C, these are standard Git operations that are
already thoroughly exercised elsewhere in the test suite. Therefore,
running the git-svn tests in the leak-checking jobs only adds to the
overall run time with little value in return.
Given that the leak-checking jobs are particularly time-intensive and
these 42+ minutes of SVN tests per job provide no additional leak
detection value, skip them in the *-leaks jobs to reduce CI runtime.
Assisted-by: Claude Sonnet 4.5 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tian Yuchen [Sat, 17 Jan 2026 06:25:15 +0000 (14:25 +0800)]
t1005: modernize "! test -f" to "test_path_is_missing"
Replace instances of "! test -f <file>" with "test_path_is_missing <file>".
This macro provides better diagnostics when the test fails (it prints
"Path exists:" instead of silently failing).
Signed-off-by: Tian Yuchen <a3205153416@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jiang Xin [Sat, 17 Jan 2026 13:59:38 +0000 (21:59 +0800)]
help: report on whether or not gettext is enabled
When users report that Git has no localized output, we need to check not
only their locale settings, but also whether Git was built with GETTEXT
support in the first place.
Expose this information via the existing build info output by adding a
"gettext: enabled" line to `git version --build-options` (and therefore
also to `git bugreport`) when `NO_GETTEXT` is not defined at build time.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
LorenzoPegorari [Fri, 16 Jan 2026 00:05:38 +0000 (01:05 +0100)]
t4073: add test for diffstat paths length when containing UTF-8 chars
Add test checking the length of filepaths containing UTF-8 chars when
generating a diffstat with various `name-width`s.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
[jc: fixed up t/meson.build to spell the name of the new test file correctly] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Fri, 16 Jan 2026 20:39:56 +0000 (20:39 +0000)]
t0610-reftable-basics: mitigate a flaky test on cygwin
Test #29 ('ref transaction: corrupted tables cause failure') started to
fail intermittently for me (from v2.52.0-rc0) when running the testsuite
with '-j8'. (Also, having moved to a new laptop and windows 11, rather
than windows 10). If the test is run by hand, or without any parallelism,
then it passes without issue.
When the test fails (e.g. 1 out of 32 parallel runs) the cause is due to
a permission error while corrupting a table file:
./test-lib.sh: line 1010: .git/reftable/0x000000000001-0x000000000002-d89bb8ee.ref: Permission denied
This corruption is done in a shell loop, directly after a 'test_commit',
which uses an ': >"$f"' expression to truncate the file. Adding a sleep
of one second after the 'test_commit' and before the shell loop fixes
the test (it is not clear why). Replacing the redirection shell expression
with a 'test-tool truncate "$f" 0' invocation also provides a fix, which
could simply be another way to change the timing sufficiently to win the
race.
During a debug session, I tried looking at the strace output for the
shell redirection:
$ rm /tmp/hello; echo hello >/tmp/hello; ls -l /tmp/hello
-rw-r--r-- 1 ramsay None 6 Nov 10 17:25 /tmp/hello
$
When comparing the output, the differences seemed to be what you would
expect and, if anything, the shell redirect probably would have taken
longer than the test-tool solution (many fcntl() calls to dup the stdout
to the <fd>). The call to the win32 api NtCreateFile() was identical,
apart from the first (FileHandle) parameter, of course.
In order to fix this flaky test on cygwin, despite not knowing why it
works, replace the shell redirection with the above 'test-tool truncate'
invocation.
Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Fri, 16 Jan 2026 20:39:44 +0000 (20:39 +0000)]
t9700/test.pl: fix path type expectation on cygwin
Commit 4ec7ac101b ("t9700: accommodate for Windows paths", 2025-12-17)
changed the type of the absolute path to the git directory from unix to
win32 for both GfW and cygwin. This fixed the test for GfW but causes
new failures on cygwin, since the test expectation is that it uses unix
paths on cygwin. In order to not break cygwin, disable the new code by
removing the "or $^O eq 'cygwin'" sub-expression from the conditional
part of the fix.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 16 Jan 2026 20:40:28 +0000 (12:40 -0800)]
Merge branch 'kh/doc-patch-id'
"git patch-id" documentation updates.
* kh/doc-patch-id:
doc: patch-id: --verbatim locks in --stable
doc: patch-id: spell out the git-diff-tree(1) form
doc: patch-id: use definite article for the result
patch-id: use “patch ID” throughout
doc: patch-id: capitalize Git version
doc: patch-id: don’t use semicolon between bullet points
LorenzoPegorari [Fri, 16 Jan 2026 00:05:03 +0000 (01:05 +0100)]
diff: improve scaling of filenames in diffstat to handle UTF-8 chars
The `show_stats()` function tries to scale the filenames in the diffstat to
ensure they don't exceed the given `name-width`. It does so by calculating
the "display width" of the characters to be dropped, but then advances the
filename pointer by that number of bytes.
However, the "display width" of a character is not always equal to its byte
count. The result is that sometimes, when displaying UTF-8 characters,
filenames exceed the given `name-width`, and frequently the bytes of the
UTF-8 characters are truncated.
The following is an example of the issue, where the 2 files are "HelloHi" and
"Hello你好", and `name-width=6`:
...oHi | 0
...<BD><A0>好 | 0
Make the filename pointer move by the actual number of bytes of the
characters to drop from the filename, rather than their display width, using
the `utf8_width()` function.
Force `len` to not be less than 0 (this happens if the given `name-width` is
2 or less), otherwise an infinite loop is entered.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Thu, 15 Jan 2026 22:01:25 +0000 (23:01 +0100)]
cocci: remove obsolete the_repository rules
035c7de9e9e (cocci: apply the "revision.h" part of
"the_repository.pending", 2023-03-28) removed the last of the repo-less
functions and macros mentioned in the_repository.cocci at the time. No
stragglers appeared since then. Remove the applied rules now that they
have outlived their usefulness.
Also add a reminder to eventually remove the just added rules for
tree.h.
Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems to have caused a few regressions, two of the three known
ones we have proposed solutions for. Let's give ourselves a bit
more room to maneuver during the pre-release freeze period and
restart once the 2.53 ships.