When a caller holds a `struct odb_source`, they have no way of telling
what type the source is. This doesn't really cause any problems in the
current status quo as we only have a single type anyway, "files". But
going forward we expect to add more types, and if so it will become
necessary to tell the sources apart.
Introduce a new enum to cover this use case and assert that the given
source actually matches the target source when performing the downcast.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: move reparenting logic into respective subsystems
The primary object database source may be initialized with a relative
path. When the process changes its current working directory we thus
have to update this path and have it point to the same path, but
relative to the new working directory.
This logic is handled in the object database layer. It consists of three
steps:
1. We undo any potential temporary object directory, which are used
for transactions. This is done so that we don't end up modifying
the temporary object database source that got applied for the
transaction.
2. We then iterate through the non-transactional sources and reparent
their respective paths.
3. We reapply the temporary object directory, but update its path.
All of this logic is heavily tied to how the object database source
handles paths in the first place. It's an internal implementation
detail, and as sources may not even use an on-disk path at all it is not
a mechanism that applies to all potential sources.
Refactor the code so that the logic to reparent the sources is hosted by
the "files" source and the temporary object directory subsystems,
respectively. This logic is easier to reason about, but it also ensures
that this logic is handled at the correct level.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "files" backend is implemented as a pointer in the `struct
odb_source`. This contradicts our typical pattern for pluggable backends
like we use it for example in the ref store or for object database
streams, where we typically embed the generic base structure in the
specialized implementation. This pattern has a couple of small benefits:
- We avoid an extra allocation.
- We hide implementation details in the generic structure.
- We can easily downcast from a generic backend to the specialized
structure and vice versa because the offsets are known at compile
time.
- It becomes trivial to identify locations where we depend on backend
specific logic because the cast needs to be explicit.
Refactor our "files" object database source to do the same and embed the
`struct odb_source` in the `struct odb_source_files`.
There are still a bunch of sites in our code base where we do have to
access internals of the "files" backend. The intent is that those will
go away over time, but this will certainly take a while. Meanwhile,
provide a `odb_source_files_downcast()` function that can convert a
generic source into a "files" source.
As we only have a single source the downcast succeeds unconditionally
for now. Eventually though the intent is to make the cast `BUG()` in
case the caller requests to downcast a non-"files" backend to a "files"
backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new "files" object database source. This source encapsulates
access to both loose object files and the packfile store, similar to how
the "files" backend for refs encapsulates access to loose refs and the
packed-refs file.
Note that for now the "files" source is still a direct member of a
`struct odb_source`. This architecture will be reversed in the next
commit so that the files source contains a `struct odb_source`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: split `struct odb_source` into separate header
Subsequent commits will expand the `struct odb_source` to become a
generic interface for accessing an object database source. As part of
these refactorings we'll add a set of function pointers that will
significantly expand the structure overall.
Prepare for this by splitting out the `struct odb_source` into a
separate header. This keeps the high-level object database interface
detached from the low-level object database sources.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the setup logic into a 'test_expect_success' block.
This ensures that the code is properly tracked by the test harness.
Additionally, we use the 'test_when_finished' helper at the start of
the block to ensure that the 'import' directory is removed even if the
test fails.
This is cleaner than the previous manual 'rm -rf import' approach.
Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Thu, 5 Mar 2026 17:01:15 +0000 (17:01 +0000)]
send-email: pass smtp hostname and port to Authen::SASL
Starting from version 2.2000, Authen::SASL supports passing the SMTP
server hostname and port to the OAUTHBEARER string passed via SMTP AUTH.
Add support for the same in git-send-email.
It's safe to add the new parameters unconditionally as older versions of
Authen::SASL will simply ignore them without any error. Something
similar is already being done for the authname parameter, which is not
supported by every authentication mechanism. This can be understood as
declaring a variable but not using at all.
meson: detect broken iconv that requires ICONV_RESTART_RESET
In d0cec08d70 (utf8.c: prepare workaround for iconv under macOS 14/15,
2026-01-12) we have introduced a new workaround for a broken version of
libiconv on macOS. This workaround has for now only been wired up for
our Makefile, so using Meson with such a broken version will fail.
We can rather easily detect the broken behaviour. Some encodings have
different modes that can be switched to via an escape sequence. In the
case of ISO-2022-JP this can be done via "<Esc>$B" and "<Esc>(J" to
switch between ASCII and JIS modes. The bug now triggers when one does
multiple calls to iconv(3p) to convert a string piece by piece, where
the first call enters JIS mode. The second call forgets about the fact
that it is still in JIS mode, and consequently it will incorrectly treat
the input as ASCII, and thus the produced output is of course garbage.
Wire up a test that exercises this in Meson and, if it fails, set the
`ICONV_RESTART_RESET` define.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kufoiji [Thu, 5 Mar 2026 10:05:26 +0000 (11:05 +0100)]
builtin/rev-list: migrate missing_objects cleanup to oidmap_clear_with_free()
As part of the conversion away from oidmap_clear(), switch the
missing_objects map to use oidmap_clear_with_free().
missing_objects stores struct missing_objects_map_entry instances,
which own an xstrdup()'d path string in addition to the container
struct itself. Previously, rev-list manually freed entry->path
before calling oidmap_clear(&missing_objects, true).
Introduce a dedicated free callback and pass it to
oidmap_clear_with_free(), consolidating entry teardown into a
single place and making cleanup semantics explicit.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kufoiji [Thu, 5 Mar 2026 10:05:25 +0000 (11:05 +0100)]
oidmap: make entry cleanup explicit in oidmap_clear
Replace oidmap's use of hashmap_clear_() and layout-dependent freeing
with an explicit iteration and optional free callback. This removes
reliance on struct layout assumptions while keeping the existing API
intact.
Add tests for oidmap_clear_with_free behavior.
test_oidmap__clear_with_free_callback verifies that entries are freed
when a callback is provided, while
test_oidmap__clear_without_free_callback verifies that entries are not
freed when no callback is given. These tests ensure the new clear
implementation behaves correctly and preserves ownership semantics.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"fsck" iterates over packfiles and its access to pack data caused
the list to be permuted, which caused it to loop forever; the code
to access pack data by "fsck" has been updated to avoid this.
* ps/fsck-stream-from-the-right-object-instance:
pack-check: fix verification of large objects
packfile: expose function to read object stream for an offset
object-file: adapt `stream_object_signature()` to take a stream
t/helper: improve "genrandom" test helper
The core.attributesfile is intended to be set per repository, but
were kept track of by a single global variable in-core, which has
been corrected by moving it to per-repository data structure.
* ob/core-attributesfile-in-repository:
environment: move "branch.autoSetupMerge" into `struct repo_config_values`
environment: stop using core.sparseCheckout globally
environment: stop storing `core.attributesFile` globally
Junio C Hamano [Wed, 4 Mar 2026 18:53:02 +0000 (10:53 -0800)]
Merge branch 'ds/config-list-with-type'
"git config list" is taught to show the values interpreted for
specific type with "--type=<X>" option.
* ds/config-list-with-type:
config: use an enum for type
config: restructure format_config()
config: format colors quietly
color: add color_parse_quietly()
config: format expiry dates quietly
config: format paths gently
config: format bools or strings in helper
config: format bools or ints gently
config: format bools gently
config: format int64s gently
config: make 'git config list --type=<X>' work
config: add 'gently' parameter to format_config()
config: move show_all_config()
Mark the marge-ort codebase to prevent more uses of the_repository
from getting added.
* en/merge-ort-almost-wo-the-repository:
replay: prevent the_repository from coming back
merge-ort: prevent the_repository from coming back
merge-ort: replace the_hash_algo with opt->repo->hash_algo
merge-ort: replace the_repository with opt->repo
merge-ort: pass repository to write_tree()
merge,diff: remove the_repository check before prefetching blobs
Junio C Hamano [Wed, 4 Mar 2026 18:53:01 +0000 (10:53 -0800)]
Merge branch 'lo/repo-leftover-bits'
Clean-up the code around "git repo info" command.
* lo/repo-leftover-bits:
Documentation/git-repo: capitalize format descriptions
Documentation/git-repo: replace 'NUL' with '_NUL_'
t1901: adjust nul format output instead of expected value
t1900: rename t1900-repo to t1900-repo-info
repo: rename struct field to repo_info_field
repo: replace get_value_fn_for_key by get_repo_info_field
repo: rename repo_info_fields to repo_info_field
CodingGuidelines: instruct to name arrays in singular
Junio C Hamano [Wed, 4 Mar 2026 18:53:01 +0000 (10:53 -0800)]
Merge branch 'ps/maintenance-geometric-default'
"git maintenance" starts using the "geometric" strategy by default.
* ps/maintenance-geometric-default:
builtin/maintenance: use "geometric" strategy by default
t7900: prepare for switch of the default strategy
t6500: explicitly use "gc" strategy
t5510: explicitly use "gc" strategy
t5400: explicitly use "gc" strategy
t34xx: don't expire reflogs where it matters
t: disable maintenance where we verify object database structure
t: fix races caused by background maintenance
Junio C Hamano [Wed, 4 Mar 2026 18:52:59 +0000 (10:52 -0800)]
Merge branch 'rr/gitweb-mobile'
"gitweb" has been taught to be mobile friendly.
* rr/gitweb-mobile:
gitweb: let page header grow on mobile for long wrapped project names
gitweb: fix mobile footer overflow by wrapping text and clearing floats
gitweb: fix mobile page overflow across log/commit/blob/diff views
gitweb: prevent project search bar from overflowing on mobile
gitweb: add viewport meta tag for mobile devices
Junio C Hamano [Wed, 4 Mar 2026 18:52:58 +0000 (10:52 -0800)]
Merge branch 'kn/ref-location'
Allow the directory in which reference backends store their data to
be specified.
* kn/ref-location:
refs: add GIT_REFERENCE_BACKEND to specify reference backend
refs: allow reference location in refstorage config
refs: receive and use the reference storage payload
refs: move out stub modification to generic layer
refs: extract out `refs_create_refdir_stubs()`
setup: don't modify repo in `create_reference_database()`
Harald Nordgren [Wed, 4 Mar 2026 12:25:31 +0000 (12:25 +0000)]
status: clarify how status.compareBranches deduplicates
The order of output when multiple branches are specified on the
configuration variable was not clearly spelled out in the
documentation.
Add a paragraph to describe the order and also how the branches are
deduplicated. Update t6040 with additional tests to illustrate how
multiple branches are shown and deduplicated.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
[jc: made a whole replacement into incremental; wrote log message.] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tian Yuchen [Wed, 4 Mar 2026 14:15:26 +0000 (22:15 +0800)]
setup: improve error diagnosis for invalid .git files
'read_gitfile_gently()' treats any non-regular file as
'READ_GITFILE_ERR_NOT_A_FILE' and fails to discern between 'ENOENT'
and other stat failures. This flawed error reporting is noted by two
'NEEDSWORK' comments.
Address these comments by introducing two new error codes:
'READ_GITFILE_ERR_MISSING'(which groups the "file missing" scenarios
together) and 'READ_GITFILE_ERR_IS_A_DIR':
1. Update 'read_gitfile_error_die()' to treat 'IS_A_DIR', 'MISSING',
'NOT_A_FILE' and 'STAT_FAILED' as non-fatal no-ops. This accommodates
intentional non-repo scenarios (e.g., GIT_DIR=/dev/null).
2. Explicitly catch 'NOT_A_FILE' and 'STAT_FAILED' during
discovery and call 'die()' if 'die_on_error' is set.
3. Unconditionally pass '&error_code' to 'read_gitfile_gently()'.
4. Only invoke 'is_git_directory()' when we explicitly receive
'READ_GITFILE_ERR_IS_A_DIR', avoiding redundant checks.
Additionally, audit external callers of 'read_gitfile_gently()' in
'submodule.c' and 'worktree.c' to accommodate the refined error codes.
Signed-off-by: Tian Yuchen <a3205153416@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
K Jayatheerth [Wed, 4 Mar 2026 13:05:02 +0000 (18:35 +0530)]
path: remove redundant function calls
repo_settings_get_shared_repository() is invoked multiple times in
calc_shared_perm(). While the function internally caches the value,
repeated calls still add unnecessary noise.
Store the result in a local variable and reuse it instead. This makes
it explicit that the value is expected to remain constant and avoids
repeated calls in the same scope.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
K Jayatheerth [Wed, 4 Mar 2026 13:05:01 +0000 (18:35 +0530)]
path: use size_t for dir_prefix length
The strlen() function returns a size_t. Storing this in a standard
signed int is a bad practice that invites overflow vulnerabilities if
paths get absurdly long.
Switch the variable to size_t. This is safe to do because 'len' is
strictly used as an argument to strncmp() (which expects size_t) and
as a positive array index, involving no signed arithmetic that could
rely on negative values.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wolfgang Faust [Wed, 4 Mar 2026 01:30:52 +0000 (17:30 -0800)]
git-gui: grey out comment lines in commit message
Comment lines are stripped by wash_commit_message, but there is no
indication in the UI that they are special and will be removed.
Grey these lines out to indicate that they will be removed.
Signed-off-by: Wolfgang Faust <contrib-git@wolfgangfaust.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Nasser Grainawi [Tue, 3 Mar 2026 23:40:44 +0000 (15:40 -0800)]
submodule: fetch missing objects from default remote
When be76c21282 (fetch: ensure submodule objects fetched, 2018-12-06)
added support for fetching a missing submodule object by id, it
hardcoded the remote name as "origin" and deferred anything more
complicated for a later patch. Implement the NEEDSWORK item to remove
the hardcoded assumption by adding and using a submodule helper subcmd
'get-default-remote'. Fixing this lets 'git fetch --recurse-submodules'
succeed when the fetched commit(s) in the superproject trigger a
submodule fetch, and that submodule's default remote name is not
"origin".
Add non-"origin" remote tests to t5526-fetch-submodules.sh and
t5572-pull-submodule.sh demonstrating this works as expected and add
dedicated tests for get-default-remote.
Signed-off-by: Nasser Grainawi <nasser.grainawi@oss.qualcomm.com> Reviewed-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is quite a common use case that one wants to split up one commit into
multiple commits by moving parts of the changes of the original commit
out into a separate commit. This is quite an involved operation though:
1. Identify the commit in question that is to be dropped.
2. Perform an interactive rebase on top of that commit's parent.
3. Modify the instruction sheet to "edit" the commit that is to be
split up.
4. Drop the commit via "git reset HEAD~".
5. Stage changes that should go into the first commit and commit it.
6. Stage changes that should go into the second commit and commit it.
7. Finalize the rebase.
This is quite complex, and overall I would claim that most people who
are not experts in Git would struggle with this flow.
Introduce a new "split" subcommand for git-history(1) to make this way
easier. All the user needs to do is to say `git history split $COMMIT`.
From hereon, Git asks the user which parts of the commit shall be moved
out into a separate commit and, once done, asks the user for the commit
message. Git then creates that split-out commit and applies the original
commit on top of it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/history: split out extended function to create commits
In the next commit we're about to introduce a new command that splits up
a commit into two. Most of the logic will be shared with rewording
commits, except that we also need to have control over the parents and
the old/new trees.
Extract a new function `commit_tree_with_edited_message_ext()` to
prepare for this commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `write_in_core_index_as_tree()` takes a repository and
writes its index into a tree object. What this function cannot do though
is to take an _arbitrary_ in-memory index.
Introduce a new `struct index_state` parameter so that the caller can
pass a different index than the one belonging to the repository. This
will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "add-patch" mode allows the user to edit hunks to apply custom
changes. This is incompatible with a new `git history split` command
that we're about to introduce in a subsequent commit, so we need a way
to disable this mode.
Add a new flag to disable editing hunks.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: add support for in-memory index patching
With `run_add_p()` callers have the ability to apply changes from a
specific revision to a repository's index. This infra supports several
different modes, like for example applying changes to the index,
working tree or both.
One feature that is missing though is the ability to apply changes to an
in-memory index different from the repository's index. Add a new
function `run_add_p_index()` to plug this gap.
This new function will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: remove dependency on "add-interactive" subsystem
With the preceding commit we have split out interactive configuration
that is used by both "git add -p" and "git add -i". But we still
initialize that configuration in the "add -p" subsystem by calling
`init_add_i_state()`, even though we only do so to initialize the
interactive configuration as well as a repository pointer.
Stop doing so and instead store and initialize the interactive
configuration in `struct add_p_state` directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct add_p_opt` is reused both by our infra for "git add -p" and
"git add -i". Users of `run_add_i()` for example are expected to pass
`struct add_p_opt`. This is somewhat confusing and raises the question
of which options apply to what part of the stack.
But things are even more confusing than that: while callers are expected
to pass in `struct add_p_opt`, these options ultimately get used to
initialize a `struct add_i_state` that is used by both subsystems. So we
are basically going full circle here.
Refactor the code and split out a new `struct interactive_options` that
hosts common options used by both. These options are then applied to a
`struct interactive_config` that hosts common configuration.
This refactoring doesn't yet fully detangle the two subsystems from one
another, as we still end up calling `init_add_i_state()` in the "git add
-p" subsystem. This will be fixed in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: split out header from "add-interactive.h"
While we have a "add-patch.c" code file, its declarations are part of
"add-interactive.h". This makes it somewhat harder than necessary to
find relevant code and to identify clear boundaries between the two
subsystems.
Split up concerns and move declarations that relate to "add-patch.c"
into a new "add-patch.h" header.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3700: use test_grep helper for better diagnostics
Replace 'grep' and '! grep' invocations with 'test_grep' and
'test_grep !'. This provides better debugging output if tests fail
in the future, as 'test_grep' will automatically print the
contents of the file when a check fails.
While at it, update any remaining instances of 'grep' to 'test_grep'
that were missed in the previous versions to ensure that the entire
file is consistent with modern project style.
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace pipelines involving git commands with temporary files (actual)
to ensure that any crashes or unexpected exit codes from the git
commands are properly caught by the test suite. A simple pipeline
like 'git foo | grep bar' ignores the exit code of 'git', which
can hide regressions.
In cases where we were counting lines with 'wc -l' to ensure a
pattern was absent, simplify the logic to use '! grep' to avoid
subshells entirely.
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Mon, 2 Mar 2026 08:14:48 +0000 (09:14 +0100)]
gitk: commit translation files without file information
File information in the translation files is only helpful for the
translators, but is not needed to compile the message catalogs. On top
of that, file information is rather volatile and leads to large patches
that do not carry essential information. For this reason, Git project
has opted to remove the file information from its translation files.
Let's do that in this project, too.
Rewrite the update-po target to generate *.po files that do contain
file information for the benefit of translators. Configure a clean
filter under the name "gettext-no-location", which is the same that
the Git project uses. It is expected that translators have already
configured their repository suitably. Nevertheless, write a reminder
as part of the update-po target.
Junio C Hamano [Tue, 3 Mar 2026 19:08:13 +0000 (11:08 -0800)]
Merge branch 'hy/diff-lazy-fetch-with-break-fix'
A prefetch call can be triggered to access a stale diff_queue entry
after diffcore-break breaks a filepair into two and freed the
original entry that is no longer used, leading to a segfault, which
has been corrected.
* hy/diff-lazy-fetch-with-break-fix:
diffcore-break: avoid segfault with freed entries
Junio C Hamano [Tue, 3 Mar 2026 19:08:12 +0000 (11:08 -0800)]
Merge branch 'cs/subtree-split-fixes'
An earlier attempt to optimize "git subtree" discarded too much
relevant histories, which has been corrected.
* cs/subtree-split-fixes:
contrib/subtree: process out-of-prefix subtrees
contrib/subtree: test history depth
contrib/subtree: capture additional test-cases
Derrick Stolee [Tue, 3 Mar 2026 17:31:53 +0000 (17:31 +0000)]
for-each-repo: work correctly in a worktree
When run in a worktree, the GIT_DIR directory is set in a different way
than in a typical repository. Show this by updating t0068 to include a
worktree and add a test that runs from that worktree. This requires
moving the repo.key config into a global config instead of the base test
repository's local config (demonstrating that it worked with
non-worktree Git repositories).
We need to be careful to unset the local Git environment variables and
let the child process rediscover them, while also reinstating those
variables in the parent process afterwards. Update run_command_on_repo()
to use the new sanitize_repo_env() helper method to erase these
environment variables.
During review of this bug fix, there were several incorrect patches
demonstrating different bad behaviors. Most of these are covered by
tests, when it is not too expensive to set it up. One case that would be
expensive to set up is the GIT_NO_REPLACE_OBJECTS environment variable,
but we trust that using sanitize_repo_env() will be sufficient to
capture these uncovered cases by using the common code for resetting
environment variables.
Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Tue, 3 Mar 2026 17:31:52 +0000 (17:31 +0000)]
run-command: extract sanitize_repo_env helper
The current prepare_other_repo_env() does two distinct things:
1. Strip certain known environment variables that should be set by a
child process based on a different repository.
2. Set the GIT_DIR variable to avoid repository discovery.
The second item is valuable for child processes that operate on
submodules, where the repo discovery could be mistaken for the parent
repository.
In the next change, we will see an important case where only the first
item is required as the GIT_DIR discovery should happen naturally from
the '-C' parameter in the child process.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Tue, 3 Mar 2026 17:31:51 +0000 (17:31 +0000)]
for-each-repo: test outside of repo context
The 'git for-each-repo' tool is frequently run outside of a repo context
in the real world. For example, it powers background maintenance.
Despite this typical case, we have not been testing it without a local
repository.
Update t0068 to stop creating a test repo and to use global config
everywhere. This has some subtle changes to test across the file.
This was noticed because an earlier attempt to remove the_repository
from builtin/for-each-repo.c did not catch a segmentation fault since
the passed 'repo' is NULL. This use of the_repository will need to stay
until we have a better way to handle config queries outside of a repo
context. Similar use still exists in builtin/config.c for the same
reason.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdiff: re-diff shifted change groups when using histogram algorithm
After a diff algorithm has been run, the compaction phase
(xdl_change_compact()) shifts and merges change groups to produce a
cleaner output. However, this shifting could create a new matched group
where both sides now have matching lines. This results in a
wrong-looking diff output which contains redundant lines that are the
same on both files.
Fix this by detecting this situation, and re-diff the texts on each side
to find similar lines, using the fall-back Myer's diff. Only do this for
histogram diff as it's the only algorithm where this is relevant. Below
contains an example, and more details.
For an example, consider two files below:
file1:
A
A
A
A
A
A
A
file2:
A
A
x
A
A
A
A
When using Myer's diff, the algorithm finds that only the "x" has been
changed, and produces a final diff result (these are line diffs, but
using word-diff syntax for ease of presentation):
A A[-A-]{+x+}A AAA
When using histogram diff, the algorithm first discovers the LCS "A
AAA", which it uses as anchor, then produces an intermediate diff:
{+A Ax+}A AAA[- AAA-].
This is a longer diff than Myer's, but it's still self-consistent.
However, the compaction phase attempts to shift the first file's diff
group upwards (note that this shift crosses the anchor that histogram
had used), leading to the final results for histogram diff:
[-A AA-]{+A Ax+}A AAA
This is a technically correct patch but looks clearly redundant to a
human as the first 3 lines should not be in the diff.
The fix would detect that a shift has caused matching to a new group,
and re-diff the "A AA" and "A Ax" parts, which results in "A A"
correctly re-marked as unchanged. This creates the now correct histogram
diff:
A A[-A-]{+x+}A AAA
This issue is not applicable to Myer's diff algorithm as it already
generates a minimal diff, which means a shift cannot result in a smaller
diff output (the default Myer's diff in xdiff is not guaranteed to be
minimal for performance reasons, but it typically does a good enough
job).
It's also not applicable to patience diff, because it uses only unique
lines as anchor for its splits, and falls back to Myer's diff within
each split. Shifting requires both ends having the same lines, and
therefore cannot cross the unique line boundaries established by the
patience algorithm. In contrast histogram diff uses non-unique lines as
anchors, and therefore shifting can cross over them.
This issue is rare in a normal repository. Below is a table of
repositories (`git log --no-merges -p --histogram -1000`), showing how
many times a re-diff was done and how many times it resulted in finding
matching lines (therefore addressing this issue) with the fix. In
general it is fewer than 1% of diff's that exhibit this offending
behavior:
Junio C Hamano [Tue, 3 Mar 2026 01:06:53 +0000 (17:06 -0800)]
Merge branch 'jt/object-file-use-container-of'
Code clean-up.
* jt/object-file-use-container-of:
object-file.c: avoid container_of() of a NULL container
object-file: use `container_of()` to convert from base types
Junio C Hamano [Tue, 3 Mar 2026 01:06:53 +0000 (17:06 -0800)]
Merge branch 'ps/receive-pack-shallow-optim'
The code to accept shallow "git push" has been optimized.
* ps/receive-pack-shallow-optim:
commit: use commit graph in `lookup_commit_reference_gently()`
commit: make `repo_parse_commit_no_graph()` more robust
commit: avoid parsing non-commits in `lookup_commit_reference_gently()`
Junio C Hamano [Tue, 3 Mar 2026 01:06:52 +0000 (17:06 -0800)]
Merge branch 'ps/object-info-bits-cleanup'
A couple of bugs in use of flag bits around odb API has been
corrected, and the flag bits reordered.
* ps/object-info-bits-cleanup:
odb: convert `odb_has_object()` flags into an enum
odb: convert object info flags into an enum
odb: drop gaps in object info flag values
builtin/fsck: fix flags passed to `odb_has_object()`
builtin/backfill: fix flags passed to `odb_has_object()`
"git subtree split --prefix=P <commit>" now checks the prefix P
against the tree of the (potentially quite different from the
current working tree) given commit.
* ps/validate-prefix-in-subtree-split:
subtree: validate --prefix against commit in split
A signature on a commit that was GPG signed long time ago ought to
be still valid after the key that was used to sign it has expired,
but we showed them in alarming red.
* uk/signature-is-good-after-key-expires:
gpg-interface: signatures by expired keys are fine
Junio C Hamano [Tue, 3 Mar 2026 01:06:50 +0000 (17:06 -0800)]
Merge branch 'ps/odb-for-each-object'
Revamp object enumeration API around odb.
* ps/odb-for-each-object:
odb: drop unused `for_each_{loose,packed}_object()` functions
reachable: convert to use `odb_for_each_object()`
builtin/pack-objects: use `packfile_store_for_each_object()`
odb: introduce mtime fields for object info requests
treewide: drop uses of `for_each_{loose,packed}_object()`
treewide: enumerate promisor objects via `odb_for_each_object()`
builtin/fsck: refactor to use `odb_for_each_object()`
odb: introduce `odb_for_each_object()`
packfile: introduce function to iterate through objects
packfile: extract function to iterate through objects of a store
object-file: introduce function to iterate through objects
object-file: extract function to read object info from path
odb: fix flags parameter to be unsigned
odb: rename `FOR_EACH_OBJECT_*` flags
Exit early if the hooks do not exist, to avoid spinning up/down
sideband async threads which no-op.
It is important to call the hook_exists() API provided by hook.[ch]
because it covers both config-defined hooks and the "traditional"
hooks from the hookdir. find_hook() only covers the hookdir hooks.
The regression happened because the no-op async threads add some
additional overhead which can be measured with the receive-refs test
of the benchmarks suite [1].
Fixes: fc148b146ad4 ("receive-pack: convert update hooks to new API") Reported-by: Patrick Steinhardt <ps@pks.im> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
[jc: avoid duplicated hardcoded hook names] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:26 +0000 (15:45 -0600)]
builtin/repo: find tree with most entries
The size of a tree object usually corresponds with the number of entries
it has. While iterating through objects in the repository for
git-repo-structure, identify the tree with the most entries and display
it in the output.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:25 +0000 (15:45 -0600)]
builtin/repo: find commit with most parents
Complex merge events may produce an octopus merge where the resulting
merge commit has more than two parents. While iterating through objects
in the repository for git-repo-structure, identify the commit with the
most parents and display it in the output.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:24 +0000 (15:45 -0600)]
builtin/repo: add OID annotations to table output
The "structure" output for git-repo(1) does not show the corresponding
OIDs for the largest objects in its "table" output. Update the output to
include a list of OID annotations with an index to the corresponding row
in the table.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:23 +0000 (15:45 -0600)]
builtin/repo: collect largest inflated objects
The "structure" output for git-repo(1) shows the total inflated and disk
sizes of reachable objects in the repository, but doesn't show the size
of the largest individual objects. Since an individual object may be a
large contributor to the overall repository size, it is useful for users
to know the maximum size of individual objects.
While interating across objects, record the size and OID of the largest
objects encountered for each object type to provide as output. Note that
the default "table" output format only displays size information and not
the corresponding OID. In a subsequent commit, the table format is
updated to add table annotations that mention the OID.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:22 +0000 (15:45 -0600)]
builtin/repo: add helper for printing keyvalue output
The machine-parsable formats for the git-repo(1) "structure" subcommand
print output in keyvalue pairs. Introduce the helper function
`print_keyvalue()` to remove some code duplication and improve
readability.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Mon, 2 Mar 2026 21:45:21 +0000 (15:45 -0600)]
builtin/repo: update stats for each object
When walking reachable objects in the repository, `count_objects()`
processes a set of objects and updates the `struct object_stats`. In
preparation for more granular statistics being collected, update the
`struct object_stats` for each individual object instead.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3310: replace test -f/-d with test_path_is_file/test_path_is_dir
Replace old-style path assertions with modern helpers that
provide clearer diagnostic messages on failure. When test -f
fails, the output gives no indication of what went wrong.
These instances were found using:
git grep "test -[efd]" t/
as suggested in the microproject ideas.
Signed-off-by: Francesco Paparatto <francescopaparatto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci: unset GITLAB_FEATURES envvar to not bust xargs(1) limits
We have started to see the following assert happen in our GitLab CI
pipelines for jobs that use Windows with Meson:
assertion "bc_ctl.arg_max >= LINE_MAX" failed: file "xargs.c", line 512, function: main
The assert in question verifies that we have enough room available to
pass at least `LINE_MAX` many bytes via the command line. The xargs(1)
binary in those jobs comes from Git for Windows, which in turn sources
the binaries from MSYS2, and has the following limits in place:
$ & "C:/Program Files/Git/usr/bin/bash.exe" -l -c 'xargs --show-limits </dev/null'
Your environment variables take up 17373 bytes
POSIX upper limit on argument length (this system): 12579
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 18446744073709546822
Size of command buffer we are actually using: 12579
Maximum parallelism (--max-procs must be no greater): 2147483647
What's interesting to see is the limit of 16 exabits for the maximum
command line length. This value might seem a bit high, and it is indeed
the result of an underflow: our environment is larger than the POSIX
upper limit on argument length, and the value is computed by subtracting
the former from the latter. So what we get is the result of `2^64 -
(17373 - 12579)`.
This makes it clear that the problem here is the size of our environment
variables. A listing sorted by length yields the following result:
The GITLAB_FEATURES environment variable makes up for roughly a third of
the complete environment. This variable is a comma-separated list of
features available for the GitLab instance, and seemingly it has been
growing over time as GitLab added more and more features.
Fix the issue by unsetting the environment variable in "ci/lib.sh". This
ensures that the environment variables are now smaller than the upper
limit on argument length again, and that in turn fixes the assert in
xargs(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
David Timber [Mon, 2 Mar 2026 03:16:41 +0000 (12:16 +0900)]
send-email: add client certificate options
For SMTP servers that do "mutual certificate verification", the mail
client is required to present its own TLS certificate as well. This
patch adds --smtp-ssl-client-cert and --smtp-ssl-client-key for such
servers.
The problem of which private key for the certificate is chosen arises
when there are private keys in both the certificate and private key
file. According to the documentation of IO::Socket::SSL(link supplied),
the behaviour(the private key chosen) depends on the format of the
certificate. In a nutshell,
- PKCS12: the key in the cert always takes the precedence
- PEM: if the key file is not given, it will "try" to read one
from the cert PEM file
Many users may find this discrepancy unintuitive.
In terms of client certificate, git-send-email is implemented in a way
that what's possible with perl's SSL library is exposed to the user as
much as possible. In this instance, the user may choose to use a PEM
file that contains both certificate and private key should be
at their discretion despite the implications.
Michael Montalbo [Sat, 28 Feb 2026 20:31:16 +0000 (20:31 +0000)]
diff: fix crash with --find-object outside repository
When "git diff --find-object=<oid>" is run outside a git repository,
the option parsing callback eagerly resolves the OID via
repo_get_oid(), which reaches get_main_ref_store() and hits a BUG()
assertion because no repository has been set up.
Check startup_info->have_repository before attempting to resolve the
OID, and return a user-friendly error instead.
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Paul Tarjan [Sat, 28 Feb 2026 17:37:57 +0000 (17:37 +0000)]
fsmonitor-watchman: fix variable reference and remove redundant code
The is_work_tree_watched() function in fsmonitor-watchman.sample has
two bugs:
1. Wrong variable in error check: After calling watchman_clock(), the
result is stored in $o, but the code checks $output->{error} instead
of $o->{error}. This means errors from the clock command are silently
ignored.
2. Double output violates protocol: When the retry path triggers (the
directory wasn't initially watched), output_result() is called with
the "/" flag, then launch_watchman() is called recursively which
calls output_result() again. This outputs two clock tokens to stdout,
but git's fsmonitor v2 protocol expects exactly one response.
Fix #1 by checking $o->{error} after watchman_clock().
Fix #2 by removing the recursive launch_watchman() call. The "/"
"everything is dirty" flag already tells git to do a full scan, and
git will call the hook again on the next invocation with a valid clock
token.
With the recursive call removed, the $retry guard is no longer needed
since it only existed to prevent infinite recursion. Remove it.
Apply the same fixes to the test helper scripts in t/t7519/.
Signed-off-by: Paul Tarjan <github@paulisageek.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: validate charset name in 8bit encoding prompt
When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,
Which 8bit encoding should I declare [UTF-8]? foo
After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,
Are you sure you want to use <foo> [y/N]? y
This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.
Add a semantic validation of the charset name using the
Encode::find_encoding() which is a bundled module of perl. If the encoding
is not recognized, warn the user and ask for confirmation before proceeding.
After this validation the lenght based validation becomes redundant and also
breaks flow, so change the regex of valid input to any non blank string.
Make the encoding warning logic specific to the 8bit prompt, also add a
unique confirmation prompt which reduces the load on ask(), and improves
maintainability.
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.
The intended flow is,
Declare which 8bit encoding to use [default: UTF-8]? foobar
'foobar' does not appear to be a valid charset name. Use it anyway [y/N]?
Wang Zichong [Sat, 28 Feb 2026 07:59:44 +0000 (07:59 +0000)]
gitk: support link color in the Preferences dialog
As a dark-theme user, I use the Preferences dialog to set colors
for gitk. The only color I cannot change via that dialog is the
link foreground color, which leads to using the default link color
on a dark background that makes it hard to read.
Make the link foreground color also configurable in the Gitk
Preferences dialog's Color tab, so users won't need to dig into
the code/manual to check if it is configurable and can simply set
the color there.
Signed-off-by: Wang Zichong <wangzichong@deepin.org> Signed-off-by: Johannes Sixt <j6t@kdbg.org>