odb/source: make `read_object_info()` function pluggable
Introduce a new callback function in `struct odb_source` to make the
function pluggable.
Note that this function is a bit less straightforward to convert
compared to the other functions. The reason here is that the logic to
read an object is:
1. We try to read the object. If it exists we return it.
2. If the object does not exist we reprepare the object database
source.
3. We then try reading the object info a second time in case the
reprepare caused it to appear.
The second read is only supposed to happen for the packfile store
though, as reading loose objects is not impacted by repreparing the
object database.
Ideally, we'd just move this whole logic into the ODB source. But that's
not easily possible because we try to avoid the reprepare unless really
required, which is after we have found out that no other ODB source
contains the object, either. So the logic spans across multiple ODB
sources, and consequently we cannot move it into an individual source.
Instead, introduce a new flag `OBJECT_INFO_SECOND_READ` that tells the
backend that we already tried to look up the object once, and that this
time around the ODB source should try to find any new objects that may
have surfaced due to an on-disk change.
With this flag, the "files" backend can trivially skip trying to re-read
the object as a loose object. Furthermore, as we know that we only try
the second read via the packfile store, we can skip repreparing loose
objects and only reprepare the packfile store.
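
The two-pass lookup described above can be sketched roughly as follows.
This is a toy model, not Git's actual code: every `sketch_` name, the
flag value, and the boolean fields are assumptions made for
illustration. Only the idea is taken from the message: the first pass
asks every source, and the second pass, marked by the flag, reprepares
packs only and skips loose objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; only the flag's purpose comes from the message. */
enum { SKETCH_OBJECT_INFO_SECOND_READ = 1 << 0 };

/* Toy source: a packed object becomes visible once packs are reprepared. */
struct sketch_source {
	bool has_loose;      /* object exists as a loose object */
	bool in_known_pack;  /* object sits in a pack we already know */
	bool in_unseen_pack; /* object sits in a pack that appeared on disk */
	int loose_lookups;   /* number of loose lookups, for illustration */
};

static bool sketch_source_read(struct sketch_source *src, unsigned flags)
{
	if (flags & SKETCH_OBJECT_INFO_SECOND_READ) {
		/* Second pass: reprepare packs only, never retry loose objects. */
		if (src->in_unseen_pack) {
			src->in_unseen_pack = false;
			src->in_known_pack = true;
		}
		return src->in_known_pack;
	}
	src->loose_lookups++;
	return src->has_loose || src->in_known_pack;
}

/* The cross-source part stays in the caller, not in any single source. */
static bool sketch_odb_read(struct sketch_source *sources, size_t nr)
{
	assert(sources || !nr);
	for (size_t i = 0; i < nr; i++)
		if (sketch_source_read(&sources[i], 0))
			return true;
	/* No source had the object, so only now pay for the reprepare. */
	for (size_t i = 0; i < nr; i++)
		if (sketch_source_read(&sources[i], SKETCH_OBJECT_INFO_SECOND_READ))
			return true;
	return false;
}
```

With two sources where only the second has a not-yet-seen pack, the
lookup succeeds on the second pass and each source performs exactly one
loose lookup.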
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a caller holds a `struct odb_source`, they have no way of telling
what type the source is. This doesn't really cause any problems in the
current status quo as we only have a single type anyway, "files". But
going forward we expect to add more types, and if so it will become
necessary to tell the sources apart.
Introduce a new enum to cover this use case and assert that the given
source actually matches the target source when performing the downcast.
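
A minimal sketch of such a type-checked downcast, assuming hypothetical
names throughout (the message only says that an enum is introduced and
that the downcast asserts on it). `assert()` stands in for whatever
check the real code uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tag; the real enum's name and members may differ. */
enum sketch_source_type {
	SKETCH_SOURCE_UNKNOWN = 0,
	SKETCH_SOURCE_FILES,
};

struct sketch_source {
	enum sketch_source_type type;
};

struct sketch_source_files {
	struct sketch_source base; /* must stay first for the cast below */
	const char *path;
};

static struct sketch_source_files *sketch_files_downcast(struct sketch_source *source)
{
	/* Refuse to reinterpret a source of a different type. */
	assert(source->type == SKETCH_SOURCE_FILES);
	return (struct sketch_source_files *)source;
}
```

A caller holding only a `struct sketch_source *` can then inspect
`source->type` before deciding whether backend-specific access is valid.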
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: move reparenting logic into respective subsystems
The primary object database source may be initialized with a relative
path. When the process changes its current working directory we thus
have to update this path and have it point to the same path, but
relative to the new working directory.
This logic is handled in the object database layer. It consists of three
steps:
1. We undo any potential temporary object directory, which is used
for transactions. This is done so that we don't end up modifying
the temporary object database source that got applied for the
transaction.
2. We then iterate through the non-transactional sources and reparent
their respective paths.
3. We reapply the temporary object directory, but update its path.
All of this logic is heavily tied to how the object database source
handles paths in the first place. It's an internal implementation
detail, and as sources may not even use an on-disk path at all it is not
a mechanism that applies to all potential sources.
Refactor the code so that the logic to reparent the sources is hosted by
the "files" source and the temporary object directory subsystems,
respectively. This makes the logic easier to reason about, and it also
ensures that the logic is handled at the correct level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "files" backend is implemented as a pointer in the `struct
odb_source`. This contradicts our typical pattern for pluggable
backends, as used for example in the ref store or for object database
streams, where we embed the generic base structure in the specialized
implementation. This pattern has a couple of small benefits:
- We avoid an extra allocation.
- We keep implementation details out of the generic structure.
- We can easily downcast from a generic backend to the specialized
structure and vice versa because the offsets are known at compile
time.
- It becomes trivial to identify locations where we depend on backend
specific logic because the cast needs to be explicit.
Refactor our "files" object database source to do the same and embed the
`struct odb_source` in the `struct odb_source_files`.
There are still a bunch of sites in our code base where we do have to
access internals of the "files" backend. The intent is that those will
go away over time, but this will certainly take a while. Meanwhile,
provide a `odb_source_files_downcast()` function that can convert a
generic source into a "files" source.
As we only have a single source type, the downcast succeeds unconditionally
for now. Eventually though the intent is to make the cast `BUG()` in
case the caller requests to downcast a non-"files" backend to a "files"
backend.
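
The embedding pattern can be illustrated with a small sketch. All names
here are hypothetical, including the `count_packs` callback, and the
generic part is deliberately not the first member to show that the
downcast relies only on a compile-time offset. This is not the layout
Git uses, just an instance of the technique.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_source {
	/* Hypothetical generic callback, for illustration only. */
	int (*count_packs)(struct sketch_source *source);
};

struct sketch_source_files {
	int nr_packs;              /* backend-specific state */
	struct sketch_source base; /* generic part, embedded by value */
};

/* The member offset is known at compile time, so this is plain arithmetic. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct sketch_source_files *sketch_files_downcast(struct sketch_source *source)
{
	assert(source);
	return sketch_container_of(source, struct sketch_source_files, base);
}

static int sketch_files_count_packs(struct sketch_source *source)
{
	/* Backend-specific code has to spell out the downcast explicitly. */
	return sketch_files_downcast(source)->nr_packs;
}

static void sketch_files_init(struct sketch_source_files *files, int nr_packs)
{
	files->nr_packs = nr_packs;
	files->base.count_packs = sketch_files_count_packs;
}
```

Generic callers only ever see `&files.base`; a single allocation holds
both the generic and the specialized state.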
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new "files" object database source. This source encapsulates
access to both loose object files and the packfile store, similar to how
the "files" backend for refs encapsulates access to loose refs and the
packed-refs file.
Note that for now the "files" source is still a direct member of a
`struct odb_source`. This architecture will be reversed in the next
commit so that the files source contains a `struct odb_source`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: split `struct odb_source` into separate header
Subsequent commits will expand the `struct odb_source` to become a
generic interface for accessing an object database source. As part of
these refactorings we'll add a set of function pointers that will
significantly expand the structure overall.
Prepare for this by splitting out the `struct odb_source` into a
separate header. This keeps the high-level object database interface
detached from the low-level object database sources.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 23 Feb 2026 21:48:48 +0000 (13:48 -0800)]
Merge branch 'ps/object-info-bits-cleanup' into ps/odb-sources
* ps/object-info-bits-cleanup:
odb: convert `odb_has_object()` flags into an enum
odb: convert object info flags into an enum
odb: drop gaps in object info flag values
builtin/fsck: fix flags passed to `odb_has_object()`
builtin/backfill: fix flags passed to `odb_has_object()`
Junio C Hamano [Mon, 23 Feb 2026 21:48:00 +0000 (13:48 -0800)]
Merge branch 'ps/odb-for-each-object' into ps/odb-sources
* ps/odb-for-each-object:
odb: drop unused `for_each_{loose,packed}_object()` functions
reachable: convert to use `odb_for_each_object()`
builtin/pack-objects: use `packfile_store_for_each_object()`
odb: introduce mtime fields for object info requests
treewide: drop uses of `for_each_{loose,packed}_object()`
treewide: enumerate promisor objects via `odb_for_each_object()`
builtin/fsck: refactor to use `odb_for_each_object()`
odb: introduce `odb_for_each_object()`
packfile: introduce function to iterate through objects
packfile: extract function to iterate through objects of a store
object-file: introduce function to iterate through objects
object-file: extract function to read object info from path
odb: fix flags parameter to be unsigned
odb: rename `FOR_EACH_OBJECT_*` flags
Junio C Hamano [Tue, 17 Feb 2026 21:30:41 +0000 (13:30 -0800)]
Merge branch 'yt/merge-file-outside-a-repository'
"git merge-file" can be run outside a repository, but it ignored
all configuration, even the per-user ones. The command now uses
available configuration files to find its customization.
* yt/merge-file-outside-a-repository:
merge-file: honor merge.conflictStyle outside of a repository
Junio C Hamano [Fri, 13 Feb 2026 21:39:26 +0000 (13:39 -0800)]
Merge branch 'jc/ci-test-contrib-too'
Test contrib/ things in CI to catch breakages before they enter the
"next" branch.
* jc/ci-test-contrib-too:
: Some of our downstream folks run more tests than we do and catch
: breakages in them, namely, where contrib/*/Makefile has "test" target.
: Let's make sure we fail upon accepting a new topic that breaks them in
: 'seen'.
ci: ubuntu: use GNU coreutils for dirname
test: optionally test contrib in CI
Junio C Hamano [Fri, 13 Feb 2026 21:39:25 +0000 (13:39 -0800)]
Merge branch 'jt/odb-transaction-per-source'
Transaction to create objects (or not) is currently tied to the
repository, but in the future a repository can have multiple object
sources, which may have different transaction mechanisms. Make the
odb transaction API per object source.
* jt/odb-transaction-per-source:
odb: transparently handle common transaction behavior
odb: prepare `struct odb_transaction` to become generic
object-file: rename transaction functions
odb: store ODB source in `struct odb_transaction`
Junio C Hamano [Fri, 13 Feb 2026 21:39:25 +0000 (13:39 -0800)]
Merge branch 'ps/commit-list-functions-renamed'
Rename three functions around the commit_list data structure.
* ps/commit-list-functions-renamed:
commit: rename `free_commit_list()` to conform to coding guidelines
commit: rename `reverse_commit_list()` to conform to coding guidelines
commit: rename `copy_commit_list()` to conform to coding guidelines
Junio C Hamano [Fri, 13 Feb 2026 21:39:25 +0000 (13:39 -0800)]
Merge branch 'tc/last-modified-not-a-tree'
Giving "git last-modified" a tree (not a commit-ish) died an
uncontrolled death, which has been corrected.
* tc/last-modified-not-a-tree:
last-modified: verify revision argument is a commit-ish
last-modified: remove double error message
last-modified: fix memory leak when more than one commit is given
last-modified: rewrite error message when more than one commit given
ISO C23 redefines strchr and friends that traditionally took
a const pointer and returned a non-const pointer derived from it to
preserve constness (i.e., if you ask for a substring in a const
string, you get a const pointer to the substring). Update code
paths that used non-const pointer to receive their results that did
not have to be non-const to adjust.
* cf/c23-const-preserving-strchr-updates-0:
gpg-interface: remove an unnecessary NULL initialization
global: constify some pointers that are not written to
odb: convert `odb_has_object()` flags into an enum
Following the reasoning in the preceding commit, convert the
`odb_has_object()` flags into an enum.
With this change, we would have caught the misuse of `odb_has_object()`
that was fixed in a preceding commit, as the compiler would have
generated a warning (promoted to an error here by `-Werror`):
    ../builtin/backfill.c:71:9: error: implicit conversion from enumeration type 'enum odb_object_info_flag' to different enumeration type 'enum odb_has_object_flag' [-Werror,-Wenum-conversion]
       70 |         if (!odb_has_object(ctx->repo->objects, &list->oid[i],
          |              ~~~~~~~~~~~~~~
       71 |                             OBJECT_INFO_FOR_PREFETCH))
          |                             ^~~~~~~~~~~~~~~~~~~~~~~~
    1 error generated.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the object info flags into an enum and adapt all functions that
receive these flags as parameters to use the enum instead of an integer.
This serves two purposes:
- The function signatures become more self-documenting, as callers
don't have to wonder which flags they expect.
- The compiler can warn when a wrong flag type is passed.
Note that the second benefit is somewhat limited. For example, when
or-ing multiple enum flags together the result will be an integer, and
the compiler will not warn about such use cases. But where it does help
is when a single flag of the wrong type is passed, as the compiler would
generate a warning in that case.
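
A small sketch of what the enum-typed signature buys and where it stops
helping. The flag names and values below are hypothetical and exist
only to make the example compile on its own.

```c
#include <assert.h>

/* Hypothetical flags for two unrelated APIs. */
enum sketch_info_flag {
	SKETCH_INFO_QUICK = 1 << 0,
	SKETCH_INFO_SKIP_FETCH = 1 << 1,
};

enum sketch_has_flag {
	SKETCH_HAS_RECHECK = 1 << 0,
};

/* The parameter type documents which family of flags is expected. */
static int sketch_wants_quick(enum sketch_info_flag flags)
{
	/* Reject bits that belong to no known flag of this family. */
	assert(!(flags & ~(SKETCH_INFO_QUICK | SKETCH_INFO_SKIP_FETCH)));
	return (flags & SKETCH_INFO_QUICK) != 0;
}
```

Calling `sketch_wants_quick(SKETCH_HAS_RECHECK)` passes a single flag of
the wrong enum type, which is the case a compiler with
`-Wenum-conversion` can diagnose. Calling it with
`SKETCH_INFO_QUICK | SKETCH_INFO_SKIP_FETCH` passes a plain integer, so
no such diagnostic is possible for or-ed values.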
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object info flag values have two gaps in their definitions, where
some bits are skipped over. These gaps don't really hurt, but they make
one wonder whether anything is going on and whether a subset of flags
might be defined somewhere else.
That's not the case though. Instead, this is a case of flags that have
been dropped in the past:
- The value 4 was used by `OBJECT_INFO_SKIP_CACHED`, removed in
  9c8a294a1a (sha1-file: remove OBJECT_INFO_SKIP_CACHED, 2020-01-02).
- The value 8 was used by `OBJECT_INFO_ALLOW_UNKNOWN_TYPE`, removed in
  ae24b032a0 (object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag,
  2025-05-16).
Close those gaps to avoid any more confusion.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/fsck: fix flags passed to `odb_has_object()`
In `mark_object()` we invoke `has_object()` with a value of 1. This is
somewhat fishy given that the function expects a bitset of flags, so any
behaviour that this results in is purely coincidental and may break at
any point in time.
The call to `has_object()` was originally introduced in 9eb86f41de
(fsck: do not lazy fetch known non-promisor object, 2020-08-05). The
intent here was to skip lazy fetches of promisor objects: we have
already verified that the object is not a promisor object, so if the
object is missing it indicates a corrupt repository.
The hardcoded value that we pass maps to `HAS_OBJECT_RECHECK_PACKED`,
which is probably the intended behaviour: `odb_has_object()` will not
fetch promisor objects unless `HAS_OBJECT_FETCH_PROMISOR` is passed, but
we may want to verify that no concurrent process has written the object
that we're trying to read.
Convert the code to use the named flag instead of the hardcoded
value.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/backfill: fix flags passed to `odb_has_object()`
The function `fill_missing_blobs()` receives an array of object IDs and
verifies for each of them whether the corresponding object exists. If it
doesn't exist, we add it to a set of objects and then batch-fetch all of
the objects at once.
The check for whether or not we already have the object is broken
though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
expects us to pass `HAS_OBJECT_*` flags. The flag expands to:
- `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
in case the object wasn't found. This makes sense, as we'd otherwise
reprepare the object database as many times as we have missing
objects.
- `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
not fetch the object in case it's missing. Again, this makes sense,
as we want to batch-fetch the objects.
This shows that we indeed want the equivalent of this flag, but of
course represented as `HAS_OBJECT_*` flags.
Luckily, the code is already working correctly. The
`OBJECT_INFO_FOR_PREFETCH` flag expands to `(1 << 3) | (1 << 4)`, neither
of which is a valid `HAS_OBJECT_*` flag. And if no valid flags are
passed, `odb_has_object()` ends up calling
`odb_read_object_info_extended()` with exactly the above two flags that
we wanted to set in the first place.
Of course, this is pure luck, and this can break any moment. So let's
fix this and correct the code to not pass any flags at all.
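
Why the wrong flags happened to work can be checked with a short
sketch. The combined value `(1 << 3) | (1 << 4)` comes from the message
above; which of the two bits belongs to which `OBJECT_INFO_*` flag, the
value of the second `HAS_OBJECT_*` flag, and the translation function
itself are assumptions made for illustration.

```c
#include <assert.h>

/* Assumed bit assignment; only the combined value is stated above. */
#define SKETCH_OBJECT_INFO_QUICK             (1u << 3)
#define SKETCH_OBJECT_INFO_SKIP_FETCH_OBJECT (1u << 4)
#define SKETCH_OBJECT_INFO_FOR_PREFETCH \
	(SKETCH_OBJECT_INFO_QUICK | SKETCH_OBJECT_INFO_SKIP_FETCH_OBJECT)

/* Assumed values for the HAS_OBJECT-style flags. */
#define SKETCH_HAS_OBJECT_RECHECK_PACKED (1u << 0)
#define SKETCH_HAS_OBJECT_FETCH_PROMISOR (1u << 1)

/* The prefetch bits do not overlap with any HAS_OBJECT-style bit. */
static_assert((SKETCH_OBJECT_INFO_FOR_PREFETCH &
	       (SKETCH_HAS_OBJECT_RECHECK_PACKED |
		SKETCH_HAS_OBJECT_FETCH_PROMISOR)) == 0,
	      "prefetch bits are invisible to the has-object flags");

/* Hypothetical translation from HAS_OBJECT-style to OBJECT_INFO-style flags. */
static unsigned sketch_has_to_info_flags(unsigned has_flags)
{
	unsigned info_flags = 0;
	if (!(has_flags & SKETCH_HAS_OBJECT_RECHECK_PACKED))
		info_flags |= SKETCH_OBJECT_INFO_QUICK;
	if (!(has_flags & SKETCH_HAS_OBJECT_FETCH_PROMISOR))
		info_flags |= SKETCH_OBJECT_INFO_SKIP_FETCH_OBJECT;
	return info_flags;
}
```

Under these assumptions, passing the bogus prefetch value and passing
no flags at all translate to the same result, which is why dropping the
argument does not change behaviour.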
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Thu, 12 Feb 2026 15:53:50 +0000 (15:53 +0000)]
diff --anchored: avoid checking unmatched lines
For a line to be an anchor it has to appear in each of the files being
diffed exactly once. With that in mind let's delay checking whether
a line is an anchor until we know there is exactly one instance of
the line in each file. As each line is checked at most once, there
is no need to cache the result of is_anchor() and we can drop that
field from the hashmap entries. When diffing 5000 recent commits in
git.git this gives a modest speedup of ~2%. In the (rather extreme)
example below that consists largely of deletions the speedup is ~16%.
    seq 0 10000000 >old
    printf '%s\n' 300000 100000 200000 >new
    git diff --no-index --anchored=300000 old new
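
The reordering can be sketched as follows. This is a simplified model,
not the patience-diff code: it counts occurrences with a quadratic scan
instead of a hashmap, it assumes an anchor is a line starting with the
given prefix, and every `sketch_` name is hypothetical. What it shows is
the order of the checks: uniqueness in both files first, the anchor
test last.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int sketch_anchor_checks; /* how often the anchor test ran */

/* Assumed anchor rule: the line starts with the given prefix. */
static int sketch_is_anchor(const char *line, const char *prefix)
{
	assert(line && prefix);
	sketch_anchor_checks++;
	return !strncmp(line, prefix, strlen(prefix));
}

static size_t sketch_count(const char **lines, size_t nr, const char *line)
{
	size_t count = 0;
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(lines[i], line))
			count++;
	return count;
}

/* Count usable anchors, testing only lines that are unique in both files. */
static size_t sketch_count_anchors(const char **a, size_t nr_a,
				   const char **b, size_t nr_b,
				   const char *prefix)
{
	size_t anchors = 0;
	for (size_t i = 0; i < nr_a; i++) {
		if (sketch_count(a, nr_a, a[i]) != 1 ||
		    sketch_count(b, nr_b, a[i]) != 1)
			continue; /* can never be an anchor, skip the test */
		if (sketch_is_anchor(a[i], prefix))
			anchors++;
	}
	return anchors;
}
```

Because each surviving line is tested at most once, there is nothing to
cache, and lines that were deleted or duplicated never reach the anchor
test at all.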
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 11 Feb 2026 19:17:48 +0000 (11:17 -0800)]
CodingGuidelines: document // comments
We do not use // comments in our C code, which is implied by the
description of multi-line comment rule and its examples, but is not
explicitly spelled out. Spell it out.
Junio C Hamano [Wed, 11 Feb 2026 20:29:06 +0000 (12:29 -0800)]
Merge branch 'sp/show-index-warn-fallback'
When "git show-index" is run outside a repository, it silently
defaults to SHA-1; the tool now warns when this happens.
* sp/show-index-warn-fallback:
show-index: use gettext wrapping in user facing error messages
show-index: warn when falling back to SHA-1 outside a repository
René Scharfe [Mon, 9 Feb 2026 19:24:52 +0000 (20:24 +0100)]
xdiff-interface: stop using the_repository
Use the algorithm-agnostic is_null_oid() and push the dependency of
read_mmblob() on the_repository->objects to its callers. This allows it
to be used with arbitrary object databases.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of code paths that started using batched ref update API
(after Git 2.51 or so) lost detailed error output, which has been
corrected.
* kn/ref-batch-output-error-reporting-fix:
fetch: delay user information post committing of transaction
receive-pack: utilize rejected ref error details
fetch: utilize rejected ref error details
update-ref: utilize rejected error details if available
refs: add rejection detail to the callback function
refs: skip to next ref when current ref is rejected
Junio C Hamano [Mon, 9 Feb 2026 20:09:09 +0000 (12:09 -0800)]
Merge branch 'ps/history'
"git history" history rewriting UI.
* ps/history:
builtin/history: implement "reword" subcommand
builtin: add new "history" command
wt-status: provide function to expose status for trees
replay: support updating detached HEAD
replay: support empty commit ranges
replay: small set of cleanups
builtin/replay: move core logic into "libgit.a"
builtin/replay: extract core logic to replay revisions
Junio C Hamano [Mon, 9 Feb 2026 18:27:29 +0000 (10:27 -0800)]
rerere: minor documentation update
Let's not call our users "it". Also "rerere forget \*.c" does not
forget resolutions for just '*.c'; it forgets for all the files
whose filenames end with ".c".
René Scharfe [Sun, 8 Feb 2026 17:01:24 +0000 (18:01 +0100)]
version: stop using the_repository
Actually it has never been used in version.c since cf7ee481902 (agent:
advertise OS name via agent capability, 2025-02-15) added the dependency
macro. Remove it, along with the also unused struct declaration.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Yannik Tausch [Sat, 7 Feb 2026 21:37:48 +0000 (22:37 +0100)]
merge-file: honor merge.conflictStyle outside of a repository
When running outside a repository, git merge-file ignores the
merge.conflictStyle configuration variable entirely. Since the
function receives `repo` from the caller (which is NULL outside a
repository), and repo_config() falls back to reading system and user
configuration when passed NULL, pass `repo` to repo_config()
unconditionally.
Also document that merge.conflictStyle is honored.
Signed-off-by: Yannik Tausch <dev@ytausch.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sam Bostock [Fri, 6 Feb 2026 19:16:23 +0000 (19:16 +0000)]
merge-ours: integrate with sparse-index
The merge-ours built-in opens the index to compare it against HEAD.
The machinery used to do this (i.e. run_diff_index()) is capable of
working with a sparse index, but the start-up sequence of this
command does not take the necessary steps, so we end up expanding the
index fully before doing the comparison.
In order to convince sparse-index.c:is_sparse_index_allowed() to
return true, we need to:
- Read basic configuration with git_default_config so that global
variables like core_apply_sparse_checkout are populated.
merge-ours currently does not read configuration at all.
- Set command_requires_full_index to 0.
With that, the command can work without expanding the index fully
before doing its work.
Signed-off-by: Sam Bostock <sam@sambostock.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sam Bostock [Fri, 6 Feb 2026 19:16:22 +0000 (19:16 +0000)]
merge-ours: drop USE_THE_REPOSITORY_VARIABLE
The merge-ours built-in uses the `the_repository` global to access
the repository. The project is moving away from this global in favor
of the `repo` parameter that is passed to each built-in command.
Since merge-ours is registered with RUN_SETUP, `repo` is guaranteed
to be non-NULL and can be used directly.
Drop the USE_THE_REPOSITORY_VARIABLE macro and use `repo` throughout.
While at it, remove a stray double blank line between the #include
block and the usage string.
Signed-off-by: Sam Bostock <sam@sambostock.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jean-Noël Avila [Fri, 6 Feb 2026 04:12:25 +0000 (04:12 +0000)]
doc: fix some style issues in git-clone and for-each-ref-options
* spell out all forms of --[no-]reject-shallow in git-clone
* use imperative mood for the first line of options
* use asciidoc NOTE macro
* fix markups
Reviewed-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Collin Funk [Fri, 6 Feb 2026 01:46:09 +0000 (17:46 -0800)]
global: constify some pointers that are not written to
The recent glibc 2.43 release had the following change listed in its
NEWS file:
    For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr,
    strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return
    pointers into their input arrays now have definitions as macros that
    return a pointer to a const-qualified type when the input argument is
    a pointer to a const-qualified type.
When compiling with GCC 15, which defaults to -std=gnu23, this causes
many warnings like this:
    merge-ort.c: In function ‘apply_directory_rename_modifications’:
    merge-ort.c:2734:36: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
     2734 |                 char *last_slash = strrchr(cur_path, '/');
          |                                    ^~~~~~~
This patch fixes the more obvious ones by making them const when we do
not write to the returned pointer.
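
The kind of fix involved looks roughly like the sketch below. The
helper `sketch_basename()` is hypothetical and not taken from the
patch; it only shows the pattern of receiving the result of `strrchr()`
in a `const` pointer when the input is `const` and the result is never
written to.

```c
#include <assert.h>
#include <string.h>

/* Return the final path component without modifying the input. */
static const char *sketch_basename(const char *path)
{
	assert(path);
	/* A `const char *` receiver compiles cleanly with or without C23 macros. */
	const char *last_slash = strrchr(path, '/');
	return last_slash ? last_slash + 1 : path;
}
```

Declaring `last_slash` as plain `char *` here is what triggers the
`-Wdiscarded-qualifiers` warning once `strrchr()` preserves constness.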
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The computation of column width made by "git diff --stat" was
confused when pathnames contain non-ASCII characters.
* lp/diff-stat-utf8-display-width-fix:
t4073: add test for diffstat paths length when containing UTF-8 chars
diff: improve scaling of filenames in diffstat to handle UTF-8 chars