Junio C Hamano [Wed, 8 Apr 2026 17:19:18 +0000 (10:19 -0700)]
Merge branch 'tc/replay-ref'
The experimental `git replay` command learned the `--ref=<ref>` option
to allow specifying which ref to update, overriding the default behavior.
* tc/replay-ref:
replay: allow to specify a ref with option --ref
replay: use stuck form in documentation and help message
builtin/replay: mark options as not negatable
Junio C Hamano [Wed, 8 Apr 2026 17:19:17 +0000 (10:19 -0700)]
Merge branch 'ng/add-files-to-cache-wo-rename'
add_files_to_cache() used diff_files() to detect only the paths that
are different between the index and the working tree and add them,
which does not need rename detection, which interfered with unnecessary
conflicts.
* ng/add-files-to-cache-wo-rename:
read-cache: disable renames in add_files_to_cache
Junio C Hamano [Wed, 8 Apr 2026 17:19:17 +0000 (10:19 -0700)]
Merge branch 'ps/reftable-portability'
Update reftable library part with what is used in libgit2 to improve
portability to different target codebases and platforms.
* ps/reftable-portability:
reftable/system: add abstraction to mmap files
reftable/system: add abstraction to retrieve time in milliseconds
reftable/fsck: use REFTABLE_UNUSED instead of UNUSED
reftable/stack: provide fsync(3p) via system header
reftable: introduce "reftable-system.h" header
Junio C Hamano [Wed, 8 Apr 2026 17:19:17 +0000 (10:19 -0700)]
Merge branch 'ps/odb-cleanup'
Various code clean-up around odb subsystem.
* ps/odb-cleanup:
odb: drop unneeded headers and forward decls
odb: rename `odb_has_object()` flags
odb: use enum for `odb_write_object` flags
odb: rename `odb_write_object()` flags
treewide: use enum for `odb_for_each_object()` flags
CodingGuidelines: document our style for flags
Adrian Ratiu [Wed, 8 Apr 2026 16:11:48 +0000 (19:11 +0300)]
t1800: add &&-chains to test helper functions
Add the missing &&'s so we properly propagate failures
between commands in the hook helper functions.
Also add a missing mkdir -p arg (found by adding the &&).
Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable-backend: drop uses of the_repository
reftable_be_init() and reftable_be_create_on_disk() use the_repository even
though a repository instance is already available, either directly or via
struct ref_store.
Replace these uses with the appropriate local repository instance (repo or
ref_store->repo) to avoid relying on global state.
Note that USE_THE_REPOSITORY_VARIABLE cannot be removed yet, as
is_bare_repository() is still there in the file.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs.c uses the_hash_algo in multiple places, relying on global state for
the object hash algorithm. Replace these uses with the appropriate
repository-specific hash_algo. In transaction-related functions
(ref_transaction_create, ref_transaction_delete, migrate_one_ref, and
transaction_hook_feed_stdin), use transaction->ref_store->repo->hash_algo.
In other cases, such as repo_get_submodule_ref_store(), use
repo->hash_algo.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: add struct repository parameter in get_files_ref_lock_timeout_ms()
get_files_ref_lock_timeout_ms() calls repo_config_get_int() using
the_repository, as no repository instance is available in its scope. Add a
struct repository parameter and use it instead of the_repository.
Update all callers accordingly. In files-backend.c, lock_raw_ref() can
obtain repository instance from the struct ref_transaction via
transaction->ref_store->repo and pass it down. For create_reflock(), which
is used as a callback, introduce a small wrapper struct to pass both struct
lock_file and struct repository through the callback data.
This reduces reliance on the_repository global, though the function
still uses static variables and is not yet fully repository-scoped.
This can be addressed in a follow-up change.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our description of the reftable format is that it is experimental and
subject to change, but that is no longer true. Remove this statement so
as not to mislead users.
In addition, the documentation says that the files format is the
default, but that is not true if breaking changes mode is on. Correct
this information with a conditional.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: avoid ODB transaction when not writing objects
In ce1661f9da (odb: add transaction interface, 2025-09-16), existing
ODB transaction logic is adapted to create a transaction interface
at the ODB layer. The intent here is for the ODB transaction
interface to eventually provide an object source agnostic means to
manage transactions.
An unintended consequence of this change though is that
`object-file.c:index_fd()` may enter the ODB transaction path even
when no object write is requested. In non-repository contexts, this
can result in a NULL dereference and segfault. One such case occurs
when running git-diff(1) outside of a repository with
"core.bigFileThreshold" forcing the streaming path in `index_fd()`:
$ echo foo >foo
$ echo bar >bar
$ git -c core.bigFileThreshold=1 diff -- foo bar
In this scenario, the caller only needs to compute the object ID. Object
hashing does not require an ODB, so starting a transaction is both
unnecessary and invalid.
Fix the bug by avoiding the use of ODB transactions in `index_fd()` when
callers are only interested in computing the object hash.
Reported-by: Luca Stefani <luca.stefani.ge1@gmail.com> Signed-off-by: Justin Tobler <jltobler@gmail.com>
[jc: adjusted to fd13909e (Merge branch 'jt/odb-transaction', 2025-10-02)] Signed-off-by: Junio C Hamano <gitster@pobox.com>
The check in "receive-pack" to prevent a checked out branch from
getting updated via updateInstead mechanism has been corrected.
* ps/receive-pack-updateinstead-in-worktree:
receive-pack: use worktree HEAD for updateInstead
t5516: clean up cloned and new-wt in denyCurrentBranch and worktrees test
t5516: test updateInstead with worktree and unborn bare HEAD
The way the "git log -L<range>:<file>" feature is bolted onto the
log/diff machinery is being reworked a bit to make the feature
compatible with more diff options, like -S/G.
* mm/line-log-use-standard-diff-output:
doc: note that -L supports patch formatting and pickaxe options
t4211: add tests for -L with standard diff options
line-log: route -L output through the standard diff pipeline
line-log: fix crash when combined with pickaxe options
Junio C Hamano [Tue, 7 Apr 2026 21:59:26 +0000 (14:59 -0700)]
Merge branch 'ps/fsck-wo-the-repository'
Internals of "git fsck" have been refactored to not depend on the
global `the_repository` variable.
* ps/fsck-wo-the-repository:
builtin/fsck: stop using `the_repository` in error reporting
builtin/fsck: stop using `the_repository` when marking objects
builtin/fsck: stop using `the_repository` when checking packed objects
builtin/fsck: stop using `the_repository` with loose objects
builtin/fsck: stop using `the_repository` when checking reflogs
builtin/fsck: stop using `the_repository` when checking refs
builtin/fsck: stop using `the_repository` when snapshotting refs
builtin/fsck: fix trivial dependence on `the_repository`
fsck: drop USE_THE_REPOSITORY
fsck: store repository in fsck options
fsck: initialize fsck options via a function
fetch-pack: move fsck options into function scope
Junio C Hamano [Tue, 7 Apr 2026 21:59:25 +0000 (14:59 -0700)]
Merge branch 'ps/commit-graph-overflow-fix'
Fix a regression in writing the commit-graph where commits with dates
exceeding 34 bits (beyond year 2514) could cause an underflow and
crash Git during the generation data overflow chunk writing.
In t5710, we frequently construct local file URIs using `file://$(pwd)`.
On Unix-like systems, $(pwd) returns an absolute path starting with a
slash (e.g., `/tmp/repo`), resulting in a valid 3-slash URI with an
empty host (`file:///tmp/repo`).
However, on Windows, $(pwd) returns a path starting with a drive
letter (e.g., `D:/a/repo`). This results in a 2-slash URI
(`file://D:/a/repo`). Standard URI parsers misinterpret this format,
treating `D:` as the host rather than part of the absolute path.
This is to be expected because RFC 8089 says that the `//` prefix with
an empty local host must be followed by an absolute path starting with
a slash.
While this hasn't broken the existing tests (because the old
`promisor.acceptFromServer` logic relies entirely on strict `strcmp()`
without normalizing the URLs), it will break future commits that pass
these URLs through `url_normalize()` or similar functions.
To future-proof the tests and ensure cross-platform URI compliance,
let's introduce a $TRASH_DIRECTORY_URL helper variable that explicitly
guarantees a leading slash for the path component, ensuring valid
3-slash `file:///` URIs on all operating systems.
While at it, let's also introduce $ENCODED_TRASH_DIRECTORY_URL to
handle some common special characters in directory paths.
To be extra safe, let's skip all the tests if there are uncommon
special characters in the directory path.
Then let's replace all instances of `file://$(pwd)` with
$TRASH_DIRECTORY_URL across the test script, and let's simplify the
`sendFields` and `checkFields` tests to use
$ENCODED_TRASH_DIRECTORY_URL directly.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous commit, filter_promisor_remote() was refactored to keep
accepted 'struct promisor_info' instances alive instead of dismantling
them into separate parallel data structures.
Let's go one step further and replace the 'struct strvec *accepted'
argument passed to filter_promisor_remote() with a
'struct string_list *accepted_remotes' argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In filter_promisor_remote(), the instances of `struct promisor_info`
for accepted remotes are dismantled into separate parallel data
structures (the 'accepted' strvec for server names, and
'accepted_filters' for filter strings) and then immediately freed.
Instead, let's keep these instances on an 'accepted_remotes' list.
This way the post-loop phase can iterate a single list to build the
protocol reply, apply advertised filters, and mark remotes as
accepted, rather than iterating three separate structures.
This refactoring also prepares for a future commit that will add a
'local_name' member to 'struct promisor_info'. Since struct instances
stay alive, downstream code will be able to simply read both names
from them rather than needing yet another parallel strvec.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In future commits, we are going to add more logic to
filter_promisor_remote() which is already doing a lot of things.
Let's alleviate that by moving the logic that checks and validates the
value of the `promisor.acceptFromServer` config variable into its own
accept_from_server() helper function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit we are going to check if some strings contain
control characters, so let's refactor the logic to do that in a new
has_control_char() helper function.
It cleans up the code a bit anyway.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: refactor should_accept_remote() control flow
A previous commit made sure we now reject empty URLs early at parse
time. This makes the existing warning() in case a remote URL is NULL
or empty very unlikely to be useful.
In future work, we also plan to add URL-based acceptance logic into
should_accept_remote().
To adapt to previous changes and prepare for upcoming changes, let's
restructure the control flow in should_accept_remote().
Concretely, let's:
- Replace the warning() in case of an empty URL with a BUG(), as a
previous commit made sure empty URLs are rejected early at parse
time.
- Move that modified empty-URL check to the very top of the function,
so that every acceptance mode, instead of only ACCEPT_KNOWN_URL, is
covered.
- Invert the URL comparison: instead of returning on match and
warning on mismatch, return early on mismatch and let the match
case fall through. This opens a single exit path at the bottom of
the function for future commits to extend.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: reject empty name or URL in advertised remote
In parse_one_advertised_remote(), we check for a NULL remote name and
remote URL, but not for empty ones. An empty URL seems possible as
url_percent_decode("") doesn't return NULL.
In promisor_config_info_list(), we ignore remotes with empty URLs, so a
Git server should not advertise remotes with empty URLs. It's possible
that a buggy or malicious server would do it though.
So let's tighten the check in parse_one_advertised_remote() to also
reject empty strings at parse time.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In should_accept_remote() and parse_one_advertised_remote(), when a
remote is ignored, we tell users why it is ignored in a warning, but we
don't tell them that the remote is actually ignored.
Let's clarify that, so users have a better idea of what's actually
happening.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: pass config entry to all_fields_match() directly
The `in_list == 0` path of all_fields_match() looks up the remote in
`config_info` by `advertised->name` repeatedly, even though every
caller in should_accept_remote() has already performed this
lookup and holds the result in `p`.
To avoid this useless work, let's replace the `int in_list`
parameter with a `struct promisor_info *config_entry` pointer:
- When NULL (ACCEPT_ALL mode): scan the whole `config_info` list, as
the old `in_list == 1` path did.
- When non-NULL: match against that single config entry directly,
avoiding the redundant string_list_lookup() call.
This removes the hidden dependency on `advertised->name` inside
all_fields_match(), which would be wrong if in the future
auto-configured remotes are implemented, as the local config name may
differ from the server's advertised name.
While at it, let's also add a comment before all_fields_match() and
match_field_against_config() to help understand how things work and
help avoid similar issues.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: try accepted remotes before others in get_direct()
When a server advertises promisor remotes and the client accepts some
of them, those remotes carry the server's intent: 'fetch missing
objects preferably from here', and the client agrees with that for the
remotes it accepts.
However promisor_remote_get_direct() actually iterates over all
promisor remotes in list order, which is the order they appear in the
config files (except perhaps for the one appearing in the
`extensions.partialClone` config variable which is tried last).
This means an existing, but not accepted, promisor remote, could be
tried before the accepted ones, which does not reflect the intent of
the agreement between client and server.
If the client doesn't care about what the server suggests, it should
accept nothing and rely on its remotes as they are already configured.
To better reflect the agreement between client and server, let's make
promisor_remote_get_direct() try the accepted promisor remotes before
the non-accepted ones.
Concretely, let's extract a try_promisor_remotes() helper and call it
twice from promisor_remote_get_direct():
- first with an `accepted_only=true` argument to try only the accepted
remotes,
- then with `accepted_only=false` to fall back to any remaining remote.
Ensuring that accepted remotes are preferred will be even more
important if in the future a mechanism is developed to allow the
client to auto-configure remotes that the server advertises. This will
in particular avoid fetching from the server (which is already
configured as a promisor remote) before trying the auto-configured
remotes, as these new remotes would likely appear at the end of the
config file, and as the server might not appear in the
`extensions.partialClone` config variable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 6 Apr 2026 22:42:50 +0000 (15:42 -0700)]
Merge branch 'aa/reap-transport-child-processes'
A few code paths that spawned child processes for network
connection weren't wait(2)ing for their children and letting "init"
reap them instead; they have been tightened.
* aa/reap-transport-child-processes:
transport-helper, connect: use clean_on_exit to reap children on abnormal exit
Junio C Hamano [Mon, 6 Apr 2026 22:42:51 +0000 (15:42 -0700)]
Merge branch 'jk/c23-const-preserving-fixes'
Adjust the codebase for C23 that changes functions like strchr()
that discarded constness when they return a pointer into a const
string to preserve constness.
* jk/c23-const-preserving-fixes:
config: store allocated string in non-const pointer
rev-parse: avoid writing to const string for parent marks
revision: avoid writing to const string for parent marks
rev-parse: simplify dotdot parsing
revision: make handle_dotdot() interface less confusing
Junio C Hamano [Mon, 6 Apr 2026 22:42:49 +0000 (15:42 -0700)]
Merge branch 'tb/stdin-packs-excluded-but-open'
pack-objects's --stdin-packs=follow mode learns to handle
excluded-but-open packs.
* tb/stdin-packs-excluded-but-open:
repack: mark non-MIDX packs above the split as excluded-open
pack-objects: support excluded-open packs with --stdin-packs
t7704: demonstrate failure with once-cruft objects above the geometric split
pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
pack-objects: plug leak in `read_stdin_packs()`
Object name handling (disambiguation and abbreviation) has been
refactored to be backend-generic, moving logic into the respective
object database backends.
* ps/odb-generic-object-name-handling:
odb: introduce generic `odb_find_abbrev_len()`
object-file: move logic to compute packed abbreviation length
object-name: move logic to compute loose abbreviation length
object-name: simplify computing common prefixes
object-name: abbreviate loose object names without `disambiguate_state`
object-name: merge `update_candidates()` and `match_prefix()`
object-name: backend-generic `get_short_oid()`
object-name: backend-generic `repo_collect_ambiguous()`
object-name: extract function to parse object ID prefixes
object-name: move logic to iterate through packed prefixed objects
object-name: move logic to iterate through loose prefixed objects
odb: introduce `struct odb_for_each_object_options`
oidtree: extend iteration to allow for arbitrary return codes
oidtree: modernize the code a bit
object-file: fix sparse 'plain integer as NULL pointer' error
David Lin [Mon, 6 Apr 2026 19:27:11 +0000 (15:27 -0400)]
cache-tree: fix inverted object existence check in cache_tree_fully_valid
The negation in front of the object existence check in
cache_tree_fully_valid() was lost in 062b914c84 (treewide: convert
users of `repo_has_object_file()` to `has_object()`, 2025-04-29),
turning `!repo_has_object_file(...)` into `has_object(...)` instead
of `!has_object(...)`.
This makes cache_tree_fully_valid() always report the cache tree as
invalid when objects exist (the common case), forcing callers like
write_index_as_tree() to call cache_tree_update() on every
invocation. An odb_has_object() check inside update_one() avoids a
full tree rebuild, but the unnecessary call still pays the cost of
opening an ODB transaction and, in partial clones, a promisor remote
check.
Restore the missing negation and add a test that verifies write-tree
takes the cache-tree shortcut when the cache tree is valid.
Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: David Lin <davidlin@stripe.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: fix promisor.quiet to use the correct repository
fetch_objects() reads the promisor.quiet configuration from
the_repository instead of the repo parameter it receives.
This means that when git lazy-fetches objects for a non-main
repository, eg. a submodule that is itself a partial clone opened
via repo_submodule_init(). The submodule's own promisor.quiet
setting is ignored and the superproject's setting is used instead.
Fix by replacing the_repository with repo in the repo_config_get_bool()
call. The practical trigger is git grep --recurse-submodules on a
superproject where the submodule is a partial clone.
Add a test where promisor.quiet is set only in a partial-clone
submodule; a lazy fetch triggered by "git grep --recurse-submodules"
must honor that setting.
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git rev-list --maximal-only' option filters the output to only
independent commits. A commit is independent if it is not reachable from
other listed commits. Currently this is implemented by doing a full
revision walk and marking parents with CHILD_VISITED to skip non-maximal
commits.
The 'git merge-base --independent' command computes the same result
using reduce_heads(), which uses the more efficient remove_redundant()
algorithm. This is significantly faster because it avoids walking the
entire commit graph.
Add a fast path in rev-list that detects when --maximal-only is the only
interesting option and all input commits are positive (no revision
ranges). In this case, use reduce_heads() directly instead of doing a
full revision walk.
In order to preserve the rest of the output filtering, this computation
is done opportunistically in a new prepare_maximal_independent() method
when possible. If successful, it populates revs->commits with the list
of independent commits and set revs->no_walk to prevent any other walk
from occurring. This allows us to have any custom output be handled
using the existing output code hidden inside
traverse_commit_list_filtered(). A new test is added to demonstrate that
this output is preserved.
The fast path is only used when no other flags complicate the walk or
output format: no UNINTERESTING commits, no limiting options (max-count,
age filters, path filters, grep filters), no output formatting beyond
plain OIDs, and no object listing flags.
Running the p6011 performance test for my copy of git.git, I see the
following improvement with this change:
Add a performance test that compares 'git rev-list --maximal-only'
against 'git merge-base --independent'. These two commands are asking
essentially the same thing, but the rev-list implementation is more
generic and hence slower. These performance tests will demonstrate that
in the current state and also be used to show the equivalence in the
future.
We also add a case with '--since' to force the generic walk logic for
rev-list even when we make that future change to use the merge-base
algorithm on a simple walk.
When run on my copy of git.git, I see these results:
Test HEAD
----------------------------------------------
6011.2: merge-base --independent 0.03
6011.3: rev-list --maximal-only 0.06
6011.4: rev-list --maximal-only --since 0.06
These numbers are low, but the --independent calculation is interesting
due to having a lot of local branches that are actually independent.
Running the same test on a fresh clone of the Linux kernel repository
shows a larger difference between the algorithms, especially because the
--independent algorithm is extremely fast when there are no independent
references selected:
Test HEAD
----------------------------------------------
6011.2: merge-base --independent 0.00
6011.3: rev-list --maximal-only 0.70
6011.4: rev-list --maximal-only --since 0.70
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test that verifies the 'git rev-list --maximal-only' option
produces the same set of commits as 'git merge-base --independent'. This
equivalence was noted when the feature was first created, but we are
about to update the implementation to use a common algorithm in this
case where the user intention is identical.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Mon, 6 Apr 2026 09:31:21 +0000 (11:31 +0200)]
history: fix short help for argument of --update-refs
"print" is not a valid argument for --update-refs. List both valid
alternatives literally in the argh string, consistent with documentation
and usage string.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1edeb9a (Win32: warn if the console font doesn't support Unicode,
2014-06-10) introduced both code to detect the current console font on
Windows Vista and newer and a fallback for older systems to detect the
default console font and issue a warning if that font doesn't support
unicode.
Since we haven't supported any Windows older than Vista in almost a
decade, we don't need to keep the workaround.
Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
unify and bump _WIN32_WINNT definition to Windows 8.1
Git for Windows doesn't support anything prior to Windows 8.1 since 2.47.0
and Git followed along with commits like ce6ccba (mingw: drop Windows
7-specific work-around, 2025-08-04).
There is no need to pretend to the compiler that we still support Windows
Vista, just to lock us out of easy access to newer APIs. There is also no
need to have conflicting and unused definitions claiming we support some
versions of Windows XP or even Windows NT 4.0.
Bump all definitions of _WIN32_WINNT to a realistic value of Windows 8.1.
This will also simplify code for a followup commit that will improve cpu
core detection on multi-socket systems.
Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let’s change the phrasing around the `linkgit` while we’re visiting this
file (see previous commit[1]).
We use the section syntax to refer to man pages, so writing “man page”
next to it is a bit redundant. We can be more concise and just lean on
the preposition “in”.
And in order to avoid this double “git”:
see `git config list` in git-config(1) ...
We can rephrase to the subcommand, which is a typical pattern (config or
option followed by “in git-command(1)”).
† 1: Which also discusses why we do not change a similar phrasing
in gittutorial(7)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace uses of `git config --list` (short or long) with the subcommand
`list` since `--list` is deprecated.
We will change the “man page” phrasing in gitcvs-migration(7) in the
next commit, since we are already visiting that sentence. But note
that we leave the “man page” phrasing in the sentence that we touch in
gittutorial(7) since it’s a tutorial and not a manual page. We can be
more wordy in a tutorial context.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 85127bcdea ("backfill: assume --sparse when sparse-checkout is
enabled") intended for 'git backfill' to consult the repository
configuration when the user does not pass '--sparse' or
'--no-sparse' on the command line. It added the sentinel check:
if (ctx->sparse < 0)
ctx->sparse = cfg->apply_sparse_checkout;
However, the ctx->sparse field is initialized to 0 instead of -1,
so this guard never triggers. Consequently, the repository config
(core.sparseCheckout) is never checked, and the command always
performs a full backfill even when sparse-checkout is enabled.
Fix this by initializing ctx->sparse to -1, ensuring the existing
fallback logic correctly reads the repository configuration when
no explicit flags are provided.
Add a test to verify that 'git backfill' automatically respects
sparse-checkout settings when no flags are passed.
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 3 Apr 2026 22:24:45 +0000 (15:24 -0700)]
Merge branch 'ps/dash-buggy-0.5.13-workaround'
The way dash 0.5.13 handles non-ASCII contents in here-doc
is buggy and breaks our existing tests, which unfortunately
have been rewritten to avoid triggering the bug.
* ps/dash-buggy-0.5.13-workaround:
t9300: work around partial read bug in Dash v0.5.13
t: work around multibyte bug in quoted heredocs with Dash v0.5.13
Junio C Hamano [Fri, 3 Apr 2026 20:01:08 +0000 (13:01 -0700)]
Merge branch 'ar/config-hook-cleanups'
Code clean-up around the recent "hooks defined in config" topic.
* ar/config-hook-cleanups:
hook: reject unknown hook names in git-hook(1)
hook: show disabled hooks in "git hook list"
hook: show config scope in git hook list
hook: introduce hook_config_cache_entry for per-hook data
t1800: add test to verify hook execution ordering
hook: make consistent use of friendly-name in docs
hook: replace hook_list_clear() -> string_list_clear_func()
hook: detect & emit two more bugs
hook: rename cb_data_free/alloc -> hook_data_free/alloc
hook: fix minor style issues
builtin/receive-pack: properly init receive_hook strbuf
hook: move unsorted_string_list_remove() to string-list.[ch]
Junio C Hamano [Fri, 3 Apr 2026 20:01:08 +0000 (13:01 -0700)]
Merge branch 'ds/backfill-revs'
`git backfill` learned to accept revision and pathspec arguments.
* ds/backfill-revs:
t5620: test backfill's unknown argument handling
path-walk: support wildcard pathspecs for blob filtering
backfill: work with prefix pathspecs
backfill: accept revision arguments
t5620: prepare branched repo for revision tests
revision: include object-name.h
Junio C Hamano [Fri, 3 Apr 2026 20:01:08 +0000 (13:01 -0700)]
Merge branch 'mf/format-patch-commit-list-format'
Improve the recently introduced `git format-patch
--commit-list-format` (formerly `--cover-letter-format`) option,
including a new "modern" preset and better CLI ergonomics.
* mf/format-patch-commit-list-format:
format-patch: --commit-list-format without prefix
format-patch: add preset for --commit-list-format
format-patch: wrap generate_commit_list_cover()
format.commitListFormat: strip meaning from empty
docs/pretty-formats: add %(count) and %(total)
format-patch: rename --cover-letter-format option
format-patch: refactor generate_commit_list_cover
pretty.c: better die message %(count) and %(total)
"git format-patch --cover-letter" learns to use a simpler format
instead of the traditional shortlog format to list its commits with
a new --cover-letter-format option and format.commitListFormat
configuration variable.
* mf/format-patch-cover-letter-format:
docs: add usage for the cover-letter fmt feature
format-patch: add commitListFormat config
format-patch: add ability to use alt cover format
format-patch: move cover letter summary generation
pretty.c: add %(count) and %(total) placeholders
The `mingw_strftime()` wrapper exists to work around msvcrt.dll's
incomplete `strftime()` implementation by dynamically loading the
version from ucrtbase.dll at runtime via `LoadLibrary()` +
`GetProcAddress()`. When the binary is already linked against UCRT
(i.e. when building in the UCRT64 environment), the linked-in
`strftime()` is the ucrtbase.dll version, making the dynamic loading
needless churn: It's calling the very same code.
Simply guard both the declaration and implementation so that the
unnecessary work-around is skipped in UCRT builds.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a companion patch of 3b9b2c2a29a (compat/posix: introduce
writev(3p) wrapper, 2026-03-13) where support for using the `writev()`
wrapper was introduced in the `Makefile` and the Meson-based build, but
the CMake build still needs that treatment, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document po/AGENTS.md-driven AI workflows (with an example prompt)
before the PO helper section, cross-reference git-po-helper, and
summarize quality checks plus optional agent integration in the PO
helper section.
Jiang Xin [Wed, 25 Feb 2026 06:19:25 +0000 (14:19 +0800)]
l10n: docs: add review instructions in AGENTS.md
Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides
comprehensive guidance for AI agents to review translation files.
Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming. To address
this, git-po-helper implements the compare subcommand, which extracts
new or modified translations with full context (complete msgid/msgstr
pairs), significantly improving review efficiency.
A limitation is that the extracted content lacks other
already-translated content for reference, which may affect terminology
consistency. This is mitigated by including a glossary in the PO file
header. git-po-helper-generated review files include the header entry
and glossary (if present) by default.
The review workflow leverages git-po-helper subcommands:
- git-po-helper compare: Extract new or changed entries between two
versions of a PO file into a valid PO file for review. Supports
multiple modes:
* Compare HEAD with the working tree (local changes)
* Compare a commit's parent with the commit (--commit)
* Compare a commit with the working tree (--since)
* Compare two arbitrary revisions (-r)
- git-po-helper msg-select: Split large review files into smaller
batches by entry index range for manageable review sessions. Supports
range formats like "-50" (first 50), "51-100", "101-" (to end).
Jiang Xin [Sun, 15 Feb 2026 06:06:19 +0000 (14:06 +0800)]
l10n: docs: add translation instructions in AGENTS.md
Add a new "Translating po/XX.po" section to po/AGENTS.md with detailed
workflow and procedures for AI agents to translate language-specific PO
files. Users can invoke AI-assisted translation in coding tools with a
prompt such as:
"Translate the po/XX.po file by referring to @po/AGENTS.md"
Translation results serve as drafts; human contributors must review and
approve before submission.
To address the low translation efficiency of some LLMs, batch
translation replaces entry-by-entry translation. git-po-helper
implements a gettext JSON format for translation files, replacing PO
format during translation to enable batch processing.
The git-po-helper flow reduces the number of turns (86 → 56) with
similar execution time; the bottleneck appears to be LLM processing
rather than network interaction.
Jiang Xin [Sun, 15 Feb 2026 03:53:27 +0000 (11:53 +0800)]
l10n: docs: add update PO instructions in AGENTS.md
Add a new section to po/AGENTS.md to provide clear instructions for
updating language-specific PO files. The improved documentation
significantly reduces both conversation turns and execution time.
Performance evaluation with the Qwen model:
# Before: instructions in po/README.md; the custom prompt
# references po/README.md during execution
git-po-helper agent-test --runs=5 --agent=qwen update-po \
--prompt="Update po/zh_CN.po according to po/README.md"
# After: instructions in po/AGENTS.md; the built-in prompt
# references po/AGENTS.md during execution
git-po-helper agent-test --runs=5 --agent=qwen update-po
Benchmark results (5-run average):
| Metric | Before | After | Improvement |
|-------------|---------|--------|-------------|
| Turns | 22 | 4 | -82% |
| Exec. time | 38s | 9s | -76% |
| Turn range | 17-39 | 3-9 | |
| Time range | 25s-68s | 7s-14s | |
Jiang Xin [Sat, 14 Feb 2026 06:08:46 +0000 (14:08 +0800)]
l10n: docs: add AGENTS.md with update POT instructions
Add a new documentation file po/AGENTS.md that provides agent-specific
instructions for generating or updating po/git.pot, separating them
from the general po/README.md. This separation allows for more targeted
optimization of AI agent workflows.
Performance evaluation with the Qwen model:
# Before: No agent-specific instructions; use po/README.md for
# reference.
git-po-helper agent-test --runs=5 --agent=qwen update-pot \
--prompt="Update po/git.pot according to po/README.md"
# Phase 1: add the instructions to po/README.md; the prompt
# references po/README.md during execution
git-po-helper agent-test --runs=5 --agent=qwen update-pot \
--prompt="Update po/git.pot according to po/README.md"
# Phase 2: add the instructions to po/AGENTS.md; use the built-in
# prompt that references po/AGENTS.md during execution
git-po-helper agent-test --runs=5 --agent=qwen update-pot
Benchmark results (5-run average):
Phase 1 - Optimizing po/README.md:
| Metric | Before | Phase 1 | Improvement |
|-------------|---------|---------|-------------|
| Turns | 17 | 5 | -71% |
| Exec. time | 34s | 14s | -59% |
| Turn range | 3-36 | 3-7 | |
| Time range | 10s-59s | 9s-19s | |
| Metric | Before | Phase 2 | Improvement |
|-------------|---------|---------|-------------|
| Turns | 17 | 3 | -82% |
| Exec. time | 34s | 8s | -76% |
| Turn range | 3-36 | 3-3 | |
| Time range | 10s-59s | 6s-9s | |
Separating agent-specific instructions into AGENTS.md provides:
- More focused and concise instructions for AI agents
- Cleaner README.md for human readers
- An additional 11% reduction in turns and 17% reduction in execution
time
- More consistent behavior (turn range reduced from 3-7 to 3-3)
Jiang Xin [Wed, 4 Feb 2026 03:39:22 +0000 (11:39 +0800)]
l10n: add .gitattributes to simplify location filtering
To simplify the location filtering process for l10n contributors when
committing po/XX.po files, add filter attributes for selected PO
files to the repository. This ensures all contributors automatically
get the same filter configuration without manual setup in
.git/info/attributes.
The default filter (gettext-no-location) is applied to all .po files
except:
- Legacy, unmaintained PO files that still contain location comments.
Leaving the filter off avoids index vs working-tree discrepancies for
these files. The CI pipeline will report an error when future updates
touch these legacy files.
- Some PO files use a different filter that strips only line numbers
from location comments while keeping filenames.
Contributors still need to manually define the filter drivers via
git-config as documented in po/README.md.
Four PO files that use location filtering (po/ca.po, po/es.po, po/ga.po,
po/ru.po) were batch-modified so their on-disk format matches the filter
output (e.g. line wrapping), avoiding index vs working-tree mismatch.
Additionally, po/README.md has been reorganized: the material on
preparing location-less PO files for commit has been moved from
"Updating a XX.po file" to a separate "Preparing a XX.po file for
commit" section. This prevents AI agents from introducing unrelated
operations when updating PO files.
Basically, what's happening here is that we spawn git-fast-import(1) and
wait for it to output a certain string, "progress checkpoint". Curiously
though, what we end up reading is "rogress checkpoint" -- so the first
byte of the expected string is missing.
Same as in the preceding commit, this seems to be a bug in Dash itself
that bisects to c5bf970 (expand: Add multi-byte support to pmatch,
2024-06-02). But other than in the preceding commit, this bug has
already been fixed upstream in 079059a (input: Fix heap-buffer-overflow
in preadbuffer on long lines, 2026-02-11), which is part of v0.5.13.2.
For now though, work around the bug by waiting for the expected output
in a different way. There is no good reason why one version should work
better than the other, but at least the new version doesn't exhibit the
bug. And, if you ask me, it's also slightly easier to read.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: work around multibyte bug in quoted heredocs with Dash v0.5.13
When executing our test suite with Dash v0.5.13.2 one can observe
several test failures that all have the same symptoms: we have a quoted
heredoc that contains multibyte characters, but the final data does not
match what we actually wanted to write. One such example is in t0300,
where we see the diffs like the following:
While seemingly the same, the data that we've written via the heredoc
contains some invisible bytes. The expected hex representation of the
string is:
7065 72c3 ba2e 6769 74 per...git
But what we actually get instead is this string:
7065 7285 02c3 ba02 852e 6769 74 per.......git
What's important to note here is that the multibyte character exists in
both versions. But in the broken version we see that the bytes are
wrapped in a sequence of "85 02" and "02 85". This is the CTLMBCHAR byte
sequence of Dash, which it uses internally to quote multibyte sequences.
As it turns out, this bug was introduced in c5bf970 (expand: Add
multi-byte support to pmatch, 2024-06-02), which adds multibyte support
to more contexts of Dash. One of these contexts seems to be in heredocs,
and Dash _does_ correctly unquote these multibyte sequences when using
an unquoted heredoc. But the bug seems to be that this unquoting does
not happen in quoted heredocs, and the bug still exists on the latest
"master" branch.
For now, work around the bug by using unquoted heredocs instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In our codebase we have a couple of wrappers around mmap(3p) that allow
us to reimplement the syscall on platforms that don't have it natively,
like for example Windows. Other projects that embed the reftable library
may have a different infra though to hook up mmap wrappers, but these
are currently hard to integrate.
Provide the infrastructure to let projects easily define the mmap
interface with a custom struct and custom functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/system: add abstraction to retrieve time in milliseconds
We directly call gettimeofday(3p), which may not be available on some
platforms. Provide the infrastructure to let projects easily use their
own implementations of this function.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/fsck: use REFTABLE_UNUSED instead of UNUSED
While we have the reftable-specific `REFTABLE_UNUSED` header, we
accidentally introduced a new usage of the Git-specific `UNUSED` header
into the reftable library in 9051638519 (reftable: add code to
facilitate consistency checks, 2025-10-07).
Convert the site to use `REFTABLE_UNUSED`.
Ideally, we'd move the definition of `UNUSED` into "git-compat-util.h"
so that it becomes in accessible to the reftable library. But this is
unfortunately not easily possible as "compat/mingw-posix.h" requires
this macro, and this header is included by "compat/posix.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: provide fsync(3p) via system header
Users of the reftable library are expected to provide their own function
callback in cases they want to sync(3p) data to disk via the reftable
write options. But if no such function was provided we end up calling
fsync(3p) directly, which may not even be available on some systems.
While dropping the explicit call to fsync(3p) would work, it would lead
to an unsafe default behaviour where a project may have forgotten to set
up the callback function, and that could lead to potential data loss. So
this is not a great solution.
Instead, drop the callback function and make it mandatory for the
project to define fsync(3p). In the case of Git, we can then easily
inject our custom implementation via the "reftable-system.h" header so
that we continue to use `fsync_component()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're including a couple of standard headers like <stdint.h> in a bunch
of locations, which makes it hard for a project to plug in their own
logic for making required functionality available. For us this is for
example via "compat/posix.h", which already includes all of the system
headers relevant to us.
Introduce a new "reftable-system.h" header that allows projects to
provide their own headers. This new header is supposed to contain all
the project-specific bits to provide the POSIX-like environment, and some
additional supporting code. With this change, we thus have the following
split in our system-specific code:
- "reftable/reftable-system.h" is the project-specific header that
provides a POSIX-like environment. Every project is expected to
provide their own implementation.
- "reftable/system.h" contains the project-independent definition of
the interfaces that a project needs to implement. This file should
not be touched by a project.
- "reftable/system.c" contains the project-specific implementation of
the interfaces defined in "system.h". Again, every project is
expected to provide their own implementation.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:16 +0000 (00:15 -0400)]
refs/files-backend: drop const to fix strchr() warning
In show_one_reflog_ent(), we're fed a writable strbuf buffer, which we
parse into the various reflog components. We write a NUL over email_end
to tie off one of the fields, and thus email_end must be non-const.
But with a C23 implementation of libc, strchr() will now complain when
assigning the result to a non-const pointer from a const one. So we can
fix this by making the source pointer non-const.
But there's a catch. We derive that source pointer by parsing the line
with parse_oid_hex_algop(), which requires a const pointer for its
out-parameter. We can work around that by teaching it to use our
CONST_OUTPARAM() trick, just like skip_prefix().
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:14 +0000 (00:15 -0400)]
http: drop const to fix strstr() warning
In redact_sensitive_header(), a C23 implementation of libc will complain
that strstr() assigns the result from "const char *cookie" to "char
*semicolon".
Ultimately the memory is writable. We're fed a strbuf, generate a const
pointer "sensitive_header" within it using skip_iprefix(), and then
assign the result to "cookie". So we can solve this by dropping the
const from "cookie" and "sensitive_header".
However, this runs afoul of skip_iprefix(), which wants a "const char
**" for its out-parameter. We can solve that by teaching skip_iprefix()
the same "make sure out is at least as const as in" magic that we
recently taught to skip_prefix().
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:12 +0000 (00:15 -0400)]
range-diff: drop const to fix strstr() warnings
This is another case where we implicitly drop the "const" from a pointer
by feeding it to strstr() and assigning the result to a non-const
pointer. This is OK in practice, since the const pointer originally
comes from a writable source (a strbuf), but C23 libc implementations
have started to complain about it.
We do write to the output pointer, so it needs to remain non-const. We
can just switch the input pointer to also be non-const in this case. By
itself that would run into problems with calls to skip_prefix(), but
since that function has now been taught to match in/out constness
automatically, it just works without us doing anything further.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:10 +0000 (00:15 -0400)]
pkt-line: make packet_reader.line non-const
The "line" member of a packet_reader struct is marked as const. This
kind of makes sense, because it's not its own allocated buffer that
should be freed, and we often use const to indicate that. But it is
always writable, because it points into the non-const "buffer" member.
And we rely on this writability in places like send-pack and
receive-pack, where we parse incoming packet contents by writing NULs
over delimiters. This has traditionally worked because we implicitly
cast away the constness with strchr() like:
const char *head;
char *p;
head = reader->line;
p = strchr(head, ' ');
Since C23 libc provides a generic strchr() to detect this implicit
const removal, this now generate a compiler warning on some platforms
(like recent glibc).
We can fix it by marking "line" as non-const, as well as a few
intermediate variables (like "head" in the above example). Note that by
itself, switching to a non-const variable would cause problems with this
line in send-pack.c:
if (!skip_prefix(reader->line, "unpack ", &reader->line))
But due to our skip_prefix() magic introduced in the previous commit,
this compiles fine (both the in and out-parameters are non-const, so we
know it is safe).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:07 +0000 (00:15 -0400)]
skip_prefix(): check const match between in and out params
The skip_prefix() function takes in a "const char *" string, and returns
via a "const char **" out-parameter that points somewhere in that
string. This is fine if you are operating on a const string, like:
const char *in = ...;
const char *out;
if (skip_prefix(in, "foo", &out))
...look at out...
It is also OK if "in" is not const but "out" is, as we add an implicit
const when we pass "in" to the function. But there's another case where
this is limiting. If we want both fields to be non-const, like:
it doesn't work. The compiler will complain about the type mismatch in
passing "&out" to a parameter which expects "const char **". So to make
this work, we have to do an explicit cast.
But such a cast is ugly, and also means that we run afoul of making this
mistake:
which causes us to write to the memory pointed by "in", which was const.
We can imagine these four cases as:
(1) const in, const out
(2) non-const in, const out
(3) non-const in, non-const out
(4) const in, non-const out
Cases (1) and (2) work now. We would like case (3) to work but it
doesn't. But we would like to catch case (4) as a compile error.
So ideally the rule is "the out-parameter must be at least as const as
the in-parameter". We can do this with some macro trickery. We wrap
skip_prefix() in a macro so that it has access to the real types of
in/out. And then we pass those parameters through another macro which:
1. Fails if the "at least as const" rule is not filled.
2. Casts to match the signature of the real skip_prefix().
There are a lot of ways to implement the "fails" part. You can use
__builtin_types_compatible_p() to check, and then either our
BUILD_ASSERT macros or _Static_assert to fail. But that requires some
conditional compilation based on compiler feature. That's probably OK
(the fallback would be to just cast without catching case 4). But we can
do better.
The macro I have here uses a ternary with a dead branch that tries to
assign "in" to "out", which should work everywhere and lets the compiler
catch the problem in the usual way. With an input like this:
foo.c: In function ‘bad_drop_const’:
foo.c:2:35: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
2 | ((const char **)(0 ? ((*(out) = (in)),(out)) : (out)))
| ^
foo.c:4:31: note: in expansion of macro ‘CONST_OUTPARAM’
4 | #define foo(in,out) foo((in), CONST_OUTPARAM((in), (out)))
| ^~~~~~~~~~~~~~
foo.c:23:9: note: in expansion of macro ‘foo’
23 | foo(x, y);
| ^~~
It's a bit verbose, but I think makes it reasonably clear what's going
on. Using BUILD_ASSERT_OR_ZERO() ends up much worse. Using
_Static_assert you can be a bit more informative, but that's not
something we use at all yet in our code-base (it's an old gnu-ism later
standardized in C11).
Our generic macro only works for "const char **", which is something we
could improve by using typeof(in). But that introduces more portability
questions, and also some weird corner cases (e.g., around implicit void
conversion).
This patch just introduces the concept. We'll make use of it in future
patches.
Note that we rename skip_prefix() to skip_prefix_impl() here, to avoid
expanding the macro when defining the function. That's not strictly
necessary since we could just define the macro after defining the inline
function. But that would not be the case for a non-inline function (and
we will apply this technique to them later, and should be consistent).
It also gives us more freedom about where to define the macro. I did so
right above the definition here, which I think keeps the relevant bits
together.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:05 +0000 (00:15 -0400)]
pseudo-merge: fix disk reads from find_pseudo_merge()
The goal of this commit was to fix a const warning when compiling
with new versions of glibc, but ended up untangling a much deeper
problem.
The find_pseudo_merge() function does a bsearch() on the "commits"
pointer of a pseudo_merge_map. This pointer ultimately comes from memory
mapped from the on-disk bitmap file, and is thus not writable.
The "commits" array is correctly marked const, but the result from
bsearch() is returned directly as a non-const pseudo_merge_commit
struct. Since new versions of glibc annotate bsearch() in a way that
detects the implicit loss of const, the compiler now warns.
My first instinct was that we should be returning a const struct. That
requires apply_pseudo_merges_for_commit() to mark its local pointer as
const. But that doesn't work! If the offset field has the high-bit set,
we look it up in the extended table via nth_pseudo_merge_ext(). And that
function then feeds our const struct to read_pseudo_merge_commit_at(),
which writes into it by byte-swapping from the on-disk mmap.
But I think this points to a larger problem with find_pseudo_merge(). It
is not just that the return value is missing const, but it is missing
that byte-swapping! And we know that byte-swapping is needed here,
because the comparator we use for bsearch() also calls our
read_pseudo_merge_commit_at() helper.
So I think the interface is all wrong here. We should not be returning a
pointer to a struct which was cast from on-disk data. We should be
filling in a caller-provided struct using the bytes we found,
byte-swapping the values.
That of course raises the dual question: how did this ever work, and
does it work now? The answer to the first part is: this code does not
seem to be triggered in the test suite at all. If we insert a BUG("foo")
call into apply_pseudo_merges_for_commit(), it never triggers.
So I think there is something wrong or missing from the test setup, and
this bears further investigation. Sadly the answer to the second part
("does it work now") is still "no idea". I _think_ this takes us in a
positive direction, but my goal here is mainly to quiet the compiler
warning. Further bug-hunting on this experimental feature can be done
separately.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 2 Apr 2026 04:15:03 +0000 (00:15 -0400)]
find_last_dir_sep(): convert inline function to macro
The find_last_dir_sep() function is implemented as an inline function
which takes in a "const char *" and returns a "char *" via strrchr().
That means that just like strrchr(), it quietly removes the const from
our pointer, which could lead to accidentally writing to the resulting
string.
But C23 versions of libc (including recent glibc) annotate strrchr()
such that the compiler can detect when const is implicitly lost, and it
now complains about the call in this inline function.
We can't just switch the return type of the function to "const char *",
though. Some callers really do want a non-const string to be returned
(and are OK because they are feeding a non-const string into the
function).
The most general solution is for us to annotate find_last_dir_sep() in
the same way that is done for strrchr(). But doing so relies on using
C23 generics, which we do not otherwise require.
Since this inline function is wrapping a single call to strrchr(), we
can take a shortcut. If we implement it as a macro, then the original
type information is still available to strrchr(), and it does the check
for us.
Note that this is just one implementation of find_last_dir_sep(). There
is an alternate implementation in compat/win32/path-utils.h. It doesn't
suffer from the same warning, as it does not use strrchr() and just
casts away const explicitly. That's not ideal, and eventually we may
want to conditionally teach it the same C23 generic trick that strrchr()
uses. But it has been that way forever, and our goal here is just
quieting new warnings, not improving const-checking.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>