Junio C Hamano [Thu, 21 May 2026 23:48:20 +0000 (08:48 +0900)]
Merge branch 'ps/maintenance-daemonize-lockfix'
"git maintenance" that goes background did not use the lockfile to
prevent multiple maintenance processes from running at the same
time, which has been corrected.
* ps/maintenance-daemonize-lockfix:
run-command: honor "gc.auto" for auto-maintenance
builtin/maintenance: fix locking with "--detach"
Junio C Hamano [Thu, 21 May 2026 03:28:55 +0000 (12:28 +0900)]
Merge branch 'js/mingw-no-nedmalloc' into maint-2.54
Stop using unmaintained custom allocator in Windows build which was
the last user of the code.
* js/mingw-no-nedmalloc:
mingw: remove the vendored compat/nedmalloc/ subtree
mingw: drop the build-system plumbing for nedmalloc
mingw: stop using nedmalloc
Junio C Hamano [Thu, 21 May 2026 03:27:47 +0000 (12:27 +0900)]
Merge branch 'js/maintenance-fix-deadlock-on-win10' into maint-2.54
To help Windows 10 installations, avoid removing files whose
contents are still mmap()'ed.
* js/maintenance-fix-deadlock-on-win10:
maintenance(geometric): do release the `.idx` files before repacking
mingw: optionally use legacy (non-POSIX) delete semantics
Junio C Hamano [Thu, 21 May 2026 03:26:28 +0000 (12:26 +0900)]
Merge branch 'js/ci-github-actions-update' into maint-2.54
Update various GitHub Actions versions.
* js/ci-github-actions-update:
l10n: bump mshick/add-pr-comment from v2 to v3
ci: bump git-for-windows/setup-git-for-windows-sdk from v1 to v2
ci: bump actions/checkout from v5 to v6
ci: bump actions/github-script from v8 to v9
ci: bump actions/{upload,download}-artifact to v7 and v8
ci: bump microsoft/setup-msbuild from v2 to v3
Junio C Hamano [Thu, 21 May 2026 03:06:47 +0000 (12:06 +0900)]
Merge branch 'kn/refs-generic-helpers'
Refactor service routines in the ref subsystem backends.
* kn/refs-generic-helpers:
refs: use peeled tag values in reference backends
refs: add peeled object ID to the `ref_update` struct
refs: move object parsing to the generic layer
update-ref: handle rejections while adding updates
update-ref: move `print_rejected_refs()` up
refs: return `ref_transaction_error` from `ref_transaction_update()`
refs: extract out reflog config to generic layer
refs: introduce `ref_store_init_options`
refs: remove unused typedef 'ref_transaction_commit_fn'
Junio C Hamano [Wed, 20 May 2026 01:30:57 +0000 (10:30 +0900)]
Merge branch 'ps/history-fixup'
"git history" learned "fixup" command.
* ps/history-fixup:
builtin/history: introduce "fixup" subcommand
builtin/history: generalize function to commit trees
replay: allow callers to control what happens with empty commits
Some tests assume that bare repository accesses are by default
allowed; rewrite some of them to avoid the assumption, rewrite
others to explicitly set safe.bareRepository to allow them.
* js/adjust-tests-to-explicitly-access-bare-repo:
safe.bareRepository: default to "explicit" with WITH_BREAKING_CHANGES
status tests: filter `.gitconfig` from status output
ls-files tests: filter `.gitconfig` from `--others` output
t5601: restore `.gitconfig` after includeIf test
t1305: use `--git-dir=.` for bare repo in include cycle test
t1300: remove global config settings injected by test-lib.sh
t7900: do not let `$HOME/.gitconfig` interfere with XDG tests
test-lib: allow bare repository access when breaking changes are enabled
Junio C Hamano [Wed, 20 May 2026 01:30:56 +0000 (10:30 +0900)]
Merge branch 'en/diffstat-utf8-truncation-fix'
The computation to shorten the filenames shown in diffstat measured
width of individual UTF-8 characters to add up, but forgot to take
into account error cases (e.g., an invalid UTF-8 sequence, or a
control character).
* en/diffstat-utf8-truncation-fix:
diff: fix out-of-bounds reads and NULL deref in diffstat UTF-8 truncation
Junio C Hamano [Wed, 20 May 2026 01:30:56 +0000 (10:30 +0900)]
Merge branch 'js/mingw-no-nedmalloc'
Stop using unmaintained custom allocator in Windows build which was
the last user of the code.
* js/mingw-no-nedmalloc:
mingw: remove the vendored compat/nedmalloc/ subtree
mingw: drop the build-system plumbing for nedmalloc
mingw: stop using nedmalloc
Update code paths that assumed "unsigned long" was long enough for
"size_t".
* js/objects-larger-than-4gb-on-windows:
ci: run expensive tests on push builds to integration branches
t5608: mark >4GB tests as EXPENSIVE
test-tool synthesize: add precomputed SHA-256 pack for 4 GiB + 1
test-tool synthesize: precompute pack for 4 GiB + 1
test-tool synthesize: use the unsafe hash for speed
t5608: add regression test for >4GB object clone
test-tool: add a helper to synthesize large packfiles
delta, packfile: use size_t for delta header sizes
odb, packfile: use size_t for streaming object sizes
git-zlib: handle data streams larger than 4GB
index-pack, unpack-objects: use size_t for object size
"git rebase --update-refs", when used with an rebase.instructionFormat
with "%d" (describe) in it, tried to update local branch HEAD by
mistake, which has been corrected.
Junio C Hamano [Tue, 19 May 2026 00:57:44 +0000 (09:57 +0900)]
Merge branch 'kh/name-rev-custom-format'
A new builtin "git format-rev" is introduced for pretty formatting
one revision expression per line or commit object names found in
running text.
* kh/name-rev-custom-format:
format-rev: introduce builtin for on-demand pretty formatting
name-rev: make dedicated --annotate-stdin --name-only test
name-rev: factor code for sharing with a new command
name-rev: run clang-format before factoring code
name-rev: wrap both blocks in braces
Junio C Hamano [Tue, 19 May 2026 00:57:43 +0000 (09:57 +0900)]
Merge branch 'en/xdiff-cleanup-3'
Preparation of the xdiff/ codebase to work with Rust.
* en/xdiff-cleanup-3:
xdiff/xdl_cleanup_records: make execution of action easier to follow
xdiff/xdl_cleanup_records: make setting action easier to follow
xdiff/xdl_cleanup_records: make limits more clear
xdiff/xdl_cleanup_records: use unambiguous types
xdiff: use unambiguous types in xdl_bogo_sqrt()
xdiff/xdl_cleanup_records: delete local recs pointer
Junio C Hamano [Tue, 19 May 2026 00:57:43 +0000 (09:57 +0900)]
Merge branch 'mc/http-emptyauth-negotiate-fix'
The 'http.emptyAuth=auto' configuration now correctly attempts
Negotiate authentication before falling back to manual credentials.
This allows seamless Kerberos ticket-based authentication without
requiring users to explicitly set 'http.emptyAuth=true'.
* mc/http-emptyauth-negotiate-fix:
doc: clarify http.emptyAuth values
t5563: add tests for http.emptyAuth with Negotiate
http: attempt Negotiate auth in http.emptyAuth=auto mode
http: extract http_reauth_prepare() from retry paths
Junio C Hamano [Sun, 17 May 2026 13:58:30 +0000 (22:58 +0900)]
Merge branch 'hn/git-checkout-m-with-stash'
"git checkout -m another-branch" was invented to deal with local
changes to paths that are different between the current and the new
branch, but it gave only one chance to resolve conflicts. The command
was taught to create a stash to save the local changes.
* hn/git-checkout-m-with-stash:
checkout -m: autostash when switching branches
checkout: rollback lock on early returns in merge_working_tree
sequencer: teach autostash apply to take optional conflict marker labels
sequencer: allow create_autostash to run silently
stash: add --label-ours, --label-theirs, --label-base for apply
Junio C Hamano [Sun, 17 May 2026 13:58:30 +0000 (22:58 +0900)]
Merge branch 'ss/t7004-unhide-git-failures'
Test clean-up.
* ss/t7004-unhide-git-failures:
t7004: avoid subshells to capture git exit codes
t7004: dynamically grab expected state in tests
t7004: drop hardcoded tag count for state verification
Junio C Hamano [Sun, 17 May 2026 13:58:29 +0000 (22:58 +0900)]
Merge branch 'en/backfill-fixes-and-edges'
The 'git backfill' command now rejects revision-limiting options that
are incompatible with its operation, uses standard documentation for
revision ranges, and includes blobs from boundary commits by default
to improve performance of subsequent operations.
* en/backfill-fixes-and-edges:
backfill: default to grabbing edge blobs too
backfill: document acceptance of revision-range in more standard manner
backfill: reject rev-list arguments that do not make sense
Jeff King [Sat, 16 May 2026 02:23:10 +0000 (22:23 -0400)]
commit: handle large commit messages in utf8 verification
Running t4205 under UBSan with the EXPENSIVE prereq enabled triggers an
error when we try to create a commit message that is over 2GB:
commit.c:1574:6: runtime error: signed integer overflow:
-2147483648 - 1 cannot be represented in type 'int'
The problem is that find_invalid_utf8() is not prepared to handle
large buffers, as it uses an "int" to represent buffer sizes and
offsets.
We can fix this with a few changes:
1. We'll take in "len" as a size_t (which is what the caller has
anyway, since it's working with a strbuf).
2. We need to return a size_t to give the offset to the invalid utf8,
but we also need a sentinel value for "no invalid value"
(previously "-1"). Let's split these to return a bool for "found
invalid utf8" and then pass back the offset as an out-parameter.
We'll switch the function name to match the new semantics.
3. The caller in verify_utf8() uses a "long" to store buffer
positions, which is a bit funny. This goes back to 08a94a145c
(commit/commit-tree: correct latin1 to utf-8, 2012-06-28) and is
perhaps trying to match our use of "unsigned long" for object sizes
(though we don't care about it ever becoming negative here). This
should be a size_t, too, as some platforms (like Windows) still use
a 32-bit long on machines with 64-bit pointers.
4. The "bytes" field within find_invalid_utf() does not have range
problems. It is the number of bytes the utf8 sequence claims to
have, so is limited by how many bits can be set in a single 8-bit
byte. However, if we leave it as an "int" then the compiler will
complain about the sign mismatch when comparing it to "len". So
let's make it unsigned, too.
All of this is a little silly, of course, because 2GB text commit
messages are clearly nonsense. So we might consider rejecting them
outright, but it is easy enough to make these helper functions more
robust in the meantime.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 16 May 2026 02:16:22 +0000 (22:16 -0400)]
apply: plug leak on "patch too large" error
In apply_patch(), we return immediately if read_patch_file() returns an
error. Traditionally this was OK, since an error from strbuf_read()
would restore the strbuf to its unallocated state.
But since f1c0e3946e (apply: reject patches larger than ~1 GiB,
2022-10-25), we may also return an error if we successfully read the
patch but it is too large. In this case we leak the strbuf contents when
apply_patch() returns.
You can see it in action by running t4141 under LSan with the EXPENSIVE
prereq enabled.
We can fix this in one of two places:
1. In read_patch_file(), we could release the buffer before returning
the error, behaving more like a raw strbuf_read() call.
2. In apply_patch(), we can release the strbuf ourselves before
returning.
I picked the latter, since it future proofs us against read_patch_file()
getting new error modes. We also have a cleanup label in that function
already, so now our error handling at this spot matches the rest of the
function (and all of the variables are initialized such that the rest of
the cleanup is correctly a noop at this point).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "gc.auto" configuration has traditionally been used to turn off
running git-gc(1) as part of our auto-maintenance. We have eventually
switched over to git-maintenance(1) in a95ce12430 (maintenance: replace
run_auto_gc(), 2020-09-17), and with 1942d48380 (maintenance: optionally
skip --auto process, 2020-08-28) we have introduced "maintenance.auto"
to control whether or not to run auto-maintenance.
At that point though we still shelled out to git-gc(1) internally. So
if "gc.auto=0" was set we would still _execute_ git-maintenance(1), but
the command would have exited fast because git-gc(1) itself knew to
honor the config key.
This has recently changed though, as we have adapted the default
maintenance strategy to not use git-gc(1) anymore. The consequence is
that "gc.auto=0" doesn't have an effect anymore, which is a somewhat
surprising change in behaviour for our users.
Adapt `run_auto_maintenance()` so that it knows to also read "gc.auto",
similar to how it also reads both "maintenance.autoDetach" and
"gc.autoDetach".
Reported-by: Jean-Christophe Manciot <actionmystique@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running git-maintenance(1), we create a lockfile that is supposed
to keep other maintenance processes from running at the same time. This
lockfile is broken though in case the "--detach" flag is passed: the
lockfile is created by the parent process and will be cleaned up either
manually or on exit. But when detaching, the parent will exit before all
of the background maintenance tasks have been run, and consequently the
lock only covers a smaller part of the whole maintenance process.
Fix this bug by reassigning all tempfiles from the parent process to the
child process when daemonizing so that it becomes the responsibility of
the child to clean them up.
Note that this is a broader fix, as we now always reassign tempfiles
when daemonizing. This is a natural consequence of the semantics of
`daemonize()` though, as it essentially promises to continue running the
current process in the background. It is thus sensible to have that
function perform the whole dance of assigning resources to the child
process, including tempfiles.
There's only a single other caller in "daemon.c", but that process
doesn't create any tempfiles before the call to `daemonize()` and is
thus not impacted by this change.
Reported-by: Jean-Christophe Manciot <actionmystique@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <stolee@gmail.com> Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
To help Windows 10 installations, avoid removing files whose
contents are still mmap()'ed.
* js/maintenance-fix-deadlock-on-win10:
maintenance(geometric): do release the `.idx` files before repacking
mingw: optionally use legacy (non-POSIX) delete semantics
Junio C Hamano [Tue, 12 May 2026 02:04:44 +0000 (11:04 +0900)]
Merge branch 'js/ci-github-actions-update'
Update various GitHub Actions versions.
* js/ci-github-actions-update:
l10n: bump mshick/add-pr-comment from v2 to v3
ci: bump git-for-windows/setup-git-for-windows-sdk from v1 to v2
ci: bump actions/checkout from v5 to v6
ci: bump actions/github-script from v8 to v9
ci: bump actions/{upload,download}-artifact to v7 and v8
ci: bump microsoft/setup-msbuild from v2 to v3
format-rev: introduce builtin for on-demand pretty formatting
Introduce a new builtin for pretty formatting one revision expression
per line or commit object names found in running text.
Sometimes you want to format commits. Most of the time you’re
walking the graph, e.g. getting a range of commits like
`master..topic`. That’s a job for git-log(1).
But there are times when you want to format commits that you encounter
on demand:
• Full hashes in running text that you might want to pretty-print
• git-last-modified(1) outputs full hashes that you can do the same
with
• git-cherry(1) has `-v` for commit subject, but maybe you want
something else?
But now you can’t use git-log(1), git-show(1), or git-rev-list(1):
• You can’t feed commits piecemeal to these commands, one input
for one output; they block until standard in is closed
• You can’t feed a list of possibly duplicate commits, like the output
of git-last-modified(1); they effectively deduplicate the output
Beyond these two points there’s also the input massage problem: you
cannot feed mixed input (revisions mixed with arbitrary text).
One might hope that git-cat-file(1) can save us. But it doesn’t
support pretty formats.
But there is one command that already both handles revisions as
arguments, revisions on standard input, and even revisions mixed in
with arbitrary text. Namely git-name-rev(1): the command for outputting
symbolic names for commits.
We made some room in `builtin/name-rev.c` two commits ago. Let’s
now add this new git-format-rev(1) command. Taking inspiration from
git-name-rev(1), there are two modes:
• revs: like git-name-rev(1) in argv mode, but one revision per line
on standard in
• text: like git-name-rev(1) with `--annotate-stdin`
***
We need to add this command to the exception list in
`t/t1517-outside-repo.sh` because it uses “EXPERIMENTAL!”
in the usage line.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
name-rev: make dedicated --annotate-stdin --name-only test
The previous commit split the `--name-only` handling:
1. `--annotate-stdin`: uses the new `struct command`
2. The rest: uses `struct name_ref_data`
But there is no dedicated test for the option combination in (1). That
means that the following tests will fail if you neglect to set
`command.u.name_only` properly:
name-rev --annotate-stdin works with commitGraph
name-rev --annotate-stdin works with non-monotonic timestamps
even though it has nothing to do with what these tests are supposed
to test.
Let’s add another regression test now that it is relevant.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
name-rev: factor code for sharing with a new command
We are about to introduce a new command git-format-rev(1) to this
file. Let’s factor some code so that we can share it with the new
command.
We want to be able to format commits found in freeform text, and
git-name-rev(1) already has a function for that but for symbolic
names. Let’s use a tagged union for the command-specific payload.
No functional changes.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Samo Pogačnik [Mon, 11 May 2026 19:20:42 +0000 (21:20 +0200)]
shallow: fix relative deepen on non-shallow repositories
The commit "3ef68ff40e (shallow: handling fetch relative-deepen,
2026-02-15)" introduced a bug where using --deepen=<n> on a non-
shallow repository incorrectly treated the value as an absolute
depth, resulting in a shallow fetch and truncated history.
This patch prevents any modification when a relative deepen is
requested on a non-shallow repository.
A test is added to ensure that history is not changed when
--deepen is used on a non-shallow repository.
Reported-by: Owen Stephens <owen@owenstephens.co.uk> Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
build: tolerate use of _Generic from glibc 2.43 with Clang
When building with `make DEVELOPER=1` we explicitly pass "-std=gnu99" to
the compiler so that we don't start leaning on features exposed by more
recent versions of the C standard. Unfortunately though, glibc 2.43
started to use type-generic expressions. This works alright with GCC,
but when compiling with Clang this leads to errors:
$ make DEVELOPER=1 CC=clang
CC daemon.o
In file included from daemon.c:3:
./git-compat-util.h:344:11: error: '_Generic' is a C11 extension [-Werror,-Wc11-extensions]
344 | return !!strchr(path, '/');
| ^
/usr/include/string.h:265:3: note: expanded from macro 'strchr'
265 | __glibc_const_generic (S, const char *, strchr (S, C))
| ^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:838:3: note: expanded from macro '__glibc_const_generic'
838 | _Generic (0 ? (PTR) : (void *) 1, \
| ^
In theory, the `__glibc_const_generic` macro does have feature gating:
But this feature gating isn't effective because `_has_extension()` will
always evaluate to true as C generics _are_ available as a language
extension to GNU C99 when using Clang. This would have been different if
`_has_feature()` was used instead, in which case it would have properly
evaluated to `false`.
GCC has a workaround to squelch this warning from standard system
headers, but because clang fails due to [-Werror,-Wc11-extensions],
as it lacks the corresponding workaround.
For both meson and Makefile, pass -Wno-c11-extensions when we are
building with clang.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Helped-by: Shardul Natu <snatu@google.com>
[jc: replaced Makefile side with Shardul's approach] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 11 May 2026 04:49:05 +0000 (13:49 +0900)]
Merge branch 'jc/neuter-sideband-fixup'
Try to resurrect and reboot a stalled "avoid sending risky escape
sequences taken from sideband to the terminal" topic by Dscho. The
plan is to keep it in 'next' long enough to see if anybody screams
with the "everything dropped except for ANSI color escape sequences"
default.
* jc/neuter-sideband-fixup:
sideband: drop 'default' configuration
sideband: offer to configure sanitizing on a per-URL basis
sideband: add options to allow more control sequences to be passed through
sideband: do allow ANSI color sequences by default
sideband: introduce an "escape hatch" to allow control characters
sideband: mask control characters
Junio C Hamano [Mon, 11 May 2026 01:05:54 +0000 (10:05 +0900)]
Merge branch 'ps/test-set-e-clean'
The test suite harness and many individual test scripts have been
updated to work correctly when 'set -e' is in effect, which helps
detect misspelled test commands.
* ps/test-set-e-clean:
t: detect errors outside of test cases
t9902: fix use of `read` with `set -e`
t6002: fix use of `expr` with `set -e`
t1301: don't fail in case setfacl(1) doesn't exist or fails
t0008: silence error in subshell when using `grep -v`
t: prepare `test_when_finished ()`/`test_atexit()` for `set -e`
t: prepare execution of potentially failing commands for `set -e`
t: prepare conditional test execution for `set -e`
t: prepare `git config --unset` calls for `set -e`
t: prepare `stop_git_daemon ()` for `set -e`
t: prepare `test_must_fail ()` for `set -e`
t: prepare `test_match_signal ()` calls for `set -e`
Junio C Hamano [Mon, 11 May 2026 01:05:53 +0000 (10:05 +0900)]
Merge branch 'ar/parallel-hooks'
Hook scripts defined via the configuration system can now be
configured to run in parallel.
* ar/parallel-hooks:
t1800: test SIGPIPE with parallel hooks
hook: allow hook.jobs=-1 to use all available CPU cores
hook: add hook.<event>.enabled switch
hook: move is_known_hook() to hook.c for wider use
hook: warn when hook.<friendly-name>.jobs is set
hook: add per-event jobs config
hook: add -j/--jobs option to git hook run
hook: mark non-parallelizable hooks
hook: allow pre-push parallel execution
hook: allow parallel hook execution
hook: parse the hook.jobs config
config: add a repo_config_get_uint() helper
repository: fix repo_init() memleak due to missing _clear()
Junio C Hamano [Mon, 11 May 2026 01:05:53 +0000 (10:05 +0900)]
Merge branch 'cc/promisor-auto-config-url'
Promisor remote handling has been refactored and fixed in
preparation for auto-configuration of advertised remotes.
* cc/promisor-auto-config-url:
t5710: use proper file:// URIs for absolute paths
promisor-remote: remove the 'accepted' strvec
promisor-remote: keep accepted promisor_info structs alive
promisor-remote: refactor accept_from_server()
promisor-remote: refactor has_control_char()
promisor-remote: refactor should_accept_remote() control flow
promisor-remote: reject empty name or URL in advertised remote
promisor-remote: clarify that a remote is ignored
promisor-remote: pass config entry to all_fields_match() directly
promisor-remote: try accepted remotes before others in get_direct()
Junio C Hamano [Mon, 11 May 2026 01:05:51 +0000 (10:05 +0900)]
Merge branch 'sp/refs-reduce-the-repository'
Code clean-up to use the right instance of a repository instance in
calls inside refs subsystem.
* sp/refs-reduce-the-repository:
refs/reftable-backend: drop uses of the_repository
refs: remove the_hash_algo global state
refs: add struct repository parameter in get_files_ref_lock_timeout_ms()
Junio C Hamano [Sun, 10 May 2026 23:51:15 +0000 (08:51 +0900)]
ci: enable EXPENSIVE for contributor builds
Earlier, we enabled EXPENSIVE tests for pushes to integration
branches. As we didn't have any CI jobs that run these tests, this
was a step in the right direction.
It however is an ineffective and inefficient use of the maintainer
time, which does not scale, to allow contributors to send changes
that are less tested at the list, only to force the maintainer
notice breakages caused by their changes but only after these
changes are mixed with changes from other contributors. The
problematic topic needs to be isolated by bisecting, and it
historically has been done by the maintainer alone.
It is far better to let the problem identified early, preferably
before the problematic code leaves the hands of the original
developer. In order for it to happen, the test coverage of the
contributor tests must be at least as wide as the coverage of the
integration tests.
Enable expensive tests for CI jobs triggered by pull requests. This
will make each contributor take care of their own, which scales much
better.
Keep the expensive tests also enabled for the pushes of integration
branches, as that is the only place we can notice problems stemming
from mismerges and inter-topic interactions, even if the topics from
the contributors in isolation all passes these tests.
Junio C Hamano [Mon, 11 May 2026 00:05:28 +0000 (09:05 +0900)]
Merge branch 'js/objects-larger-than-4gb-on-windows' into jc/ci-enable-expensive
* js/objects-larger-than-4gb-on-windows:
ci: run expensive tests on push builds to integration branches
t5608: mark >4GB tests as EXPENSIVE
test-tool synthesize: add precomputed SHA-256 pack for 4 GiB + 1
test-tool synthesize: precompute pack for 4 GiB + 1
test-tool synthesize: use the unsafe hash for speed
t5608: add regression test for >4GB object clone
test-tool: add a helper to synthesize large packfiles
delta, packfile: use size_t for delta header sizes
odb, packfile: use size_t for streaming object sizes
git-zlib: handle data streams larger than 4GB
index-pack, unpack-objects: use size_t for object size
Abhinav Gupta [Sun, 10 May 2026 22:41:11 +0000 (15:41 -0700)]
rebase: ignore non-branch update-refs
The following Git configuration breaks git rebase --update-refs:
[rebase]
instructionFormat = %s%d
The '%d' format requests all available decorations for a commit,
filling the global decoration table with all of them,
which --update-refs then uses to populate 'update-ref' instructions
in the rebase todo list.
Specifically, this results in the following instruction:
update-ref HEAD
The todo parser then rejects the instruction:
error: update-ref requires a fully qualified refname e.g. refs/heads/HEAD
error: invalid line 3: update-ref HEAD
To fix, ignore decorations that are not local branches
when scanning through the table.
This matches the documented contract:
it moves branch refs under refs/heads/
and leaves display-only decorations (HEAD, tags, etc.) alone.
Verification:
A regression test that fails without this fix is included.
Signed-off-by: Abhinav Gupta <mail@abhinavg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 10 May 2026 12:42:04 +0000 (14:42 +0200)]
sideband: clear full line when printing remote messages
demultiplex_sideband() can write its remote output over active local
progress lines. That's why it has been using ANSI code Erase in Line on
smart terminals to clear the remainder of lines it writes since ebe8fa738d (fix display overlap between remote and local progress,
2007-11-04).
This erases the last character of remote lines that span the full width
of the terminal, though, as the cursor is stuck at the rightmost column
for them. It's the same effect as in the following command, which
clears the 1 and shows just the leading zeros:
$ EL="\033[K"
$ printf "%0${COLUMNS}d${EL}\n" 1
If we move the ANSI code to the start we get to see the 1 as well:
$ printf "${EL}%0${COLUMNS}d\n" 1
So do the same in demultiplex_sideband() and emit the ANSI code as a
prefix instead of a suffix to show messages in full even if they happen
to fill the whole width of a smart terminal.
Reported-by: Hugo Osvaldo Barrera <hugo@whynothugo.nl> Suggested-by: Chris Torek <chris.torek@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Saagar Jha [Sun, 10 May 2026 03:50:22 +0000 (03:50 +0000)]
submodule-config: fix reading submodule.fetchJobs
update_clone_config_from_gitmodules() passes &max_jobs to
config_from_gitmodules(), but max_jobs is already a pointer. This causes
the config value to be written to the wrong address and get dropped.
Pass max_jobs directly.
Signed-off-by: Saagar Jha <saagar@saagarjha.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci: run expensive tests on push builds to integration branches
Derrick Stolee suggested [1] that expensive tests should be run at a
regular cadence rather than on every PR iteration. Gate GIT_TEST_LONG
on push builds to the integration branches (next, master, main, maint)
so that the EXPENSIVE prereq is satisfied there but not during PR
validation, where the extra minutes of wall-clock time do not justify
themselves.
Helped-by: Derrick Stolee <derrickstolee@github.com> Assisted-by: Claude Opus 4.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even with precomputed pack constants that reduced the helper's
runtime from minutes to seconds, the >4GB clone tests still take
200-850 seconds across CI jobs. The bottleneck is no longer the
pack generation but the clone operations themselves: transporting,
unpacking, and indexing 4 GiB of data through unpack-objects and
index-pack is inherently expensive.
As Jeff King pointed out [1], t5608 alone takes 160 seconds on his
laptop while the rest of the entire test suite finishes in under 90
seconds, and the test's disk footprint (4+ GiB source repo, then
two clones) is problematic for developers who use RAM disks for
their trash directories.
Gate the >4GB tests on the EXPENSIVE prereq (which requires
GIT_TEST_LONG to be set) in addition to SIZE_T_IS_64BIT, keeping
them out of normal local test runs.
Add a SHA-256 entry to the fast_packs[] table. The pack prefix and
deflate block structure are identical to SHA-1 (the pack format does
not encode the hash algorithm in its header). Only the suffix differs:
SHA-256 OIDs are 32 bytes instead of 20, giving a 609-byte suffix
compared to 513 for SHA-1, and a different pack checksum.
The constants were generated by running the generic path inside a
repository initialized with --object-format=sha256.
Assisted-by: Claude Opus 4.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-tool synthesize: precompute pack for 4 GiB + 1
The synthesize helper hashes roughly 8 GiB of data through SHA-1 to
produce a 4 GiB + 1 pack (4 GiB for the pack checksum, 4 GiB for
the blob OID). Since the blob content is all NUL bytes, every byte
in the resulting pack file is deterministic for a given blob size and
hash algorithm.
Add a fast path that writes the pack from precomputed constants:
a 25-byte prefix (pack header, object header, zlib header, first
block header), the zero-filled bulk with periodic 5-byte deflate
block headers, and a 513-byte suffix (tree, two commits, empty tree,
pack SHA-1 checksum). This eliminates all SHA-1 and adler32
computation, making the helper purely I/O-bound.
The precomputed constants are stored in a struct fast_pack array
keyed by hash algorithm format_id, so that adding SHA-256 support
later requires only adding another array entry with its suffix.
The constants were generated by running the generic path and
extracting the non-zero bytes from the resulting pack file.
Benchmarks generating a 4 GiB + 1 pack (3 runs each, SHA1DC on
x86_64):
On CI, where t5608 currently takes 200-850 seconds depending on the
job, the fast path cuts the pack-generation phase from minutes to
seconds, leaving only the clone operations themselves.
Assisted-by: Claude Opus 4.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-tool synthesize: use the unsafe hash for speed
Jeff King pointed out on the mailing list [1] that t5608's new >4GB
test cases dominate the entire test suite runtime: 160 seconds on his
laptop when the rest of the suite finishes in under 90 seconds, and
305-850 seconds across CI jobs. The bottleneck is that the synthesize
helper hashes roughly 8 GB of data through SHA-1 (4 GB for the pack
checksum plus 4 GB for the blob OID) for a 4 GB+1 blob.
Since the helper generates known test data, collision detection is
unnecessary. Switch from repo->hash_algo to unsafe_hash_algo(), which
uses hardware-accelerated SHA-1 (via OpenSSL or Apple CommonCrypto)
when available.
Benchmarks on an x86_64 machine generating a 4 GB+1 pack (2 runs
each, interleaved):
SHA-1 backend Run 1 Run 2
SHA1DC (safe) 75s 80s
OpenSSL (unsafe) 21s 19s
The effect scales linearly. At 64 MB with 10 randomized interleaved
runs, the OpenSSL unsafe backend shows a 5.4x improvement (median
0.202s vs 1.088s) with tight variance (stdev 0.028s vs 0.095s).
The speedup is only realized when the build has a fast unsafe backend
compiled in. The CI's linux-TEST-vars job already sets
OPENSSL_SHA1_UNSAFE=YesPlease; macOS benefits from Apple CommonCrypto
when configured. On builds without a separate unsafe backend (such as
the default Windows builds), unsafe_hash_algo() returns the regular
collision-detecting implementation and the change is a no-op.
The shift overflow bug in index-pack and unpack-objects caused incorrect
object size calculation when the encoded size required more than 32 bits
of shift. This would result in corrupted or failed unpacking of objects
larger than 4GB.
Add a test that creates a pack file containing a 4GB+ blob using the
new 'test-tool synthesize pack --reachable-large' command, then clones
the repository to verify the fix works correctly.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-tool: add a helper to synthesize large packfiles
To test Git's behavior with very large pack files, we need a way to
generate such files quickly.
A naive approach using only readily-available Git commands would take
over 10 hours for a 4GB pack file, which is prohibitive.
Side-stepping Git's machinery and actual zlib compression by writing
uncompressed content with the appropriate zlib header makes things
much faster. The fastest method using this approach generates many
small, unreachable blob objects and takes about 1.5 minutes for 4GB.
However, this cannot be used because we need to test git clone, which
requires a reachable commit history.
Generating many reachable commits with small, uncompressed blobs takes
about 4 minutes for 4GB. But this approach 1) does not reproduce the
issues we want to fix (which require individual objects larger than
4GB) and 2) is comparatively slow because of the many SHA-1
calculations.
The approach taken here generates a single large blob (filled with NUL
bytes), along with the trees and commits needed to make it reachable.
This takes about 2.5 minutes for 4.5GB, which is the fastest option
that produces a valid, clonable repository with an object large enough
to trigger the bugs we want to test.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
delta, packfile: use size_t for delta header sizes
The delta header decoding functions return unsigned long, which
truncates on Windows for objects larger than 4GB. Introduce size_t
variants get_delta_hdr_size_sz() and get_size_from_delta_sz() that
preserve the full 64-bit size, and use them in packed_object_info()
where the size is needed for streaming decisions.
This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb, packfile: use size_t for streaming object sizes
The odb_read_stream structure uses unsigned long for the size field,
which is 32-bit on Windows even in 64-bit builds. When streaming
objects larger than 4GB, the size would be truncated to zero or an
incorrect value, resulting in empty files being written to disk.
Change the size field in odb_read_stream to size_t and introduce
unpack_object_header_sz() to return sizes via size_t pointer. Since
object_info.sizep remains unsigned long for API compatibility, use
temporary variables where the types differ, with comments noting the
truncation limitation for code paths that still use unsigned long.
Widening the producers to size_t in this way introduces a handful of
silent size_t -> unsigned long narrowings on Windows, all in
builtin/pack-objects.c, where the consumers are still typed
unsigned long. Make those narrowings explicit with
cast_size_t_to_ulong() so they assert loudly the moment an object
actually exceeds ULONG_MAX bytes:
- oe_get_size_slow() returns unsigned long but holds a size_t
locally; cast at the return.
- write_reuse_object() passes a size_t into check_pack_inflate(),
whose expect parameter is unsigned long; cast at the call.
- check_object() routes a size_t through SET_SIZE() and
SET_DELTA_SIZE(), both of which take unsigned long via
oe_set_size() / oe_set_delta_size(); cast at the three call
sites in the OBJ_OFS_DELTA / OBJ_REF_DELTA branches and in the
non-delta default arm.
The cast-only treatment is deliberately a stop-gap. Properly
widening oe_set_size, oe_get_size_slow's return type,
check_pack_inflate's expect parameter, object_info.sizep,
patch_delta, and the OE_SIZE_BITS bit-fields cascades into a series
that is too large to be reviewable, so the proper widening is
deferred to a follow-up topic. Until then,
cast_size_t_to_ulong() at least makes the truncation explicit at
the source: it documents the boundary, and on a 64-bit non-Windows
platform it is a no-op.
This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.
Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, zlib's `uLong` type is 32-bit even on 64-bit systems. When
processing data streams larger than 4GB, the `total_in` and `total_out`
fields in zlib's `z_stream` structure wrap around, which caused the
sanity checks in `zlib_post_call()` to trigger `BUG()` assertions.
The git_zstream wrapper now tracks its own 64-bit totals rather than
copying them from zlib. The sanity checks compare only the low bits,
using `maximum_unsigned_value_of_type(uLong)` to mask appropriately for
the platform's `uLong` size.
This is based on work by LordKiRon in git-for-windows#6076.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
index-pack, unpack-objects: use size_t for object size
When unpacking objects from a packfile, the object size is decoded
from a variable-length encoding. On platforms where unsigned long is
32-bit (such as Windows, even in 64-bit builds), the shift operation
overflows when decoding sizes larger than 4GB. The result is a
truncated size value, causing the unpacked object to be corrupted or
rejected.
Fix this by changing the size variable to size_t, which is 64-bit on
64-bit platforms, and ensuring the shift arithmetic occurs in 64-bit
space.
Declare the per-byte continuation variable `c` as size_t as well,
matching the canonical varint decoder unpack_object_header_buffer()
in packfile.c. With c as size_t the expression (c & 0x7f) << shift
is naturally size_t-typed, so the explicit cast that an earlier
iteration carried at the use site is no longer needed.
While at it, add the same overflow guard that
unpack_object_header_buffer() carries: if the cumulative shift would
exceed bitsizeof(size_t) - 7, refuse the input rather than invoking
undefined behavior. Unlike unpack_object_header_buffer(), which
labels this case "bad object header", report it as the platform
limit it actually is: a header may be perfectly well-formed and
still encode a size we cannot represent locally (notably on a
32-bit build consuming a packfile produced on a 64-bit host).
This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.
Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 8 May 2026 05:31:03 +0000 (14:31 +0900)]
t5551: "GIT_TEST_LONG=Yes make test" is broken
The "test_expect_success 'tag following always works over v0 http'"
test in t5551 fails when it tries to run "git init tags", but this
happens only when EXPENSIVE test is allowed to run.
This is because the step tries to create a repository with "git init
tags" but the EXPENSIVE test that runs way before it creates and
leaves around a temporary file "tags". Have the EXPENSIVE test
clean it up after itself.
Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw: remove the vendored compat/nedmalloc/ subtree
The previous two commits stopped opting into nedmalloc on Windows
and stripped out the build-system plumbing that referenced it; the
compat/nedmalloc/ subtree now has no callers and no consumers in
the build, so retire it from the tree.
Note that this patch is larger than can be sent via the mailing
list, and was originally sent in three-pieces and merged back on the
receiving end.
Assisted-by: Opus 4.7 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw: drop the build-system plumbing for nedmalloc
With the previous commit removing every opt-in, the build-system
plumbing for nedmalloc has nothing left to switch on. Remove it so
that the upcoming deletion of the compat/nedmalloc/ tree is a pure
file removal.
Assisted-by: Opus 4.7 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The vendored nedmalloc allocator under compat/nedmalloc/ has been
unmaintained upstream for a very long time: the original repository at
https://github.com/ned14/nedmalloc received its last commit on July 5,
2014, and was archived (made read-only) by its owner on March 15, 2019.
Our copy has been carried forward unchanged ever since.
The Git for Windows commit that introduced mimalloc as a replacement
on Windows ("mingw: use mimalloc", 2019-06-24, present in the Git for
Windows branch thicket but not upstream) already observed at that time
that nedmalloc had ceased to see any updates for several years.
This came to a head when the Git for Windows SDK upgraded to GCC 16:
the `add_segment()` function in `compat/nedmalloc/malloc.c.h` declares
`int nfences = 0` and only references it inside an `assert()`, which
GCC 16 now flags as `-Wunused-but-set-variable`. Combined with the
`-Werror` enabled by `DEVELOPER=1`, this turns into a hard build
failure:
compat/nedmalloc/malloc.c.h: In function 'add_segment':
compat/nedmalloc/malloc.c.h:3897:7: error: variable 'nfences' set but not used [-Werror=unused-but-set-variable=]
3897 | int nfences = 0;
| ^~~~~~~
cc1.exe: all warnings being treated as errors
The same source built without complaint under GCC 15.2.0; the
regression was bisected to the SDK package update at
https://github.com/git-for-windows/git-sdk-64/commit/188d93dd455
(`mingw-w64-x86_64-gcc 15.2.0-14 -> 16.1.0-1`), with the failing CI
run captured at
https://github.com/git-for-windows/git-sdk-64/actions/runs/25244795074.
Rather than patch the unmaintained vendored sources to silence the
warning, stop opting into nedmalloc altogether on Windows. The
platform allocator is what every non-MINGW build already uses, and a
fresh build of git.git's master against a minimal Git for Windows SDK
upgraded to GCC 16 completes successfully.
The compat/nedmalloc/ subtree itself is removed by subsequent commits
in this series.
Assisted-by: Opus 4.7 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The doc `technical/commit-graph.adoc` says that replace objects and
commit grafts turn off commit-graph:
Commit grafts and replace objects can change the shape of the commit
history. The latter can also be enabled/disabled on the fly using
`--no-replace-objects`. This leads to difficulty storing both possible
interpretations of a commit id, especially when computing generation
numbers. The commit-graph will not be read or written when
replace-objects or grafts are present.
But this isn’t mentioned in the user-facing doc. Let’s mention it on
git-replace(1) and git-commit-graph(1).
Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>