Junio C Hamano [Mon, 5 May 2025 21:56:24 +0000 (14:56 -0700)]
Merge branch 'js/windows-arm64'
Update to arm64 Windows port.
* js/windows-arm64:
max_tree_depth: lower it for clangarm64 on Windows
mingw(arm64): do move the `/etc/git*` location
msvc: do handle builds on Windows/ARM64
mingw: do not use nedmalloc on Windows/ARM64
config.mak.uname: add support for clangarm64
bswap.h: add support for built-in bswap functions
Junio C Hamano [Tue, 29 Apr 2025 21:21:30 +0000 (14:21 -0700)]
Merge branch 'ps/fewer-perl'
Reduce requirement for Perl in our documentation build and a few
scripts.
* ps/fewer-perl:
Documentation: stop depending on Perl to generate command list
Documentation: stop depending on Perl to massage user manual
request-pull: stop depending on Perl
filter-branch: stop depending on Perl
Junio C Hamano [Tue, 29 Apr 2025 21:21:29 +0000 (14:21 -0700)]
Merge branch 'ps/reftable-api-revamp'
Overhaul of the reftable API.
* ps/reftable-api-revamp:
reftable/table: move printing logic into test helper
reftable/constants: make block types part of the public interface
reftable/table: introduce iterator for table blocks
reftable/table: add `reftable_table` to the public interface
reftable/block: expose a generic iterator over reftable records
reftable/block: make block iterators reseekable
reftable/block: store block pointer in the block iterator
reftable/block: create public interface for reading blocks
git-zlib: use `struct z_stream_s` instead of typedef
reftable/block: rename `block_reader` to `reftable_block`
reftable/block: rename `block` to `block_data`
reftable/table: move reading block into block reader
reftable/block: simplify how we track restart points
reftable/blocksource: consolidate code into a single file
reftable/reader: rename data structure to "table"
reftable: fix formatting of the license header
Since repo_config() can be called with repo set to NULL these days,
a command that is marked as RUN_SETUP in the builtin command table
does not have to check repo against NULL before making the call.
* ua/call-repo-config-with-possibly-null-repository:
builtin/difftool: remove unnecessary if statement
builtin/add: remove unnecessary if statement
Wire up a couple of benchmarking options that we end up writing into our
"GIT-BUILD-OPTIONS" file. These options allow users to control how
exactly benchmarks are executed.
Note that neither `GIT_PERF_MAKE_COMMAND` nor `GIT_PERF_MAKE_OPTS` are
exposed as a build option. Those options are used by "t/perf/run", which
is not used by Meson.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wire up benchmarks in Meson. The setup is mostly the same as how we wire
up our tests. The only difference is that benchmarks get wired up via
the `benchmark()` option instead of via `test()`, which gives them a bit
of special treatment:
- Benchmarks never run in parallel.
- Benchmarks aren't run by default when tests are executed.
- Meson does not inject the `MALLOC_PERTURB` environment variable.
Using benchmarks is quite simple:
```
$ meson setup build
# Run all benchmarks.
$ meson test -C build --benchmark
# Run a specific benchmark.
$ meson test -C build --benchmark p0000-*
```
Other than that the usual command line arguments accepted when running
tests are also accepted when running benchmarks.
Note that the benchmarking target is somewhat limited because it will
only run benchmarks for the current build. Other use cases, like running
benchmarks against multiple different versions of Git, are not currently
supported. Users should continue to use "t/perf/run" for those use
cases. The script should get extended at one point in time to support
Meson, but this is outside of the scope of this series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "perf-lib.sh" script is sourced by all of our benchmarking suites to
make available common infrastructure. The script assumes that build and
source directory are the same, which works for our Makefile. But the
assumption breaks with both CMake and Meson, where the build directory
can be located in an arbitrary place.
Adapt the script so that it works with out-of-tree builds. Most
importantly, this requires us to figure out the location of the build
directory:
- When running benchmarks via our Makefile the build directory is the
same as the source directory. We already know to derive the test
directory ("t/") via `$(pwd)/..`, which works because we chdir into
"t/perf" before executing benchmarks. We can thus derive the build
directory by appending another "/.." to that path.
- When running benchmarks via Meson the build directory is located at
an arbitrary location. The build system thus has to make the path
known by exporting the `GIT_BUILD_DIR` environment variable.
This change prepares us for wiring up benchmarks in Meson.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our benchmarks use a couple of Perl scripts to compute results. These
Perl scripts get executed directly, and as the shebang is hardcoded to
"/usr/bin/perl" this will fail on any system where the Perl interpreter
is located in a different path.
Our build infrastructure already lets users configure the location of
Perl, which ultimately gets written into the GIT-BUILD-OPTIONS file.
This file is being sourced by "test-lib.sh", and consequently we already
have the "PERL_PATH" variable available that contains its configured
location.
Use "PERL_PATH" to execute Perl scripts, which makes them work on more
esoteric systems like NixOS. Furthermore, adapt the shebang to use
env(1) to execute Perl so that users who have Perl in PATH, but in a
non-standard location, can execute the script directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/perf: fix benchmarks with alternate repo formats
Many of our benchmarks operate on a user-defined repository that we copy
over before running the benchmarked logic. To keep unintentional side
effects caused by on-disk state at bay we skip copying some files. This
includes for example hooks, but also the repo's configuration.
It is quite sensible to not copy over the configuration, as it is quite
easy to inadvertently carry over configuration that may significantly
impact the performance measurements. But we cannot fully ignore the
configuration either, as it may contain information about the repository
format. This will cause failures when for example using a repository
with SHA256 object format or the reftable ref format.
Fix the issue by parsing the reference and object formats from the
source repository and passing them to git-init(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 25 Apr 2025 00:25:34 +0000 (17:25 -0700)]
Merge branch 'rj/build-tweaks'
Various build tweaks, including CSPRNG selection on some platforms.
* rj/build-tweaks:
config.mak.uname: set CSPRNG_METHOD to getrandom on Linux
config.mak.uname: add arc4random to the cygwin build
config.mak.uname: add sysinfo() configuration for cygwin
builtin/gc.c: correct RAM calculation when using sysinfo
config.mak.uname: add clock_gettime() to the cygwin build
config.mak.uname: add HAVE_GETDELIM to the cygwin section
config.mak.uname: only set NO_REGEX on cygwin for v1.7
config.mak.uname: add a note about NO_STRLCPY for Linux
Makefile: remove NEEDS_LIBRT build variable
meson.build: set default help format to html on windows
meson.build: only set build variables for non-default values
Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set
meson.build: remove -DCURL_DISABLE_TYPECHECK
Junio C Hamano [Fri, 25 Apr 2025 00:25:33 +0000 (17:25 -0700)]
Merge branch 'ps/parse-options-integers'
Update the parse-options API to catch the mistake of passing the
address of an integral variable of the wrong type or size.
* ps/parse-options-integers:
parse-options: detect mismatches in integer signedness
parse-options: introduce precision handling for `OPTION_UNSIGNED`
parse-options: introduce precision handling for `OPTION_INTEGER`
parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`
parse-options: support unit factors in `OPT_INTEGER()`
global: use designated initializers for options
parse: fix off-by-one for minimum signed values
Junio C Hamano [Fri, 25 Apr 2025 00:25:33 +0000 (17:25 -0700)]
Merge branch 'ps/object-file-cleanup'
Code clean-up.
* ps/object-file-cleanup:
object-store: merge "object-store-ll.h" and "object-store.h"
object-store: remove global array of cached objects
object: split out functions relating to object store subsystem
object-file: drop `index_blob_stream()`
object-file: split up concerns of `HASH_*` flags
object-file: split out functions relating to object store subsystem
object-file: move `xmmap()` into "wrapper.c"
object-file: move `git_open_cloexec()` to "compat/open.c"
object-file: move `safe_create_leading_directories()` into "path.c"
object-file: move `mkdir_in_gitdir()` into "path.c"
Junio C Hamano [Fri, 25 Apr 2025 00:14:14 +0000 (17:14 -0700)]
CI updates
Ever since we issued 2.49, external forces broke our CI jobs in
various ways, and we had to adjust our code to work around them.
Backmerge them from the 'master' front to make it easier to test
real changes to the maintenance track.
Junio C Hamano [Thu, 24 Apr 2025 23:10:47 +0000 (16:10 -0700)]
ci: skip unavailable external software
The ci/install-dependencies.sh script used in a very early phase of
our CI jobs downloads Perforce, Git-LFS, and JGit, used for running
the test scripts. The test framework is prepared to properly skip
the tests that depend on these external software, but the CI script
is unnecessarily strict (due to its use of "set -e" in ci/lib.sh)
and fails the entire CI run before even starting to test the rest of
the system.
Notice a failure to download any of these external software packages,
but keep going. We need to be careful about cleaning up after a failed
wget, as a later part of the script that does:
    if type jgit >/dev/null 2>&1
    then
        echo "$(tput setaf 6)JGit Version$(tput sgr0)"
        jgit version
    else
        echo >&2 "WARNING: JGit wasn't installed, see above for clues why"
    fi
will (surprise!) succeed running "type jgit", and then fail with
"jgit version", taking the whole thing down due to "set -e".
Junio C Hamano [Wed, 23 Apr 2025 20:58:50 +0000 (13:58 -0700)]
Merge branch 'ja/doc-reset-mv-rm-markup-updates'
Doc mark-up updates.
* ja/doc-reset-mv-rm-markup-updates:
doc: add markup for characters in Guidelines
doc: fix asciidoctor synopsis processing of triple-dots
doc: convert git-mv to new documentation format
doc: move synopsis git-mv commands in the synopsis section
doc: convert git-rm to new documentation format
doc: fix synopsis analysis logic
doc: convert git-reset to new documentation format
Junio C Hamano [Wed, 23 Apr 2025 20:58:50 +0000 (13:58 -0700)]
Merge branch 'pb/perf-test-fixes'
"make perf" fixes.
* pb/perf-test-fixes:
p7821: fix instructions for testing with threads
p9210: fix 'scalar clone' when running from a detached HEAD
p7821: fix test_perf invocation for prereqs
When using the launchctl scheduler, the weekly job runs daily, and the
daily job runs on the first six days of each month. This appears to be
due to specifying "Day" in the calendar intervals, which according to
launchd.plist(5) is for specifying days of the month rather than days of
the week. The behaviour of running a job on the 0th day is undocumented,
but in my testing appears to be the same as not specifying "Day" in the
calendar interval, in which case the job will run daily.
Use "Weekday" in the calendar intervals, which is the correct way to
schedule jobs to run on specific days of the week.
Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
max_tree_depth: lower it for clangarm64 on Windows
Just as in b64d78ad02ca (max_tree_depth: lower it for MSVC to avoid
stack overflows, 2023-11-01), I encountered the same problem with the
clang builds on Windows/ARM64.
The symptom is an exit code 127 when t6700 tries to verify that `git
archive big` fails.
This exit code is reserved on Unix/Linux to mean "command not found".
Unfortunately in this case, it is the fall-back chosen by
Cygwin's `pinfo::status_exit()` method when encountering
the NTSTATUS `STATUS_STACK_OVERFLOW`, see
https://github.com/cygwin/cygwin/blob/cygwin-3.6.1/winsup/cygwin/pinfo.cc#L171
I verified manually that the stack overflow always happens somewhere
around tree depth 1403, therefore 1280 should be a safe bound in these
instances.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fb5e3378f8 (mingw: move Git for Windows' system config where users
expect it, 2021-06-22), I moved the location of Git for Windows' system
config and system Git attributes file to the top-level `/etc/` directory
(because it is a much more obvious location than, say, `/mingw64/etc/`).
The patch relied on a very specific scenario that the newly-supported
Windows/ARM64 builds of `git.exe` fail to fall into. So let's broaden
the condition a bit, so that Windows/ARM64 builds also use that location
(instead of the even more obscure `/clangarm64/etc/` directory).
This fixes https://github.com/git-for-windows/git/issues/5431.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows/ARM64 settled on using `clang` to compile `git.exe`, and
hence needs to run in a system where `MSYSTEM` is set to `CLANGARM64`
and the prefix to use is `/clangarm64`.
We already did that in the `MINGW` arm, i.e. for regular Git for Windows
builds using MINGW GCC (or `clang`'s shim pretending to be GCC), now it
is time to do the same in the MS Visual C part.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: adjust config.mak.uname for c18400c6] Signed-off-by: Junio C Hamano <gitster@pobox.com>
It does not compile there, and seeing as nedmalloc has been pretty much
unmaintained since at least November 2017, as per
https://github.com/ned14/nedmalloc/issues/20#issuecomment-343432314,
there is also no hope that any fixes will materialize there.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: adjust config.mak.uname for c18400c6] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dennis Ameling [Wed, 23 Apr 2025 08:01:44 +0000 (08:01 +0000)]
config.mak.uname: add support for clangarm64
CLANGARM64 is a relatively new MSYSTEM added by the MSYS2 team. In order
to have Git build correctly for this platform, let's add some
configuration for it to config.mak.uname.
Signed-off-by: Dennis Ameling <dennis@dennisameling.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dennis Ameling [Wed, 23 Apr 2025 08:01:43 +0000 (08:01 +0000)]
bswap.h: add support for built-in bswap functions
Newer compiler versions, like GCC 10 and Clang 12, have built-in
functions for bswap32 and bswap64. This comes in handy, for example,
when targeting CLANGARM64 on Windows, which would not be supported
without this logic.
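As a rough sketch of the idea (the macro and function names below are made up for illustration and need not match what "compat/bswap.h" does), the built-ins can be detected with `__has_builtin` and backed by a portable fallback:

```c
#include <stdint.h>

/* Hypothetical feature macro; only set when the compiler reports both built-ins. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
#  define SKETCH_HAVE_BUILTIN_BSWAP 1
# endif
#endif

/* Reverse the byte order of a 32-bit value. */
uint32_t sketch_bswap32(uint32_t x)
{
#ifdef SKETCH_HAVE_BUILTIN_BSWAP
	return __builtin_bswap32(x);
#else
	/* Portable fallback: move each byte to its mirrored position. */
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) <<  8) |
	       ((x & 0x00ff0000u) >>  8) |
	       ((x & 0xff000000u) >> 24);
#endif
}

/* Reverse the byte order of a 64-bit value. */
uint64_t sketch_bswap64(uint64_t x)
{
#ifdef SKETCH_HAVE_BUILTIN_BSWAP
	return __builtin_bswap64(x);
#else
	/* Swap each half, then exchange the halves. */
	return ((uint64_t)sketch_bswap32((uint32_t)x) << 32) |
	       sketch_bswap32((uint32_t)(x >> 32));
#endif
}
```

Either branch gives the same result, so callers do not need to know which one was compiled in.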
Signed-off-by: Dennis Ameling <dennis@dennisameling.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the log_reencode field from struct rev_info, as it is not used.
This field was introduced in 52883fb, but it hasn't been used since its
introduction.
Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 22 Apr 2025 11:16:32 +0000 (07:16 -0400)]
p5332: drop "+" from --stdin-packs input
This perf script creates a midx by running "git multi-pack-index write"
with the "--stdin-packs" option. We feed that stdin by running "find" on
.git/objects/pack, using sed to strip off everything but the basename.
But that sed invocation also does something peculiar: it adds a "+" to
the start of each pack name. This causes the multi-pack-index command to
barf. The modified name does not match any pack it knows about, so it
ends up with an empty list of packs to put in the midx. And thus nothing
matches the --preferred-pack option we pass, which causes it to die().
The fix is to remove the extra "+" (which also lets us simplify the sed
invocation a bit, as it is now just stripping the leading directories).
But that leaves the mystery of why it was ever there in the first place.
The answer is that an earlier iteration of the patch series had a
concept of "disjoint" packs in the midx. And one of its patches here:
taught read_packs_from_stdin() to treat a leading "+" as marking a
disjoint pack. But in the second version of the series, which was
ultimately merged, that disjoint concept went away, and the code to
parse "+" did likewise. The regular regression tests were adjusted to
match, but this case in t/perf was forgotten.
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shell completion scripts in "contrib/completion" are being tested,
but none of our build systems support installing them. This is somewhat
confusing for Meson, where users can explicitly enable building these
scripts via `-Dcontrib=completion`. This option only controls whether
the completions are built and tested against, where "building" is a bit
of a euphemism for "copying them into the build directory".
Teach both our Makefile and Meson to install our Bash completion script.
For now, this is the only completion script that we're installing given
that Bash completions "just work" with a canonical well-known location
nowadays. Other completion scripts, like for example the one for zsh,
don't have a well-known location and/or require extra steps by the user
to make them available. As such, we skip installing these scripts for
now, but we may do so in the future if we ever figure out a proper way
to do this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci: fix p4d executable not being found on GitHub Actions
Our tests for git-p4(1) depend on the p4d(1) and p4(1) executables to
exist. As we require specific versions of those binaries which typically
aren't available on common distributions, we install them manually via
"ci/install-dependencies.sh".
This script will put the binaries into "$CUSTOM_PATH", which gets
defined by "ci/lib.sh" -- if not explicitly overridden, its value will
be set to "$HOME/path". This causes issues though when running our tests
as unprivileged user, as we do both in GitLab CI and GitHub Actions,
because "$HOME" will be different when installing dependencies and when
running the tests. Consequently, the downloaded binaries will not be
found unless "$CUSTOM_PATH" is overridden to a common location.
We already do this for GitLab CI, where it points to "/custom". Let's do
the same for GitHub Actions so that Perforce-based tests are executed
again.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL,
2025-03-08) already taught `repo_config()` to allow `repo` to be NULL,
there is no need to check whether `repo` is NULL before calling
`repo_config()`.
Suggested-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL,
2025-03-08) already taught `repo_config()` to allow `repo` to be NULL,
there is no need to check whether `repo` is NULL before calling
`repo_config()`.
Suggested-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
perf: do allow `GIT_PERF_*` to be overridden again
A common way to run Git's performance benchmarks on repositories other
than Git's own repository (which is not exactly large when compared to
actually large repositories) is to run them like this:
Contrary to developers' common expectations, this failed to work when
Git was built with a different `GIT_PERF_LARGE_REPO` value specified at
build time: That build-time option would have been written to the
`GIT-BUILD-OPTIONS` file, which in turn would have been sourced by
`test-lib.sh`, which in turn would have been sourced by `perf-lib.sh`,
which in turn would have been sourced by the perf test script,
_overriding_ the environment variable specified in the way illustrated
above.
Since perf tests are not run as part of the build, this most likely
unintended behavior was not caught and certainly not fixed, as the
`GIT_PERF_*` values would have been empty at build-time.
However, in 4638e8806e3a (Makefile: use common template for
GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
introduced: Before that commit, a couple of build-time options (the
`GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
when their values were non-empty. With this commit, they are also
written when they are empty.
The consequence is that the above-mentioned way to run the perf tests
not only fails to pick up the desired `GIT_PERF_*` settings when they
were specified differently while building Git; the desired settings are
now respected only when specified _while building_ Git.
Let's work around the original issue, i.e. let `GIT_PERF_*` environment
variables override what is recorded in `GIT-BUILD-OPTIONS`.
Note that this is just the tip of the iceberg, there are a couple of
`GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
to time constraints on my side, this here patch focuses exclusively on
the `GIT_PERF_*` settings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:34 +0000 (00:18 +0100)]
config.mak.uname: set CSPRNG_METHOD to getrandom on Linux
Commit 05cd988dce ("wrapper: add a helper to generate numbers from a
CSPRNG", 2022-01-17) added a csprng_bytes() function which used one
of several interfaces to provide a source of cryptographically secure
pseudorandom numbers. The CSPRNG_METHOD make variable was provided to
determine the choice of available 'backends' for the source of random
bytes.
Commit 05cd988dce did not set CSPRNG_METHOD in the Linux section of
the config.mak.uname file, so it defaults to using '/dev/urandom' as
the source of random bytes. The 'backend' values which could be used
on Linux are 'arc4random', 'getrandom' or 'getentropy' ('openssl' is
an option, but seems to be discouraged).
The arc4random routines (arc4random_buf() is the one actually used) were
added to glibc in version 2.36, while both getrandom() and getentropy()
were included in 2.25. So, some of the more up-to-date distributions of
Linux (eg Debian 12, Ubuntu 24.04) would be able to use the 'arc4random'
setting. All currently supported distributions have glibc 2.25 or later
(RHEL 8 has v2.28) and, therefore, have support for the 'getrandom' and
'getentropy' settings.
The arc4random routines on the *BSDs (along with cygwin) implement the
ChaCha20 stream cipher algorithm (see RFC8439) in userspace, rather than
as a system call, and are thus somewhat faster (having avoided a context
switch to the kernel). In contrast, on Linux all three functions are
simple wrappers around the same kernel CSPRNG syscall.
If the meson build system is used on a newer platform, then the build
will be configured to use 'arc4random', whereas the make build will
currently default to using '/dev/urandom' on Linux. Since there is no advantage,
in terms of performance, to the 'arc4random' setting, the 'getrandom'
setting should be preferred from an availability perspective. (Also, the
current uses of csprng_bytes() are not in any hot path).
In order to set an appropriate default, set the CSPRNG_METHOD build
variable to 'getrandom' in the Linux section of the 'config.mak.uname'
file.
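For illustration only, a 'getrandom' backend could look roughly like the sketch below. The function name `sketch_fill_random` is hypothetical, the code assumes Linux with glibc 2.25 or later, and Git's own csprng_bytes() may well differ in its details:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill `buf` with `len` random bytes using
 * getrandom(2). Returns 0 on success and -1 on error.
 */
int sketch_fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len) {
		/* getrandom() may return fewer bytes than requested. */
		ssize_t got = getrandom(p, len, 0);

		if (got < 0) {
			if (errno == EINTR)
				continue; /* interrupted by a signal, retry */
			return -1;
		}
		p += got;
		len -= (size_t)got;
	}
	return 0;
}
```

The loop is there because a single call is allowed to return a short count for large requests or when interrupted.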
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 17 Apr 2025 17:28:18 +0000 (10:28 -0700)]
Merge branch 'en/merge-recursive-debug'
Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.
* en/merge-recursive-debug:
builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
merge-recursive.[ch]: thoroughly debug these
merge, sequencer: switch recursive merges over to ort
sequencer: switch non-recursive merges over to ort
merge-ort: enable diff-algorithms other than histogram
builtin/merge-recursive: switch to using merge_ort_generic()
checkout: replace merge_trees() with merge_ort_nonrecursive()
Junio C Hamano [Thu, 17 Apr 2025 17:28:17 +0000 (10:28 -0700)]
Merge branch 'jk/fetch-follow-remote-head-fix'
"git fetch [<remote>]" with only the configured fetch refspec
should be the only thing to update refs/remotes/<remote>/HEAD,
but the code was overly eager to do so in other cases.
* jk/fetch-follow-remote-head-fix:
fetch: make set_head() call easier to read
fetch: don't ask for remote HEAD if followRemoteHEAD is "never"
fetch: only respect followRemoteHEAD with configured refspecs
parse-options: detect mismatches in integer signedness
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a
test that exercises the "--min-batch-size" command line option. The
symptom was that the option didn't seem to have an effect: we didn't
fetch objects with a batch size of 20, but instead fetched all objects
at once.
As it turns out, the root cause is that `--min-batch-size` uses
`OPT_INTEGER()` to parse the command line option. While this macro
expects the caller to pass a pointer to an integer, we instead pass a
pointer to a `size_t`. This coincidentally works on most platforms, but
it breaks apart on the mentioned platforms because they are big endian.
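The effect can be simulated with a small sketch. The function below is hypothetical and uses memcpy() so that the example itself stays well-defined; it assumes that `int` is no wider than `size_t`:

```c
#include <stddef.h>
#include <string.h>

/*
 * Simulate a parser that believes it was handed an `int *` while the
 * pointer really refers to a `size_t`: only the first sizeof(int)
 * bytes of the object get overwritten.
 */
size_t sketch_write_int_into_size_t(size_t initial, int parsed)
{
	size_t storage = initial;

	memcpy(&storage, &parsed, sizeof(parsed));
	return storage;
}
```

On a little-endian machine those first bytes are the least significant ones, so a zero-initialized `size_t` ends up holding the parsed value. On a big-endian machine with a `size_t` wider than `int` they are the most significant ones, so the variable holds a different number altogether.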
This issue isn't specific to git-backfill(1): there are a couple of
other places where we have the same type confusion going on. This
indicates that the issue really is the interface that the parse-options
subsystem provides -- it is simply too easy to get this wrong as there
isn't any kind of compiler warning, and things just work on the most
common systems.
Address the systemic issue by introducing two new build asserts
`BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names
already hint at, those macros will cause a compiler error when passed a
value that is not signed or unsigned, respectively.
Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to
use those asserts. This uncovers a small set of sites where we indeed
have the same bug as in git-backfill(1). Adapt all of them to use the
correct option.
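A minimal sketch of how such build asserts can be written follows. The `SKETCH_*` names are illustrative stand-ins rather than the definitions added by this commit, and `__typeof__` is a GCC/Clang extension:

```c
#include <stddef.h>

/* Casting -1 to the variable's type is negative only for signed types. */
#define SKETCH_IS_SIGNED(var) (((__typeof__(var))-1) < 0)

/*
 * Evaluates to 0 when `cond` holds. Otherwise the array size becomes
 * negative, which the compiler rejects.
 */
#define SKETCH_BUILD_ASSERT_OR_ZERO(cond) \
	(sizeof(char[1 - 2 * !(cond)]) - 1)

#define SKETCH_BARF_UNLESS_SIGNED(var) \
	SKETCH_BUILD_ASSERT_OR_ZERO(SKETCH_IS_SIGNED(var))
#define SKETCH_BARF_UNLESS_UNSIGNED(var) \
	SKETCH_BUILD_ASSERT_OR_ZERO(!SKETCH_IS_SIGNED(var))
```

Because both macros evaluate to zero on success, they can be added to an initializer expression without changing its value, while a mismatched variable stops the build.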
Reported-by: Todd Zullinger <tmz@pobox.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options: introduce precision handling for `OPTION_UNSIGNED`
This commit is the equivalent to the preceding commit, but instead of
introducing precision handling for `OPTION_INTEGER` we introduce it for
`OPTION_UNSIGNED`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options: introduce precision handling for `OPTION_INTEGER`
The `OPTION_INTEGER` option type accepts a signed integer. The type of
the underlying integer is a simple `int`, which restricts the range of
values accepted by such options. But there is a catch: the caller
provides a pointer to the value via the `.value` field, which is a
simple void pointer. This has two consequences:
- There is no check whether the passed value is sufficiently long to
store the entire range of `int`. This can lead to integer wraparound
in the best case and out-of-bounds writes in the worst case.
- Even when a caller knows that they want to store a value larger than
`INT_MAX` they don't have a way to do so.
In practice this doesn't tend to be a huge issue because users typically
don't end up passing huge values to most commands. But the parsing logic
is demonstrably broken, and it is too easy to get the calling convention
wrong.
Improve the situation by introducing a new `precision` field into the
structure. This field gets assigned automatically by `OPT_INTEGER_F()`
and tracks the size of the passed value. Like this it becomes possible
for the caller to pass arbitrarily-sized integers and the underlying
logic knows to handle it correctly by doing range checks. Furthermore,
convert the code to use `strtoimax()` instead of `strtol()` so that we
can also parse values larger than `LONG_MAX`.
Note that we do not yet assert signedness of the passed variable, which
is another source of bugs. This will be handled in a subsequent commit.
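The range check that a precision enables can be sketched as follows. `sketch_store_signed` is a hypothetical helper written for this illustration and not the code added by this commit:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store `value` into the signed integer at `dst`, which is `precision`
 * bytes wide. Returns 0 on success, -1 when the value does not fit or
 * the width is not supported.
 */
int sketch_store_signed(void *dst, size_t precision, intmax_t value)
{
	switch (precision) {
	case sizeof(int8_t):
		if (value < INT8_MIN || value > INT8_MAX)
			return -1;
		*(int8_t *)dst = (int8_t)value;
		return 0;
	case sizeof(int16_t):
		if (value < INT16_MIN || value > INT16_MAX)
			return -1;
		*(int16_t *)dst = (int16_t)value;
		return 0;
	case sizeof(int32_t):
		if (value < INT32_MIN || value > INT32_MAX)
			return -1;
		*(int32_t *)dst = (int32_t)value;
		return 0;
	case sizeof(int64_t):
		if (value < INT64_MIN || value > INT64_MAX)
			return -1;
		*(int64_t *)dst = (int64_t)value;
		return 0;
	default:
		return -1; /* unsupported width */
	}
}
```

A caller would pass `sizeof(variable)` as the precision, which is the kind of information a macro can capture automatically at the callsite.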
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`
With the preceding commit, `OPT_INTEGER()` has learned to support unit
factors. Consequently, the major difference between `OPT_INTEGER()` and
`OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of
them do support them now. Instead, the difference is that one handles
signed and the other handles unsigned integers.
Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to
`OPT_UNSIGNED()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options: support unit factors in `OPT_INTEGER()`
There are two main differences between `OPT_INTEGER()` and
`OPT_MAGNITUDE()`:
- The former parses signed integers whereas the latter parses unsigned
integers.
- The latter parses unit factors like 'k', 'm' or 'g'.
While the first difference makes obvious sense, there isn't really a
good reason why signed integers shouldn't support unit factors, too.
This inconsistency will also become a bit of a problem with subsequent
commits, where we will fix a couple of callsites that pass an unsigned
integer to `OPT_INTEGER()`. There are three options:
- We could adapt those users to instead pass a signed integer, but
this would needlessly extend the range of accepted integer values.
- We could convert them to use `OPT_MAGNITUDE()`, as it only accepts
unsigned integers. But now we have the inconsistency that we also
start to accept unit factors.
- We could introduce `OPT_UNSIGNED()` as equivalent to `OPT_INTEGER()`
so that it knows to only accept unsigned integers without unit
suffix.
Introducing a whole new option type feels a bit excessive. There also
isn't really a good reason why `OPT_INTEGER()` cannot be extended to
also accept unit factors: no value that is valid for such options today
can carry a unit factor, so there wouldn't be any ambiguity.
Refactor `OPT_INTEGER()` to use `git_parse_int()`, which knows to
interpret unit factors. This removes the inconsistency between the
signed and unsigned options so that we can easily fix up callsites that
pass the wrong integer type right now.
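To illustrate what parsing a signed integer with a unit factor involves, here is a self-contained sketch. `sketch_parse_with_unit` is a hypothetical function and not `git_parse_int()` itself; it assumes the factors 'k', 'm' and 'g' are case-insensitive powers of 1024:

```c
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>

/* Parse e.g. "10k" or "-2m" into `*out`. Returns 0 on success, -1 on error. */
int sketch_parse_with_unit(const char *s, intmax_t *out)
{
	char *end;
	intmax_t value, factor = 1;

	errno = 0;
	value = strtoimax(s, &end, 10);
	if (end == s || errno == ERANGE)
		return -1; /* no digits, or out of range */

	switch (tolower((unsigned char)*end)) {
	case 'k': factor = 1024; end++; break;
	case 'm': factor = 1024 * 1024; end++; break;
	case 'g': factor = 1024 * 1024 * 1024; end++; break;
	}
	if (*end)
		return -1; /* trailing garbage after the number or unit */

	/* Reject values whose scaled result would overflow. */
	if (value > INTMAX_MAX / factor || value < INTMAX_MIN / factor)
		return -1;

	*out = value * factor;
	return 0;
}
```

With this sketch, "10k" yields 10240 and "-2m" yields -2097152, while a plain "42" is accepted unchanged.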
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we expose macros for most of our different option types understood
by the "parse-options" subsystem, not every combination of fields has
one, as that would otherwise quickly lead to an explosion of macros.
Instead, we just initialize structures manually for those variants of
fields that don't have a macro.
Callsites that open-code these structure initialization don't use
designated initializers though and instead just provide values for each
of the fields that they want to initialize. This has three significant
downsides:
- Callsites need to specify all values up to the last field that they
care about. This often includes fields that should simply be left at
their default zero-initialized state, which adds distraction.
- Any reader not deeply familiar with the layout of the structure
has a hard time figuring out what the respective initializers mean.
- Reordering or introducing new fields in the middle of the structure
is impossible without adapting all callsites.
Convert all sites to instead use designated initializers, which we
started using in our codebase quite a while ago. This allows us to skip
any default-initialized fields, gives the reader context by specifying
the field names and allows us to reorder or introduce new fields where
we want to.
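
The difference can be sketched as follows. `struct demo_option` and the
`DEMO_*` constants are hypothetical stand-ins invented for this example;
they only loosely resemble the real option structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure for illustration; not git's `struct option`. */
struct demo_option {
	int type;
	int short_name;
	const char *long_name;
	void *value;
	const char *argh;
	const char *help;
	int flags;
};

enum { DEMO_OPTION_INTEGER = 1, DEMO_OPT_NONEG = 2 };

static int demo_depth;

/* Positional: every field up to `flags` must be spelled out, in order. */
static const struct demo_option positional = {
	DEMO_OPTION_INTEGER, 0, "depth", &demo_depth, NULL,
	"maximum depth", DEMO_OPT_NONEG
};

/* Designated: unnamed fields are zero-initialized and order is free. */
static const struct demo_option designated = {
	.type = DEMO_OPTION_INTEGER,
	.long_name = "depth",
	.value = &demo_depth,
	.help = "maximum depth",
	.flags = DEMO_OPT_NONEG,
};

/* Compare two options field by field; returns 1 when they are equal. */
static int demo_same_option(const struct demo_option *a,
			    const struct demo_option *b)
{
	assert(a && b);
	return a->type == b->type &&
	       a->short_name == b->short_name &&
	       !strcmp(a->long_name, b->long_name) &&
	       a->value == b->value &&
	       a->argh == b->argh &&
	       !strcmp(a->help, b->help) &&
	       a->flags == b->flags;
}
```

Both initializers describe the same option, but only the designated form
keeps working unchanged if a field were inserted in the middle of the
structure.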
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We accept a maximum value in `git_parse_signed()` that restricts the
range of accepted integers. As the intent is to pass `INT*_MAX` values
here, this maximum doesn't only act as the upper bound, but also as the
implicit lower bound of the accepted range.
This lower bound is calculated by negating the maximum. But given that
the maximum value of a signed integer with N bits is `2^(N-1)-1` whereas
the minimum value is `-2^(N-1)` we have an off-by-one error in the lower
bound.
Fix this off-by-one error by using `-max - 1` as lower bound instead.
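
A minimal sketch of the two bounds, using hypothetical helper names that
are not part of `git_parse_signed()`:

```c
#include <assert.h>
#include <stdint.h>

/* Old check: the lower bound is the negated maximum, i.e. -(2^(N-1)-1). */
static int demo_in_range_old(intmax_t val, intmax_t max)
{
	assert(max >= 0);
	return -max <= val && val <= max;
}

/* Fixed check: the lower bound is `-max - 1`, i.e. -2^(N-1). */
static int demo_in_range_fixed(intmax_t val, intmax_t max)
{
	assert(max >= 0);
	return -max - 1 <= val && val <= max;
}
```

For an 8-bit range with a maximum of `INT8_MAX` (127), the old check
rejects -128 even though it is a valid `int8_t` value, whereas the fixed
check accepts it and still rejects -129.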
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:33 +0000 (00:18 +0100)]
config.mak.uname: add arc4random to the cygwin build
The arc4random_buf() function has been available in cygwin since
about 2016 (somewhere in the v2.x branch). Set the CSPRNG_METHOD
build variable to 'arc4random', in the cygwin section, to enable
the use of this cryptographically-secure pseudorandom number
function. Note that the autoconf and new meson builds also enable
this function.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:32 +0000 (00:18 +0100)]
config.mak.uname: add sysinfo() configuration for cygwin
Although sysinfo() is a 'Linux only' function, cygwin provides an
implementation which appears to be functional. The assumption that
this function is Linux only is reflected in the way the HAVE_SYSINFO
build variable is handled by the Makefile and config.mak.uname.
Rework the setting of HAVE_SYSINFO in the Linux section of the system
specific config file, along with the corresponding setting of the
BASIC_CFLAGS in the Makefile. Add the setting of HAVE_SYSINFO to the
cygwin section of 'config.mak.uname'. While here, add a test for the
sysinfo() function to the autoconf build system.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:31 +0000 (00:18 +0100)]
builtin/gc.c: correct RAM calculation when using sysinfo
The man page for sysinfo(2) on Linux states that (from v2.3.48) the
sizes of the memory and swap fields, of the returned structure, are
given as multiples of 'mem_unit' bytes. In earlier versions (prior to
v2.3.23 on i386 in particular), the 'mem_unit' field was not part of
the structure, and all sizes were measured in bytes. The man page does
not discuss the motivation for this change, but it is possible that the
change was intended for the, relatively rare, 32-bit platform with more
than 4GB of memory.
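
To make the unit handling concrete, here is a small sketch of the
arithmetic. `demo_total_ram_bytes()` is a hypothetical helper for this
example; on Linux its two arguments would come from the 'totalram' and
'mem_unit' fields of the 'struct sysinfo' filled in by sysinfo(2).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a memory size given as a multiple of
 * 'mem_unit' bytes into a plain byte count.
 */
static uint64_t demo_total_ram_bytes(uint64_t totalram, uint32_t mem_unit)
{
	assert(mem_unit > 0);
	return totalram * mem_unit;
}
```

With a 'mem_unit' of one the result equals 'totalram', which is why
ignoring the field goes unnoticed on such systems. With a 'mem_unit' of
4096, a 'totalram' of 2097152 corresponds to 8GB rather than 2MB.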
The total_ram() function makes the assumption that the 'totalram' field
of the 'struct sysinfo' is measured in bytes, or alternatively that the
'mem_unit' field is always equal to one. Having written a program to call
the sysinfo() function and print the structure fields, it seems that, on
Linux x86_64 and i686 anyway, the 'mem_unit' field is indeed set to one
(note that the 32-bit system had only 2GB ram). However, cygwin also has
a sysinfo() implementation, which gives the following values:
Ramsay Jones [Wed, 16 Apr 2025 23:18:30 +0000 (00:18 +0100)]
config.mak.uname: add clock_gettime() to the cygwin build
Cygwin supports the clock_gettime() function, along with the associated
CLOCK_MONOTONIC preprocessor symbol. The autoconf and meson builds both
enable the use of those symbols. In order to have the same configuration
for the make builds, add the HAVE_CLOCK_GETTIME and HAVE_CLOCK_MONOTONIC
build variables to the cygwin section of the config.mak.uname file.
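
For reference, a minimal sketch of how the function and symbol are
typically used together. `demo_monotonic_ns()` is a hypothetical helper
for this example and is not taken from the git sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock in nanoseconds. */
static uint64_t demo_monotonic_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Two consecutive readings never go backwards, which is what makes the
monotonic clock suitable for measuring elapsed time.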
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:29 +0000 (00:18 +0100)]
config.mak.uname: add HAVE_GETDELIM to the cygwin section
Cygwin has provided the getdelim() function as far back as (at least)
2011. The autoconf and meson builds enable the use of this symbol.
In order to have the same configuration for autoconf, meson and make,
enable the HAVE_GETDELIM build variable in the cygwin section of the
config.mak.uname file.
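
As a reminder of what the function offers, here is a small sketch that
reads delimiter-terminated records with getdelim().
`demo_count_records()` is a hypothetical helper written for this
example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: count the records in a stream, where each record
 * ends at 'delim' or at end-of-file.
 */
static size_t demo_count_records(FILE *fp, int delim)
{
	char *buf = NULL;
	size_t alloc = 0, count = 0;

	assert(fp);
	/* getdelim() grows 'buf' as needed and returns -1 at end-of-file. */
	while (getdelim(&buf, &alloc, delim, fp) != -1)
		count++;
	free(buf);
	return count;
}
```

Because the buffer is managed by getdelim() itself, the caller does not
need to guess a maximum record length up front.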
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:28 +0000 (00:18 +0100)]
config.mak.uname: only set NO_REGEX on cygwin for v1.7
Commit 92f63d2b05 ("Cygwin 1.7 needs compat/regex", 2013-07-19) set
the NO_REGEX build variable because the platform regex library failed
some of the tests (t4018 and t4034), which passed just fine with the
compat library.
After some time (maybe a year or two), the platform library had been
updated (with an import from FreeBSD, I believe) and now passed the full
test-suite. This would be about the time of the v1.7 -> v2.0 transition
in 2015. I had a patch ready to send, but just didn't get around to
submitting it to the list. At some point in the interim, the official
cygwin git package used the autoconf build system, which sets the
NO_REGEX variable to use the platform regex library functions. The new
meson build system does likewise.
The cygwin platform regex library, in addition to now passing the tests
which formerly failed, now passes a 'test_expect_failure' test in the
t7815-grep-binary test file. In particular, test #12 'git grep .fi a'
which determines that the regex pattern '.' matches a NUL character.
The commit f96e56733a ("grep: use REG_STARTEND for all matching if
available", 2010-05-22) added the test in question, but it does not
give any indication as to why the test was framed as an expected fail,
rather than a 'positive' test that the 'git grep' command fails to
match a NUL. Note that the previous test #11 was also originally
marked in that commit as a 'test_expect_failure', but was flipped to
a 'success' test in commit 7e36de5859 ("t/t7008-grep-binary.sh: un-TODO
a test that needs REG_STARTEND", 2010-08-17).
In order to produce the same NO_REGEX configuration from autoconf, meson
and make, modify config.mak.uname to only set NO_REGEX for cygwin v1.7.
In addition, skip test t7815.12 on cygwin, by adding the !CYGWIN
prerequisite to the test header, which (among other things) removes an
'...; please update test(s)' comment.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:27 +0000 (00:18 +0100)]
config.mak.uname: add a note about NO_STRLCPY for Linux
Commit 817151e61a ("Rename safe_strncpy() to strlcpy().", 2006-06-24)
added the NO_STRLCPY make variable to allow the conditional use of
the gitstrlcpy() compat function on those platforms which didn't
provide the 'standard' strlcpy() function.
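
For readers unfamiliar with the semantics, a strlcpy()-style function can
be sketched roughly as below. `demo_strlcpy()` is a hypothetical name
used for this example; it illustrates the usual contract (always
NUL-terminate when the size is non-zero, return the length of the
source) and is not presented as the text of the compat function.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch of strlcpy() semantics: copy at most 'size - 1'
 * bytes, always NUL-terminate when 'size' is non-zero, and return the
 * full length of 'src' so that callers can detect truncation.
 */
static size_t demo_strlcpy(char *dest, const char *src, size_t size)
{
	size_t ret;

	assert(src && (dest || !size));
	ret = strlen(src);
	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;
		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}
```

A return value greater than or equal to the buffer size tells the caller
that the copy was truncated.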
Recently, in the summer of 2023, the strlcpy() and strlcat() functions
were added to the glibc library (v2.38), so some of the more up-to-date
Linux distributions no longer need to set NO_STRLCPY. For example, both
Ubuntu 24.04 LTS and RHEL 10 beta have glibc v2.39. However, several
distributions, which are still within their support window, have an
earlier version and must still use the 'compat' version of strlcpy().
If the meson or autoconf build systems are used on newer platforms, then
they will be configured to use strlcpy() from glibc, whereas the make
build will always choose the 'compat' function instead. Add a note to
the config.mak.uname file, in the Linux section, to prompt make users to
override NO_STRLCPY in the config.mak file, if appropriate.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:26 +0000 (00:18 +0100)]
Makefile: remove NEEDS_LIBRT build variable
Commit d19e3a5b21 ("Makefile: add NEEDS_LIBRT to optionally link with
librt", 2016-07-07) introduced the NEEDS_LIBRT build variable to
disassociate the HAVE_CLOCK_GETTIME variable from the unconditional
linking of the librt library. At one time, the clock_gettime() function
was not available as part of the libc library and (on some unix systems)
required linking with librt.
Commit 52fcec75ce ("config.mak.uname: define NEEDS_LIBRT under Linux, for
now", 2016-07-10) set the NEEDS_LIBRT variable in the Linux section of
the config.mak.uname file, since Debian 7 (wheezy) was one of the few
remaining distributions, with glibc 2.13, that required linking with
librt for clock_gettime(). Note that from glibc version 2.17, this is no
longer necessary.
Note that Debian 7.0 was released on May 4th, 2013 and benefited from
long term support until May 2018 when it went end-of-life. Since that
time, Linux distributions use a more up-to-date library, for example:
    Distribution    glibc version    End of support
    Debian 8        2.19             30th June 2020
    RHEL 8          2.28             31st May 2024 *
    Ubuntu 16.04    2.23             30th Apr 2021

    * paid 'Maintenance support' ends 31st May 2029
Since it is no longer required, remove NEEDS_LIBRT from the Makefile and
config.mak.uname.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:25 +0000 (00:18 +0100)]
meson.build: set default help format to html on windows
The build variable DEFAULT_HELP_FORMAT has an appropriate default
('man') set in the code, so there is no need to pass the -Define on
the compiler command-line, unless the build requires a non-standard
value.
In addition, on windows the make build overrides the default help
format to 'html', rather than 'man', in the 'config.mak.uname' file.
In order to suppress the -Define on the C compiler command-line, only
add the -Define to the 'libgit_c_args' variable when the requested
value is not the standard 'man'. In order to override the default value
on windows, add a 'platform' value to the 'default_help_format' combo
option and set it as the default choice. When this option is set to
'platform', use the 'host_machine.system()' method call to determine the
appropriate default value for the host system.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:24 +0000 (00:18 +0100)]
meson.build: only set build variables for non-default values
Some preprocessor -Defines have defaults set in the source code when
they have not been provided to the C compiler. In this case, there is
no need to pass them on the command-line, unless the build requires a
non-standard value.
The build variables for DEFAULT_EDITOR and DEFAULT_PAGER have appropriate
defaults ('vi' and 'less') set in the code. Add the preprocessor -Defines
to the 'libgit_c_args' only if the values set with the corresponding
'options' are different to these standard values.
Also, the 'git-var' documentation contains some conditional text which
documents the chosen compiled in value, which would not read well for
the standard values. Similar to the above, only add the corresponding
'-a' attribute arguments to the 'asciidoc_common_options' variable, if
the values set in the 'options' are different to these standard values.
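
The source-level fallback that makes the command-line define optional
follows the usual preprocessor pattern, sketched here. `demo_pick()` is
a hypothetical helper added only to make the example testable.

```c
#include <assert.h>
#include <string.h>

/* Fall back to the standard values unless the build passed a -D flag. */
#ifndef DEFAULT_EDITOR
#define DEFAULT_EDITOR "vi"
#endif

#ifndef DEFAULT_PAGER
#define DEFAULT_PAGER "less"
#endif

/*
 * Hypothetical helper: prefer a configured value and use the compiled-in
 * default otherwise.
 */
static const char *demo_pick(const char *configured, const char *fallback)
{
	assert(fallback);
	return (configured && *configured) ? configured : fallback;
}
```

When the build system omits the define, the compiled result is the same
as if it had passed the standard value explicitly, so the shorter
command-line loses nothing.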
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Wed, 16 Apr 2025 23:18:23 +0000 (00:18 +0100)]
Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set
Several build variables only have any meaning when the RUNTIME_PREFIX
variable has been set. In particular, the following build variables are
otherwise ignored:
Ramsay Jones [Wed, 16 Apr 2025 23:18:22 +0000 (00:18 +0100)]
meson.build: remove -DCURL_DISABLE_TYPECHECK
Commit 9371322a60 ("sparse: suppress some \"using sizeof on a function\"
warnings", 2013-10-06) used target-specific variable assignments to add
-DCURL_DISABLE_TYPECHECK to SPARSE_FLAGS for each of the files affected
by the "typecheck-gcc.h" warnings. (http-push.c, http.c, http-walker.c
and remote-curl.c).
These warnings are only issued by sparse, and not by gcc, so we do not
want to disable the 'type checking' for non-sparse targets. The meson
build does not provide any sparse targets, so there is no need to use
the CURL_DISABLE_TYPECHECK preprocessor flag with the C compiler.
In order to re-enable the curl 'type checking' in the meson build, remove
the assignment of -DCURL_DISABLE_TYPECHECK to libgit_c_args.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 16 Apr 2025 20:54:20 +0000 (13:54 -0700)]
Merge branch 'ps/cat-file-filter-batch'
"git cat-file --batch" and friends learned to allow "--filter=" to
omit certain objects, just like the transport layer does.
* ps/cat-file-filter-batch:
builtin/cat-file: use bitmaps to efficiently filter by object type
builtin/cat-file: deduplicate logic to iterate over all objects
pack-bitmap: introduce function to check whether a pack is bitmapped
pack-bitmap: add function to iterate over filtered bitmapped objects
pack-bitmap: allow passing payloads to `show_reachable_fn()`
builtin/cat-file: support "object:type=" objects filter
builtin/cat-file: support "blob:limit=" objects filter
builtin/cat-file: support "blob:none" objects filter
builtin/cat-file: wire up an option to filter objects
builtin/cat-file: introduce function to report object status
builtin/cat-file: rename variable that tracks usage
Junio C Hamano [Wed, 16 Apr 2025 20:54:20 +0000 (13:54 -0700)]
Merge branch 'ps/test-wo-perl-prereq'
"make test" used to have a hard dependency on (basic) Perl; tests
have been rewritten to help environments with NO_PERL test the build as
much as possible.
* ps/test-wo-perl-prereq:
t5703: refactor test to not depend on Perl
t5316: refactor `max_chain()` to not depend on Perl
t0210: refactor trace2 scrubbing to not use Perl
t0021: refactor `generate_random_characters()` to not depend on Perl
t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
t/lib-t6000: refactor `name_from_description()` to not depend on Perl
t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
t: refactor tests depending on Perl for textconv scripts
t: refactor tests depending on Perl to print data
t: refactor tests depending on Perl substitution operator
t: refactor tests depending on Perl transliteration operator
Makefile: stop requiring Perl when running tests
meson: stop requiring Perl when tests are enabled
t: adapt existing PERL prerequisites
t: introduce PERL_TEST_HELPERS prerequisite
t: adapt `test_readlink()` to not use Perl
t: adapt `test_copy_bytes()` to not use Perl
t: adapt character translation helpers to not use Perl
t: refactor environment sanitization to not use Perl
t: skip chain lint when PERL_PATH is unset
Junio C Hamano [Wed, 16 Apr 2025 20:54:19 +0000 (13:54 -0700)]
Merge branch 'kn/non-transactional-batch-updates'
Updating multiple references has only been possible in all-or-none
fashion with transactions, but it can be more efficient to batch
multiple updates even when some of them are allowed to fail in a
best-effort manner. A new "best effort batches of updates" mode
has been introduced.
* kn/non-transactional-batch-updates:
update-ref: add --batch-updates flag for stdin mode
refs: support rejection in batch updates during F/D checks
refs: implement batch reference update support
refs: introduce enum-based transaction error types
refs/reftable: extract code from the transaction preparation
refs/files: remove duplicate duplicates check
refs: move duplicate refname update check to generic layer
refs/files: remove redundant check in split_symref_update()
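
The contrast between all-or-none and best-effort updates described in
this merge can be sketched as follows. All names here
(`struct demo_update`, `demo_apply_transaction()`, `demo_apply_batch()`)
are hypothetical and do not reflect the refs API; the 'valid' flag is a
stand-in for whatever check decides that an individual update is
rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical update record; not a structure from the refs code. */
struct demo_update {
	const char *refname;
	int valid;   /* stand-in for "this update would be accepted" */
	int applied; /* set once the update has been applied */
};

/* All-or-none: a single rejected update prevents every other update. */
static size_t demo_apply_transaction(struct demo_update *u, size_t n)
{
	size_t i;

	assert(u || !n);
	for (i = 0; i < n; i++)
		if (!u[i].valid)
			return 0;
	for (i = 0; i < n; i++)
		u[i].applied = 1;
	return n;
}

/* Best effort: rejected updates are skipped, the rest are applied. */
static size_t demo_apply_batch(struct demo_update *u, size_t n)
{
	size_t i, applied = 0;

	assert(u || !n);
	for (i = 0; i < n; i++) {
		if (!u[i].valid)
			continue;
		u[i].applied = 1;
		applied++;
	}
	return applied;
}
```

Given three updates of which the second is rejected, the transactional
variant applies none of them, while the batch variant applies the first
and the third.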
Junio C Hamano [Wed, 16 Apr 2025 20:54:18 +0000 (13:54 -0700)]
Merge branch 'jt/rev-list-z'
"git rev-list" learns machine-parsable output format that delimits
each field with NUL.
* jt/rev-list-z:
rev-list: support NUL-delimited --missing option
rev-list: support NUL-delimited --boundary option
rev-list: support delimiting objects with NUL bytes
rev-list: refactor early option parsing
rev-list: inline `show_object_with_name()` in `show_object()`
Junio C Hamano [Wed, 16 Apr 2025 20:54:18 +0000 (13:54 -0700)]
Merge branch 'ps/misc-build-fixes'
Random build fixes.
* ps/misc-build-fixes:
ci: use Visual Studio for win+meson job on GitHub Workflows
meson: distinguish build and target host binaries
meson: respect 'tests' build option in contrib
gitweb: fix generation of "gitweb.js"
meson: fix handling of '-Dcurl=auto'