It is possible to configure a Git build without Perl when disabling both
our test suite and all Perl-based features. In Meson, this can be
achieved with `meson setup -Dperl=disabled -Dtests=false`.
It was reported by a user that this breaks the Meson build because
gitweb gets built even if Perl was not discovered in such a build:
$ meson setup .. -Dtests=false -Dperl=disabled
...
../gitweb/meson.build:2:43: ERROR: Unable to get the path of a not-found external program
Fix this issue by introducing a new feature-option that allows the user
to configure whether or not to build Gitweb. The feature is set to
'auto' by default and will be disabled automatically in case Perl was
not found on the system.
Reported-by: Daniel Engberg <daniel.engberg.lists@pyret.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce support for the Meson build system, a "modern" meta build
system that supports many different platforms, including Linux, macOS,
Windows and BSDs. Meson supports different backends, including Ninja,
Xcode and Microsoft Visual Studio. Several common IDEs provide an
integration with it.
The biggest contender compared to Meson is probably CMake as outlined in
our "Documentation/technical/build-systems.txt" file. Based on my own
personal experience from working with both build systems extensively I
strongly favor Meson over CMake. In my opinion, it feels significantly
easier to use with a syntax that feels more like a "real" programming
language. The second big reason is that Meson supports Rust natively,
which may prove to be important given that the project may pick up Rust
as another language eventually.
Using Meson is rather straight-forward. An example:
```
# Meson uses out-of-tree builds. You can set up multiple build
# directories, how you name them is completely up to you.
$ mkdir build
$ cd build
$ meson setup .. -Dprefix=/tmp/git-installation
# Build the project. This also provides several other targets like
e.g. `install` or `test`.
$ ninja
# Meson has been wired up to support execution of our test suites.
# Both our unit tests and our integration tests are supported.
# Running `meson test` without any arguments will execute all tests,
# but the syntax supports globbing to select only some tests.
$ meson test 't-*'
# Execute single test interactively to allow for debugging.
$ meson test 't0000-*' --interactive --test-args=-ix
```
The build instructions have been successfully tested on the following
systems, tests are passing:
- Apple macOS 10.15.
- FreeBSD 14.1.
- NixOS 24.11.
- OpenBSD 7.6.
- Ubuntu 24.04.
- Windows 10 with Cygwin.
- Windows 10 with MinGW64, except for t9700, which is also broken with
our Makefile.
- Windows 10 with Visual Studio 2022 toolchain, using the Native Tools
Command Prompt with `meson setup --vsenv`. Tests pass, except for
t9700.
- Windows 10 with Visual Studio 2022 solution, using the Native Tools
Command Prompt with `meson setup --backend vs2022`. Tests pass,
except for t9700.
- Windows 10 with VS Code, using the Meson plug-in.
It is expected that there will still be rough edges in the current
version. If this patch lands the expectation is that it will coexist
with our other build systems for a while. Like this, distributions can
slowly migrate over to Meson and report any findings they have to us
such that we can continue to iterate. A potential cutoff date for other
build systems may be Git 3.0.
Some notes:
- The installed distribution is structured somewhat differently than
how it used to be the case. All of our binaries are installed into
`$libexec/git-core`, while all binaries part of `$bindir` are now
symbolic links pointing to the former. This rule is consistent in
itself and thus easier to reason about.
- We do not install dashed binaries into `$libexec/git-core` anymore,
so there won't e.g. be a symlink for git-add(1). These are not
required by modern Git and there isn't really much of a use case for
those anymore. By not installing those symlinks we thus start the
deprecation of this layout.
- We're targeting Meson 1.3.0, which has been released relatively
recently November 2023. The only feature we use from that version is
`fs.relative_to()`, which we could replace if necessary. If so, we
could start to target Meson 1.0.0 and newer, released in December
2022.
- The whole build instructions count around 3300 lines, half of which
is listing all of our code and test files. Our Makefiles are around
5000 lines, autoconf adds another 1300 lines. CMake in comparison
has only 1200 linescode, but it avoids listing individual files and
does not wire up auto-configuration as extensively as the Meson
instructions do.
- We bundle a set of subproject wrappers for curl, expat, openssl,
pcre2 and zlib. This allows developers to build Git without these
dependencies preinstalled, and Meson will fetch and build them
automatically. This is especially helpful on Windows.
Helped-by: Eli Schwartz <eschwartz@gentoo.org> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're contemplating whether to eventually replace our build systems with
a build system that is easier to use. Add a comparison of build systems
to our technical documentation as a baseline for discussion.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our "test-lib.sh" assumes that our build directory is the parent
directory of "t/". While true when using our Makefile, it's not when
using build systems that support out-of-tree builds.
In commit ee9e66e4e7 (cmake: avoid editing t/test-lib.sh, 2022-10-18),
we have introduce support for overriding the GIT_BUILD_DIR by creating
the file "$GIT_BUILD_DIR/GIT-BUILD-DIR" with its contents pointing to
the location of the build directory. The intent was to stop modifying
"t/test-lib.sh" with the CMake build systems while allowing out-of-tree
builds. But "$GIT_BUILD_DIR" is somewhat misleadingly named, as it in
fact points to the _source_ directory. So while that commit solved part
of the problem for out-of-tree builds, CMake still has to write files
into the source tree.
Solve the second part of the problem, namely not having to write any
data into the source directory at all, by also supporting an environment
variable that allows us to point to a different build directory. This
allows us to perform properly self-contained out-of-tree builds.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our in-tree builds used by the Makefile use various different build
directories scattered around different locations. The paths to those
build directories have to be propagated to our tests such that they can
find the contained files. This is done via a mixture of hardcoded paths
in our test library and injected variables in our bin-wrappers or
"GIT-BUILD-OPTIONS".
The latter two mechanisms are preferable over using hardcoded paths. For
one, we have all paths which are subject to change stored in a small set
of central files instead of having the knowledge of build paths in many
files. And second, it allows build systems which build files elsewhere
to adapt those paths based on their own needs. This is especially nice
in the context of build systems that use out-of-tree builds like CMake
or Meson.
Remove hardcoded knowledge of build paths from our test library and move
it into our bin-wrappers and "GIT-BUILD-OPTIONS".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: extract script to generate a list of mergetools
We include the list of available mergetools into our manpages. Extract
the script that performs this logic such that we can reuse it in other
build systems.
While at it, refactor the Makefile targets such that we don't create
"mergetools-list.made" anymore. It shouldn't be necessary, as we can
instead have other targets depend on "mergetools-{diff,merge}.txt"
directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: teach "cmd-list.perl" about out-of-tree builds
The "cmd-list.perl" script generates a list of commands that can be
included into our manpages. The script doesn't know about out-of-tree
builds and instead writes resulting files into the source directory.
Adapt it such that we can read data from the source directory and write
data into the build directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: allow sourcing generated includes from separate dir
Our documentation uses "include::" directives to include parts that are
either reused across multiple documents or parts that we generate at
build time. Unfortunately, top-level includes are only ever resolved
relative to the base directory, which is typically the directory of the
including document. Most importantly, it is not possible to have either
asciidoc or asciidoctor search multiple directories.
It follows that both kinds of includes must live in the same directory.
This is of course a bummer for out-of-tree builds, because here the
dynamically-built includes live in the build directory whereas the
static includes live in the source directory.
Introduce a `build_dir` attribute and prepend it to all of our includes
for dynamically-built files. This attribute gets set to the build
directory and thus converts the include path to an absolute path, which
asciidoc and asciidoctor know how to resolve.
Note that this change also requires us to update "build-docdep.perl",
which tries to figure out included files such our Makefile can set up
proper build-time dependencies. This script simply scans through the
source files for any lines that match "^include::" and treats the
remainder of the line as included file path. But given that those may
now contain the "{build_dir}" variable we have to teach the script to
replace that attribute with the actual build directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we install Git we also install a set of default templates that both
git-init(1) and git-clone(1) populate into our build directories. The
way the pristine templates are laid out in our source directory is
somewhat weird though: instead of reconstructing the actual directory
hierarchy in "templates/", we represent directory separators with "--".
The only reason I could come up with for why we have this is the
"branches/" directory, which is supposed to be empty when installing it.
And as Git famously doesn't store empty directories at all we have to
work around this limitation.
Now the thing is that the "branches/" directory is a leftover to how
branches used to be stored in the dark ages. gitrepository-layout(5)
lists this directory as "slightly deprecated", which I would claim is a
strong understatement. I have never encountered anybody using it today
and would be surprised if it even works as expected. So having the "--"
hack in place for an item that is basically unused, unmaintained and
deprecated doesn't only feel unreasonable, but installing that entry by
default may also cause confusion for users that do not know what this is
supposed to be in the first place.
Remove this directory from our templates and, now that we do not require
the workaround anymore, restructure the templates to form a proper
hierarchy. This makes it way easier for build systems to install these
templates into place.
We should likely think about removing support for "branch/" altogether,
but that is outside of the scope of this patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: allow "bin-wrappers/" directory to exist
The "bin-wrappers/" directory gets created by our build system and is
populated with one script for each of our binaries. There isn't anything
inherently wrong with the current layout, but it is somewhat hard to
adapt for out-of-tree build systems.
Adapt the layout such that our "bin-wrappers/" directory always exists
and contains our "wrap-for-bin.sh" script to make things a little bit
easier for subsequent steps.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: refactor generators to be PWD-independent
We have multiple scripts that generate headers from other data. All of
these scripts have the assumption built-in that they are executed in the
current source directory, which makes them a bit unwieldy to use during
out-of-tree builds.
Refactor them to instead take the source directory as well as the output
file as arguments.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to the preceding commit, also extract the script to generate the
"gitweb.js" file. While the logic itself is trivial, it helps us avoid
duplication of logic across build systems and ensures that the build
systems will remain in sync with each other in case the logic ever needs
to change.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to generate "gitweb.cgi" we have to replace various different
placeholders. This is done ad-hoc and is thus not easily reusable across
different build systems.
Introduce a new GITWEB-BUILD-OPTIONS.in template that we populate at
configuration time with the expected options. This script is then used
as input for a new "generate-gitweb.sh" script that generates the final
"gitweb.cgi" file. While this requires us to repeat the options multiple
times, it is in line to how we generate other build options like our
GIT-BUILD-OPTIONS file.
While at it, refactor how we replace the GITWEB_PROJECT_MAXDEPTH. Even
though this variable is supposed to be an integer, the source file has
the value quoted. The quotes are eventually stripped via sed(1), which
replaces `"@GITWEB_PROJECT_MAXDEPTH@"` with the actual value, which is
rather nonsensical. This is made clearer by just dropping the quotes in
the source file.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract the script to inject various build-time parameters into our Perl
scripts into a standalone script. This is done such that we can reuse it
in other build systems.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When injecting the Perl path into our scripts we sometimes use '@PERL@'
while we othertimes use '@PERL_PATH@'. Refactor the code use the latter
consistently, which makes it easier to reuse the same logic for multiple
scripts.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: generate doc versions via GIT-VERSION-GEN
The documentation we generate embeds information for the exact Git
version used as well as the date of the commit. This information is
injected by injecting attributes into the build process via command line
argument.
Refactor the logic so that we write the information into "asciidoc.conf"
and "asciidoctor-extensions.rb" via `GIT-VERSION-GEN` for AsciiDoc and
AsciiDoctor, respectively.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git.rc" is used on Windows to embed information like the project
name and version into the resulting executables. As such we need to
inject the version information, which we do by using preprocessor
defines. The logic to do so is non-trivial and needs to be kept in sync
with the different build systems.
Refactor the logic so that we generate "git.rc" via `GIT-VERSION-GEN`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: propagate Git version via generated header
We set up a couple of preprocessor macros when compiling Git that
propagate the version that Git was built from to `git version` et al.
The way this is set up makes it harder than necessary to reuse the
infrastructure across the different build systems.
Refactor this such that we generate a "version-def.h" header via
`GIT-VERSION-GEN` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our "GIT-VERSION-GEN" script always writes the "GIT-VERSION-FILE" into
the current directory, where the expectation is that it should exist in
the source directory. But other build systems that support out-of-tree
builds may not want to do that to keep the source directory pristine,
even though CMake currently doesn't care.
Refactor the script such that it won't write the "GIT-VERSION-FILE"
directly anymore, but instead knows to replace @PLACEHOLDERS@ in an
arbitrary input file. This allows us to simplify the logic in CMake to
determine the project version, but can also be reused later on in order
to generate other files that need to contain version information like
our "git.rc" file.
While at it, change the format of the version file by removing the
spaces around the equals sign. Like this we can continue to include the
file in our Makefiles, but can also start to source it in shell scripts
in subsequent steps.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: consistently use @PLACEHOLDER@ to substitute
We have a bunch of placeholders in our scripts that we replace at build
time, for example by using sed(1). These placeholders come in three
different formats: @PLACEHOLDER@, @@PLACEHOLDER@@ and ++PLACEHOLDER++.
Next to being inconsistent it also creates a bit of a problem with
CMake, which only supports the first syntax in its `configure_file()`
function. To work around that we instead manually replace placeholders
via string operations, which is a hassle and removes safeguards that
CMake has to verify that we didn't forget to replace any placeholders.
Besides that, other build systems like Meson also support the CMake
syntax.
Unify our codebase to consistently use the syntax supported by such
build systems.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: use common template for GIT-BUILD-OPTIONS
The "GIT-BUILD-OPTIONS" file is generated by our build systems to
propagate built-in features and paths to our tests. The generation is
done ad-hoc, where both our Makefile and the CMake build instructions
simply echo a bunch of strings into the file. This makes it very hard to
figure out what variables are expected to exist and what format they
have, and the written variables can easily get out of sync between build
systems.
Introduce a new "GIT-BUILD-OPTIONS.in" template to address this issue.
This has multiple advantages:
- It demonstrates which built options exist in the first place.
- It can serve as a spot to document the build options.
- Some build systems complain when not all variables could be
substituted, alerting us of mismatches. Others don't, but if we
forgot to substitute such variables we now have a bogus string that
will likely cause our tests to fail, if they have any meaning in the
first place.
Backfill values that we didn't yet set in our CMake build instructions.
While at it, remove the `SUPPORTS_SIMPLE_IPC` variable that we only set
up in CMake as it isn't used anywhere.
This change requires us to adapt the setup of TEST_OUTPUT_DIRECTORY in
"test-lib.sh" such that it does not get overwritten after sourcing when
it has been set up via the environment. This is the only instance I
could find where we rely on ordering on variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 4 Dec 2024 01:14:49 +0000 (10:14 +0900)]
Merge branch 'ja/git-diff-doc-markup'
Documentation mark-up updates.
* ja/git-diff-doc-markup:
doc: git-diff: apply format changes to config part
doc: git-diff: apply format changes to diff-generate-patch
doc: git-diff: apply format changes to diff-format
doc: git-diff: apply format changes to diff-options
doc: git-diff: apply new documentation guidelines
Junio C Hamano [Wed, 4 Dec 2024 01:14:42 +0000 (10:14 +0900)]
Merge branch 'sj/ref-contents-check'
"git fsck" learned to issue warnings on "curiously formatted" ref
contents that have always been taken valid but something Git
wouldn't have written itself (e.g., missing terminating end-of-line
after the full object name).
* sj/ref-contents-check:
ref: add symlink ref content check for files backend
ref: check whether the target of the symref is a ref
ref: add basic symref content check for files backend
ref: add more strict checks for regular refs
ref: port git-fsck(1) regular refs check for files backend
ref: support multiple worktrees check for refs
ref: initialize ref name outside of check functions
ref: check the full refname instead of basename
ref: initialize "fsck_ref_report" with zero
Junio C Hamano [Wed, 4 Dec 2024 01:14:41 +0000 (10:14 +0900)]
Merge branch 'ps/ref-backend-migration-optim'
The migration procedure between two ref backends has been optimized.
* ps/ref-backend-migration-optim:
reftable: rename scratch buffer
refs: adapt `initial_transaction` flag to be unsigned
reftable/block: optimize allocations by using scratch buffer
reftable/block: rename `block_writer::buf` variable
reftable/writer: optimize allocations by using a scratch buffer
refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG`
refs: skip collision checks in initial transactions
refs: use "initial" transaction semantics to migrate refs
refs/files: support symbolic and root refs in initial transaction
refs: introduce "initial" transaction flag
refs/files: move logic to commit initial transaction
refs: allow passing flags when setting up a transaction
Junio C Hamano [Wed, 4 Dec 2024 01:14:38 +0000 (10:14 +0900)]
Merge branch 'ps/leakfixes-part-10'
Leakfixes.
* ps/leakfixes-part-10: (27 commits)
t: remove TEST_PASSES_SANITIZE_LEAK annotations
test-lib: unconditionally enable leak checking
t: remove unneeded !SANITIZE_LEAK prerequisites
t: mark some tests as leak free
t5601: work around leak sanitizer issue
git-compat-util: drop now-unused `UNLEAK()` macro
global: drop `UNLEAK()` annotation
t/helper: fix leaking commit graph in "read-graph" subcommand
builtin/branch: fix leaking sorting options
builtin/init-db: fix leaking directory paths
builtin/help: fix leaks in `check_git_cmd()`
help: fix leaking return value from `help_unknown_cmd()`
help: fix leaking `struct cmdnames`
help: refactor to not use globals for reading config
builtin/sparse-checkout: fix leaking sanitized patterns
split-index: fix memory leak in `move_cache_to_base_index()`
git: refactor builtin handling to use a `struct strvec`
git: refactor alias handling to use a `struct strvec`
strvec: introduce new `strvec_splice()` function
line-log: fix leak when rewriting commit parents
...
Junio C Hamano [Wed, 4 Dec 2024 01:14:37 +0000 (10:14 +0900)]
Merge branch 'ps/gc-stale-lock-warning'
Give a bit of advice/hint message when "git maintenance" stops finding a
lock file left by another instance that still is potentially running.
* ps/gc-stale-lock-warning:
t7900: fix host-dependent behaviour when testing git-maintenance(1)
builtin/gc: provide hint when maintenance hits a stale schedule lock
Junio C Hamano [Tue, 26 Nov 2024 22:57:06 +0000 (07:57 +0900)]
Merge branch 'jk/gcc15'
GCC 15 compatibility updates.
* jk/gcc15:
object-file: inline empty tree and blob literals
object-file: treat cached_object values as const
object-file: drop oid field from find_cached_object() return value
object-file: move empty_tree struct into find_cached_object()
object-file: drop confusing oid initializer of empty_tree struct
object-file: prefer array-of-bytes initializer for hash literals
Junio C Hamano [Tue, 26 Nov 2024 22:57:04 +0000 (07:57 +0900)]
Merge branch 'ps/clar-build-improvement'
Fix for clar unit tests to support CMake build.
* ps/clar-build-improvement:
Makefile: let clar header targets depend on their scripts
cmake: use verbatim arguments when invoking clar commands
cmake: use SH_EXE to execute clar scripts
t/unit-tests: convert "clar-generate.awk" into a shell script
Junio C Hamano [Tue, 26 Nov 2024 22:57:03 +0000 (07:57 +0900)]
Merge branch 'kh/bundle-docs'
Documentation for "git bundle" saw improvements to more prominently
call out the use of '--all' when creating bundles.
* kh/bundle-docs:
Documentation/git-bundle.txt: discuss naïve backups
Documentation/git-bundle.txt: mention --all in spec. refs
Documentation/git-bundle.txt: remove old `--all` example
Documentation/git-bundle.txt: mention full backup example
shejialuo [Tue, 26 Nov 2024 14:40:57 +0000 (22:40 +0800)]
ref-cache: fix invalid free operation in `free_ref_entry`
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.
However, this is a bad usage. Because when ref entry is either directory
or loose ref, we will always execute the following statement:
free(entry->u.value.referent);
This does not make sense. We should never access the "entry->u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analysis why.
The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.
To fix this issue, we now ensure that "free(entry->u.value.referent)" is
only called when "entry->flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Mon, 25 Nov 2024 14:55:30 +0000 (15:55 +0100)]
builtin: pass repository to sub commands
In 9b1cb5070f (builtin: add a repository parameter for builtin
functions, 2024-09-13) the repository was passed down to all builtin
commands. This allowed the repository to be passed down to lower layers
without depending on the global `the_repository` variable.
Continue this work by also passing down the repository parameter from
the command to sub-commands. This will help pass down the repository to
other subsystems and cleanup usage of global variables like
'the_repository' and 'the_hash_algo'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bisect: address Coverity warning about potential double free
Coverity has started to warn about a potential double-free in
`find_bisection()`. This warning is triggered because we may modify the
list head of the passed-in `commit_list` in case it is an UNINTERESTING
commit, but still call `free_commit_list()` on the original variable
that points to the now-freed head in case where `do_find_bisection()`
returns a `NULL` pointer.
As far as I can see, this double free cannot happen in practice, as
`do_find_bisection()` only returns a `NULL` pointer when it was passed a
`NULL` input. So in order to trigger the double free we would have to
call `find_bisection()` with a commit list that only consists of
UNINTERESTING commits, but I have not been able to construct a case
where that happens.
Drop the `else` branch entirely as it seems to be a no-op anyway.
Another option might be to instead call `free_commit_list()` on `list`,
which is the modified version of `commit_list` and thus wouldn't cause a
double free. But as mentioned, I couldn't come up with any case where a
passed-in non-NULL list becomes empty, so this shouldn't be necessary.
And if it ever does become necessary we'd notice anyway via the leak
sanitizer.
Interestingly enough we did not have a single test exercising this
branch: all tests pass just fine even when replacing it with a call to
`BUG()`. Add a test that exercises it.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 26 Nov 2024 01:21:58 +0000 (10:21 +0900)]
Merge branch 'ps/leakfixes-part-10' into ps/bisect-double-free-fix
* ps/leakfixes-part-10: (27 commits)
t: remove TEST_PASSES_SANITIZE_LEAK annotations
test-lib: unconditionally enable leak checking
t: remove unneeded !SANITIZE_LEAK prerequisites
t: mark some tests as leak free
t5601: work around leak sanitizer issue
git-compat-util: drop now-unused `UNLEAK()` macro
global: drop `UNLEAK()` annotation
t/helper: fix leaking commit graph in "read-graph" subcommand
builtin/branch: fix leaking sorting options
builtin/init-db: fix leaking directory paths
builtin/help: fix leaks in `check_git_cmd()`
help: fix leaking return value from `help_unknown_cmd()`
help: fix leaking `struct cmdnames`
help: refactor to not use globals for reading config
builtin/sparse-checkout: fix leaking sanitized patterns
split-index: fix memory leak in `move_cache_to_base_index()`
git: refactor builtin handling to use a `struct strvec`
git: refactor alias handling to use a `struct strvec`
strvec: introduce new `strvec_splice()` function
line-log: fix leak when rewriting commit parents
...
This says that hash2 and hash3 should be squashed into hash1 and
that hash3’s commit message should be used for the resulting commit.
So the user is presented with an editor where the two first commit
messages are commented out and the third is not. However this does
not work if `core.commentChar`/`core.commentString` is in use since
the comment char is hardcoded (#) in this `sequencer.c` function.
As a result the first commit message will not be commented out.
sequencer: comment `--reference` subject line properly
`git revert --reference <commit>` leaves behind a comment in the
first line:[1]
# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***
Meaning that the commit will just consist of the next line if the user
exits the editor directly:
This reverts commit <--format=reference commit>
But the comment char here is hardcoded (#). Which means that the
comment line will inadvertently be included in the commit message if
`core.commentChar`/`core.commentString` is in use.
†1: See 43966ab3156 (revert: optionally refer to commit in the
"reference" format, 2022-05-26)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git rebase --update-ref` does not insert commands for dependent/sub-
branches which are checked out.[1] Instead it leaves a comment about
that fact. The comment char is hardcoded (#). In turn the comment
line gets interpreted as an invalid command when `core.commentChar`/
`core.commentString` is in use.
†1: See 900b50c242 (rebase: add --update-refs option, 2022-07-19)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both `struct block_writer` and `struct reftable_writer` have a `buf`
member that is being reused to optimize the number of allocations.
Rename the variable to `scratch` to clarify its intend and provide a
comment explaining why it exists.
Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: adapt `initial_transaction` flag to be unsigned
The `initial_transaction` flag is tracked as a signed integer, but we
typically pass around flags via unsigned integers. Adapt the type
accordingly.
Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7900: fix host-dependent behaviour when testing git-maintenance(1)
We have recently added a new test to t7900 that exercises whether
git-maintenance(1) fails as expected when the "schedule.lock" file
exists. The test depends on whether or not the host has the required
executables present to schedule maintenance tasks in the first place,
like systemd or launchctl -- if not, the test fails with an unrelated
error before even checking for the lock file. This fails for example in
our CI systems, where macOS images do not have launchctl available.
Fix this issue by creating a stub systemctl(1) binary and using the
systemd scheduler.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 25 Nov 2024 03:29:39 +0000 (12:29 +0900)]
Merge branch 'jk/output-prefix-cleanup' into maint-2.47
Code clean-up.
* jk/output-prefix-cleanup:
diff: store graph prefix buf in git_graph struct
diff: return line_prefix directly when possible
diff: return const char from output_prefix callback
diff: drop line_prefix_length field
line-log: use diff_line_prefix() instead of custom helper
Junio C Hamano [Mon, 25 Nov 2024 03:20:42 +0000 (12:20 +0900)]
Merge branch 'master' of https://github.com/j6t/gitk into maint-2.47
* 'master' of https://github.com/j6t/gitk:
Makefile(s): avoid recipe prefix in conditional statements
doc: switch links to https
doc: update links to current pages
Taylor Blau [Mon, 8 Apr 2024 15:51:44 +0000 (11:51 -0400)]
Makefile(s): avoid recipe prefix in conditional statements
In GNU Make commit 07fcee35 ([SV 64815] Recipe lines cannot contain
conditional statements, 2023-05-22) and following, conditional
statements may no longer be preceded by a tab character (which Make
refers to as the recipe prefix).
There are a handful of spots in our various Makefile(s) which will break
in a future release of Make containing 07fcee35. For instance, trying to
compile the pre-image of this patch with the tip of make.git results in
the following:
$ make -v | head -1 && make
GNU Make 4.4.90
config.mak.uname:842: *** missing 'endif'. Stop.
The kernel addressed this issue in 82175d1f9430 (kbuild: Replace tabs
with spaces when followed by conditionals, 2024-01-28). Address the
issues in Git's tree by applying the same strategy.
When a conditional word (ifeq, ifneq, ifdef, etc.) is preceded by one or
more tab characters, replace each tab character with 8 space characters
with the following:
The "unless /\\$/" removes any false-positives (like "\telse \"
appearing within a shell script as part of a recipe).
After doing so, Git compiles on newer versions of Make:
$ make -v | head -1 && make
GNU Make 4.4.90
GIT_VERSION = 2.44.0.414.gfac1dc44ca9
[...]
$ echo $?
0
Reported-by: Dario Gjorgjevski <dario.gjorgjevski@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 728b9ac0c3b93aaa4ea80280c591deb198051785 Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Josh Soref [Fri, 24 Nov 2023 03:35:13 +0000 (03:35 +0000)]
doc: switch links to https
These sites offer https versions of their content.
Using the https versions provides some protection for users.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: d05b08cd52cfda627f1d865bdfe6040a2c9521b5 Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Josh Soref [Fri, 24 Nov 2023 03:35:12 +0000 (03:35 +0000)]
doc: update links to current pages
It's somewhat traditional to respect sites' self-identification.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 65175d9ea26bebeb9d69977d0e75efc0e88dbced Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Junio C Hamano [Fri, 22 Nov 2024 05:34:17 +0000 (14:34 +0900)]
Merge branch 'jk/fetch-prefetch-double-free-fix'
Double-free fix.
* jk/fetch-prefetch-double-free-fix:
refspec: store raw refspecs inside refspec_item
refspec: drop separate raw_nr count
fetch: adjust refspec->raw_nr when filtering prefetch refspecs
Taylor Blau [Thu, 21 Nov 2024 20:29:24 +0000 (15:29 -0500)]
t/perf: use 'test_file_size' in more places
The perf test suite prefers to use test_file_size over 'wc -c' when
inside of a test_size block. One advantage is that accidentally writign
"wc -c file" (instead of "wc -c <file") does not inadvertently break the
tests (since the former will include the filename in the output of wc).
Both of the two uses of test_size use "wc -c", but let's convert those
to the more conventional test_file_size helper instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 21 Nov 2024 22:50:43 +0000 (17:50 -0500)]
pack-bitmap.c: typofix in `find_boundary_objects()`
In the boundary-based bitmap traversal, we use the given 'rev_info'
structure to first do a commit-only walk in order to determine the
boundary between interesting and uninteresting objects.
That walk only looks at commit objects, regardless of the state of
revs->blob_objects, revs->tree_objects, and so on. In order to do this,
we store the state of these variables in temporary fields before
setting them back to zero, performing the traversal, and then setting
them back.
But there is a typo here that dates back to b0afdce5da (pack-bitmap.c:
use commit boundary during bitmap traversal, 2023-05-08), where we
incorrectly store the value of the "tags" field as "revs->blob_objects".
This could lead to problems later on if, say, the caller wants tag
objects but *not* blob objects. In the pre-image behavior, we'd set
revs->tag_objects back to the old value of revs->blob_objects, thus
emitting fewer objects than expected back to the caller.
Fix that by correctly assigning the value of 'revs->tag_objects' to the
'tmp_tags' field.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there
is no longer a need to have that variable declared in all of our tests.
Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over the last two releases we have plugged a couple hundred of memory
leaks exposed by the Git test suite. With the preceding commits we have
finally fixed the last leak exposed by our test suite, which means that
we are now basically leak free wherever we have branch coverage.
From hereon, the Git test suite should ideally stay free of memory
leaks. Most importantly, any test suite that is being added should
automatically be subject to the leak checker, and if that test does not
pass it is a strong signal that the added code introduced new memory
leaks and should not be accepted without further changes.
Drop the infrastructure around TEST_PASSES_SANITIZE_LEAK to reflect this
new requirement. Like this, all test suites will be subject to the leak
checker by default.
This is being intentionally strict, but we still have an escape hatch:
the SANITIZE_LEAK prerequisite. There is one known case in t5601 where
the leak sanitizer itself is buggy, so adding this prereq in such cases
is acceptable. Another acceptable situation is when a newly added test
uncovers preexisting memory leaks: when fixing that memory leak would be
sufficiently complicated it is fine to annotate and document the leak
accordingly. But in any case, the burden is now on the patch author to
explain why exactly they have to add the SANITIZE_LEAK prerequisite.
The TEST_PASSES_SANITIZE_LEAK annotations will be dropped in the next
patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a couple of !SANITIZE_LEAK prerequisites for tests that used to
fail due to memory leaks. These have all been fixed by now, so let's
drop the prerequisite.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running t5601 with the leak checker enabled we can see a hang in
our CI systems. This hang seems to be system-specific, as I cannot
reproduce it on my own machine.
As it turns out, the issue is in those testcases that exercise cloning
of `~repo`-style paths. All of the testcases that hang eventually end up
interpreting "repo" as the username and will call getpwnam(3p) with that
username. That should of course be fine, and getpwnam(3p) should just
return an error. But instead, the leak sanitizer seems to be recursing
while handling a call to `free()` in the NSS modules:
#0 0x00007ffff7fd98d5 in _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:720
#1 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
#2 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#3 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
#4 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
#5 __lsan::lsan_free (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
...
#261505 0x00007ffff7fd99f2 in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50
#261506 _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:822
#261507 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
#261508 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#261509 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
#261510 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
#261511 __lsan::lsan_free (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
#261512 0x00007ffff793da25 in module_load (module=0x515000000280) at ./nss/nss_module.c:188
#261513 0x00007ffff793dee5 in __nss_module_load (module=0x515000000280) at ./nss/nss_module.c:302
#261514 __nss_module_get_function (module=0x515000000280, name=name@entry=0x7ffff79b9128 "getpwnam_r") at ./nss/nss_module.c:328
#261515 0x00007ffff793e741 in __GI___nss_lookup_function (fct_name=<optimized out>, ni=<optimized out>) at ./nss/nsswitch.c:137
#261516 __GI___nss_next2 (ni=ni@entry=0x7fffffffa458, fct_name=fct_name@entry=0x7ffff79b9128 "getpwnam_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7fffffffa460,
status=status@entry=0, all_values=all_values@entry=0) at ./nss/nsswitch.c:120
#261517 0x00007ffff794c6a7 in __getpwnam_r (name=name@entry=0x501000000060 "repo", resbuf=resbuf@entry=0x7ffff79fb320 <resbuf>, buffer=<optimized out>,
buflen=buflen@entry=1024, result=result@entry=0x7fffffffa4b0) at ../nss/getXXbyYY_r.c:343
#261518 0x00007ffff794c4d8 in getpwnam (name=0x501000000060 "repo") at ../nss/getXXbyYY.c:140
#261519 0x00005555557e37ff in getpw_str (username=0x5020000001a1 "repo", len=4) at path.c:613
#261520 0x00005555557e3937 in interpolate_path (path=0x5020000001a0 "~repo", real_home=0) at path.c:654
#261521 0x00005555557e3aea in enter_repo (path=0x501000000040 "~repo", strict=0) at path.c:718
#261522 0x000055555568f0ba in cmd_upload_pack (argc=1, argv=0x502000000100, prefix=0x0, repo=0x0) at builtin/upload-pack.c:57
#261523 0x0000555555575ba8 in run_builtin (p=0x555555a20c98 <commands+3192>, argc=2, argv=0x502000000100, repo=0x555555a53b20 <the_repo>) at git.c:481
#261524 0x0000555555576067 in handle_builtin (args=0x7fffffffaab0) at git.c:742
#261525 0x000055555557678d in cmd_main (argc=2, argv=0x7fffffffac58) at git.c:912
#261526 0x00005555556963cd in main (argc=2, argv=0x7fffffffac58) at common-main.c:64
Note that this stack is more than 260000 function calls deep. Run under
the debugger this will eventually segfault, but in our CI systems it
seems like this just hangs forever.
I assume that this is a bug either in the leak sanitizer or in glibc, as
I cannot reproduce it on my machine. In any case, let's work around the
bug for now by marking those tests with the "!SANITIZE_LEAK" prereq.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `UNLEAK()` macro has been introduced with 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) to help us
reduce the amount of reported memory leaks in cases we don't care about,
e.g. when exiting immediately afterwards. We have since removed all of
its users in favor of freeing the memory and thus don't need the macro
anymore.
Remove it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two users of `UNLEAK()` left in our codebase:
- In "builtin/clone.c", annotating the `repo` variable. That leak has
already been fixed though as you can see in the context, where we do
know to free `repo_to_free`.
- In "builtin/diff.c", to unleak entries of the `blob[]` array. That
leak has also been fixed, because the entries we assign to that
array come from `rev.pending.objects`, and we do eventually release
`rev`.
This neatly demonstrates one of the issues with `UNLEAK()`: it is quite
easy for the annotation to become stale. A second issue is that its
whole intent is to paper over leaks. And while that has been a necessary
evil in the past, because Git was leaking left and right, it isn't
really much of an issue nowadays where our test suite has no known leaks
anymore.
Remove the last two users of this macro.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/helper: fix leaking commit graph in "read-graph" subcommand
We're leaking the commit-graph in the "test-helper read-graph"
subcommand, but as the leak is annotated with `UNLEAK()` the leak
sanitizer doesn't complain.
Fix the leak by calling `free_commit_graph()`. Besides getting rid of
the `UNLEAK()` annotation, it also increases code coverage because we
properly release resources as Git would do it, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've got a couple of leaking directory paths in git-init(1), all of
which are marked with `UNLEAK()`. Fixing them is trivial, so let's do
that instead so that we can get rid of `UNLEAK()` entirely.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `check_git_cmd()` function is declared to return a string constant.
And while it sometimes does return a constant, it may also return an
allocated string in two cases:
- When handling aliases. This case is already marked with `UNLEAK()`
to work around the leak.
- When handling unknown commands in case "help.autocorrect" is
enabled. This one is not marked with `UNLEAK()`.
The function only has a single caller, so let's fix its return type to
be non-constant, consistently return an allocated string and free it at
its callsite to plug the leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
help: fix leaking return value from `help_unknown_cmd()`
While `help_unknown_cmd()` would usually die on an unknown command, it
instead returns an autocorrected command when "help.autocorrect" is set.
But while the function is declared to return a string constant, it
actually returns an allocated string in that case. Callers thus aren't
aware that they have to free the string, leading to a memory leak.
Fix the function return type to be non-constant and free the returned
value at its only callsite.
Note that we cannot simply take ownership of `main_cmds.names[0]->name`
and then eventually free it. This is because the `struct cmdname` is
using a flex array to allocate the name, so the name pointer points into
the middle of the structure and thus cannot be freed.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
help: refactor to not use globals for reading config
We're reading the "help.autocorrect" and "alias.*" configuration into
global variables, which makes it hard to manage their lifetime
correctly. Refactor the code to use a struct instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both `git sparse-checkout add` and `git sparse-checkout set` accept a
list of additional directories or patterns. These get massaged via calls
to `sanitize_paths()`, which may end up modifying the passed-in array by
updating its pointers to be prefixed paths. This allocates memory that
we never free.
Refactor the code to instead use a `struct strvec`, which makes it way
easier for us to track the lifetime correctly. The couple of extra
memory allocations likely do not matter as we only ever populate it with
command line arguments.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
split-index: fix memory leak in `move_cache_to_base_index()`
In `move_cache_to_base_index()` we move the index cache of the main
index into the split index, which is used when writing a shared index.
But we don't release the old split index base in case we already had a
split index before this operation, which can thus leak memory.
Plug the leak by releasing the previous base.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git: refactor builtin handling to use a `struct strvec`
Similar as with the preceding commit, `handle_builtin()` does not
properly track lifetimes of the `argv` array and its strings. As it may
end up modifying the array this can lead to memory leaks in case it
contains allocated strings.
Refactor the function to use a `struct strvec` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>