Junio C Hamano [Sun, 10 Dec 2023 00:37:49 +0000 (16:37 -0800)]
Merge branch 'ps/ref-tests-update'
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`
Junio C Hamano [Sun, 10 Dec 2023 00:37:48 +0000 (16:37 -0800)]
Merge branch 'jk/chunk-bounds-more'
Code clean-up for jk/chunk-bounds topic.
* jk/chunk-bounds-more:
commit-graph: mark chunk error messages for translation
commit-graph: drop verify_commit_graph_lite()
commit-graph: check order while reading fanout chunk
commit-graph: use fanout value for graph size
commit-graph: abort as soon as we see a bogus chunk
commit-graph: clarify missing-chunk error messages
commit-graph: drop redundant call to "lite" verification
midx: check consistency of fanout table
commit-graph: handle overflow in chunk_size checks
Junio C Hamano [Sun, 10 Dec 2023 00:37:48 +0000 (16:37 -0800)]
Merge branch 'ps/ci-gitlab'
Add support for GitLab CI.
* ps/ci-gitlab:
ci: add support for GitLab CI
ci: install test dependencies for linux-musl
ci: squelch warnings when testing with unusable Git repo
ci: unify setup of some environment variables
ci: split out logic to set up failed test artifacts
ci: group installation of Docker dependencies
ci: make grouping setup more generic
ci: reorder definitions for grouping functions
Junio C Hamano [Sun, 10 Dec 2023 00:37:47 +0000 (16:37 -0800)]
Merge branch 'js/doc-unit-tests-with-cmake'
Update the base topic to work with CMake builds.
* js/doc-unit-tests-with-cmake:
cmake: handle also unit tests
cmake: use test names instead of full paths
cmake: fix typo in variable name
artifacts-tar: when including `.dll` files, don't forget the unit-tests
unit-tests: do show relative file paths
unit-tests: do not mistake `.pdb` files for being executable
cmake: also build unit tests
Junio C Hamano [Sun, 10 Dec 2023 00:37:46 +0000 (16:37 -0800)]
Merge branch 'ps/httpd-tests-on-nixos'
Portability tweak.
* ps/httpd-tests-on-nixos:
t9164: fix inability to find basename(1) in Subversion hooks
t/lib-httpd: stop using legacy crypt(3) for authentication
t/lib-httpd: dynamically detect httpd and modules path
Victoria Dye [Mon, 13 Nov 2023 23:17:51 +0000 (23:17 +0000)]
glossary: add definitions for dereference & peel
Add 'gitglossary' definitions for "dereference" (as it used for both symrefs
and objects) and "peel". These terms are used in options and documentation
throughout Git, but they are not clearly defined anywhere and the behavior
they refer to depends heavily on context. Provide explicit definitions to
clarify existing documentation to users and help contributors to use the
most appropriate terminology possible in their additions to Git.
Update other definitions in the glossary that use the term "dereference" to
link to 'def_dereference'.
Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a late amendment of 4a6e4b960263 (CI: remove Travis CI support,
2021-11-23), whereby the `.prove` file (being written by the `prove`
command that is used to run the test suite) is no longer retained
between CI builds: This feature was only ever used in the Travis CI
builds, we tried for a while to do the same in Azure Pipelines CI runs
(but I gave up on it after a while), and we never used that feature in
GitHub Actions (nor does the new GitLab CI code use it).
Retaining the Prove cache has been fragile from the start, even though
the idea seemed good at the time, the idea being that the `.prove` file
caches information about previous `prove` runs (`save`) and uses them
(`slow`) to run the tests in the order from longer-running to shorter
ones, making optimal use of the parallelism implied by `--jobs=<N>`.
However, using a Prove cache can cause some surprising behavior: When
the `prove` caches information about a test script it has run,
subsequent `prove` runs (with `--state=slow`) will run the same test
script again even if said script is not specified on the `prove`
command-line!
So far, this bug did not matter. Right until d8f416bbb87c (ci: run unit
tests in CI, 2023-11-09) did it not matter.
But starting with that commit, we invoke `prove` _twice_ in CI, once to
run the regular test suite of regression test scripts, and once to run
the unit tests. Due to the bug, the second invocation re-runs all of the
tests that were already run as part of the first invocation. This not
only wastes build minutes, it also frequently causes the `osx-*` jobs to
fail because they already take a long time and now are likely to run
into a timeout.
The worst part about it is that there is actually no benefit to keep
running with `--state=slow,save`, ever since we decided no longer to
try to reuse the Prove cache between CI runs.
So let's just drop that Prove option and live happily ever after.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t9164: fix inability to find basename(1) in Subversion hooks
Hooks executed by Subversion are spawned with an empty environment. By
default, not even variables like PATH will be propagated to them. In
order to ensure that we're still able to find required executables, we
thus write the current PATH variable into the hook script itself and
then re-export it in t9164.
This happens too late in the script though, as we already tried to
execute the basename(1) utility before exporting the PATH variable. This
tends to work on most platforms as the fallback value of PATH for Bash
(see `getconf PATH`) is likely to contain this binary. But on more
exotic platforms like NixOS this is not the case, and thus the test
fails.
While we could work around this issue by simply setting PATH earlier, it
feels fragile to inject a user-controlled value into the script and have
the shell interpret it. Instead, we can refactor the hook setup to write
a `hooks-env` file that configures PATH for us. Like this, Subversion
will know to set up the environment as expected for all hooks.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/lib-httpd: stop using legacy crypt(3) for authentication
When setting up httpd for our tests, we also install a passwd and
proxy-passwd file that contain the test user's credentials. These
credentials currently use crypt(3) as the password encryption schema.
This schema can be considered deprecated nowadays as it is not safe
anymore. Quoting Apache httpd's documentation [1]:
> Unix only. Uses the traditional Unix crypt(3) function with a
> randomly-generated 32-bit salt (only 12 bits used) and the first 8
> characters of the password. Insecure.
This is starting to cause issues in modern Linux distributions. glibc
has deprecated its libcrypt library that used to provide crypt(3) in
favor of the libxcrypt library. This newer replacement provides a
compile time switch to disable insecure password encryption schemata,
which causes crypt(3) to always return `EINVAL`. The end result is that
httpd tests that exercise authentication will fail on distros that use
libxcrypt without these insecure encryption schematas.
Regenerate the passwd files to instead use the default password
encryption schema, which is md5. While it feels kind of funny that an
MD5-based encryption schema should be more secure than anything else, it
is the current default and supported by all platforms. Furthermore, it
really doesn't matter all that much given that these files are only used
for testing purposes anyway.
t/lib-httpd: dynamically detect httpd and modules path
In order to set up the Apache httpd server, we need to locate both the
httpd binary and its default module path. This is done with a hardcoded
list of locations that we scan. While this works okayish with distros
that more-or-less follow the Filesystem Hierarchy Standard, it falls
apart on others like NixOS that don't.
While it is possible to specify these paths via `LIB_HTTPD_PATH` and
`LIB_HTTPD_MODULE_PATH`, it is not a nice experience for the developer
to figure out how to set those up. And in fact we can do better by
dynamically detecting both httpd and its module path at runtime:
- The httpd binary can be located via PATH.
- The module directory can (in many cases) be derived via the
`HTTPD_ROOT` compile-time variable.
Amend the code to do so.
Note that the new runtime-detected paths will only be used as a fallback
in case none of the hardcoded paths are usable. For the PATH lookup this
is because httpd is typically installed into "/usr/sbin", which is often
not included in the user's PATH variable. And the module path detection
relies on a configured httpd installation and may thus not work in all
cases, either.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unit tests should also be available e.g. in Visual Studio's Test
Explorer when configuring Git's source code via CMake.
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The primary purpose of Git's CMake definition is to allow developing Git
in Visual Studio. As part of that, the CTest feature allows running
individual test scripts conveniently in Visual Studio's Test Explorer.
However, this Test Explorer's design targets object-oriented languages
and therefore expects the test names in the form
`<namespace>.<class>.<testname>`. And since we specify the full path
of the test scripts instead, including the ugly `/.././t/` part, these
dots confuse the Test Explorer and it uses a large part of the path as
"namespace".
Let's just use `t.suite.<name>` instead. This presents the tests in
Visual Studio's Test Explorer in the following form by default (i.e.
unless the user changes the view via the "Group by" menu):
◢ ◈ git
◢ ◈ t
◢ ◈ suite
◈ t0000-basic
◈ t0001-init
◈ t0002-gitfile
[...]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
artifacts-tar: when including `.dll` files, don't forget the unit-tests
As of recent, Git also builds executables in `t/unit-tests/`. For
technical reasons, when building with CMake and Visual C, the
dependencies (".dll files") need to be copied there, too, otherwise
running the executable will fail "due to missing dependencies".
The CMake definition already contains the directives to copy those
`.dll` files, but we also need to adjust the `artifacts-tar` rule in
the `Makefile` accordingly to let the `vs-test` job in the CI runs
pass successfully.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Visual C interpolates `__FILE__` with the absolute _Windows_ path of
the source file. GCC interpolates it with the relative path, and the
tests even verify that.
So let's make sure that the unit tests only emit such paths.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
unit-tests: do not mistake `.pdb` files for being executable
When building the unit tests via CMake, the `.pdb` files are built.
Those are, essentially, files containing the debug information
separately from the executables.
Let's not confuse them with the executables we actually want to run.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Steadmon [Thu, 9 Nov 2023 18:50:44 +0000 (10:50 -0800)]
ci: run unit tests in CI
Run unit tests in both Cirrus and GitHub CI. For sharded CI instances
(currently just Windows on GitHub), run only on the first shard. This is
OK while we have only a single unit test executable, but we may wish to
distribute tests more evenly when we add new unit tests in the future.
We may also want to add more status output in our unit test framework,
so that we can do similar post-processing as in
ci/lib.sh:handle_failed_tests().
Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Thu, 9 Nov 2023 18:50:43 +0000 (10:50 -0800)]
unit tests: add TAP unit test framework
This patch contains an implementation for writing unit tests with TAP
output. Each test is a function that contains one or more checks. The
test is run with the TEST() macro and if any of the checks fail then the
test will fail. A complete program that tests STRBUF_INIT would look
like
int main(void)
{
TEST(t_static_init(), "static initialization works);
return test_done();
}
The output of this program would be
ok 1 - static initialization works
1..1
If any of the checks in a test fail then they print a diagnostic message
to aid debugging and the test will be reported as failing. For example a
failing integer check would look like
# check "x >= 3" failed at my-test.c:102
# left: 2
# right: 3
not ok 1 - x is greater than or equal to three
There are a number of check functions implemented so far. check() checks
a boolean condition, check_int(), check_uint() and check_char() take two
values to compare and a comparison operator. check_str() will check if
two strings are equal. Custom checks are simple to implement as shown in
the comments above test_assert() in test-lib.h.
Tests can be skipped with test_skip() which can be supplied with a
reason for skipping which it will print. Tests can print diagnostic
messages with test_msg(). Checks that are known to fail can be wrapped
in TEST_TODO().
There are a couple of example test programs included in this
patch. t-basic.c implements some self-tests and demonstrates the
diagnostic output for failing test. The output of this program is
checked by t0080-unit-test-output.sh. t-strbuf.c shows some example
unit tests for strbuf.c
The unit tests will be built as part of the default "make all" target,
to avoid bitrot. If you wish to build just the unit tests, you can run
"make build-unit-tests". To run the tests, you can use "make unit-tests"
or run the test binaries directly, as in "./t/unit-tests/bin/t-strbuf".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Steadmon [Thu, 9 Nov 2023 18:50:42 +0000 (10:50 -0800)]
unit tests: add a project plan document
In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Describe what we hope to accomplish by
implementing unit tests, and explain some open questions and milestones.
Discuss desired features for test frameworks/harnesses, and provide a
comparison of several different frameworks. Finally, document our
rationale for implementing a custom framework.
Co-authored-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:26:28 +0000 (02:26 -0500)]
commit-graph: mark chunk error messages for translation
The patches from f32af12cee (Merge branch 'jk/chunk-bounds', 2023-10-23)
added many new untranslated error messages. While it's unlikely for most
users to see these messages at all, most of the other commit-graph error
messages are translated (and likewise for the matching midx messages).
Let's mark them all for consistency (and to help any poor unfortunate
user who does manage to find a broken graph file).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:25:24 +0000 (02:25 -0500)]
commit-graph: drop verify_commit_graph_lite()
As we've moved all of the checks from this function directly into the
chunk-reading code used by the caller (and there is only one caller), we
can just drop it entirely.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:25:07 +0000 (02:25 -0500)]
commit-graph: check order while reading fanout chunk
We read the fanout chunk, storing a pointer to it, but only confirm that
the entries are monotonic in a final "lite" verification step. Let's
move that into the actual OIDF chunk callback, so that we can report
problems immediately (for all the reasons given in the previous
"commit-graph: abort as soon as we see a bogus chunk" commit).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:24:35 +0000 (02:24 -0500)]
commit-graph: use fanout value for graph size
Commit-graph, midx, and pack idx files all have both a lookup table of
oids and an oid fanout table. In midx and pack idx files, we take the
final entry of the fanout table as the source of truth for the number of
entries, and then verify that the size of the lookup table matches that.
But for commit-graph files, we do the opposite: we use the size of the
lookup table as the source of truth, and then check the final fanout
entry against it.
As noted in 4169d89645 (commit-graph: check consistency of fanout
table, 2023-10-09), either is correct. But there are a few reasons to
prefer the fanout table as the source of truth:
1. The fanout entries are 32-bits on disk, and that defines the
maximum number of entries we can store. But since the size of the
lookup table is only bounded by the filesystem, it can be much
larger. And hence computing it as the commit-graph does means that
we may truncate the result when storing it in a uint32_t.
2. We read the fanout first, then the lookup table. If we're verifying
the chunks as we read them, then we'd want to take the fanout as
truth (we have nothing yet to check it against) and then we can
check that the lookup table matches what we already know.
3. It is pointlessly inconsistent with the midx and pack idx code.
Since the three have to do similar size and bounds checks, it is
easier to reason about all three if they use the same approach.
So this patch moves the assignment of g->num_commits to the fanout
parser, and then we can check the size of the lookup chunk as soon as we
try to load it.
There's already a test covering this situation, which munges the final
fanout entry to 2^32-1. In the current code we complain that it does not
agree with the table size. But now that we treat the munged value as the
source of truth, we'll complain that the lookup table is the wrong size
(again, either is correct). So we'll have to update the message we
expect (and likewise for an earlier test which does similar munging).
There's a similar test for this situation on the midx side, but rather
than making a very-large fanout value, it just truncates the lookup
table. We could do that here, too, but the very-large fanout value
actually shows an interesting corner case. On a 32-bit system,
multiplying to find the expected table size would cause an integer
overflow. Using st_mult() would detect that, but cause us to die()
rather than falling back to the non-graph code path. Checking the size
using division (as we do with existing chunk-size checks) avoids the
overflow entirely, and the test demonstrates this when run on a 32-bit
system.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:17:11 +0000 (02:17 -0500)]
commit-graph: abort as soon as we see a bogus chunk
The code to read commit-graph files tries to read all of the required
chunks, but doesn't abort if we can't find one (or if it's corrupted).
It's only at the end of reading the file that we then do some sanity
checks for NULL entries. But it's preferable to detect the errors and
bail immediately, for a few reasons:
1. It's less error-prone. It's easy in the reader functions to flag an
error but still end up setting some struct fields (an error I in
fact made while working on this patch series).
2. It's safer. Since verifying some chunks depends on the values of
other chunks, we may be depending on not-yet-verified data. I don't
know offhand of any case where this can cause problems, but it's
one less subtle thing to worry about in the reader code.
3. It prevents the user from seeing nonsense errors. If we're missing
an OIDL chunk, then g->num_commits will be zero. And so we may
complain that the size of our CDAT chunk (which should have a
fixed-size record for each commit) is wrong unless it's also zero.
But that's misleading; the problem is the missing OIDL chunk; the
CDAT one might be fine!
So let's just check the return value from read_chunk(). This is exactly
how the midx chunk-reading code does it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a required commit-graph chunk cannot be loaded, we leave its entry
in the struct NULL, and then later complain that it is missing. But
that's just one reason we might not have loaded it, as we also do some
data quality checks.
Let's switch these messages to say "missing or corrupted", which is
exactly what the midx code says for the same cases. Likewise, we'll use
the same phrasing and capitalization as those for consistency. And while
we're here, we can mark them for translation (just like the midx ones).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:13:12 +0000 (02:13 -0500)]
commit-graph: drop redundant call to "lite" verification
The idea of verify_commit_graph_lite() is to have cheap verification
checks both for everyday use of the graph files (to avoid out of bounds
reads, etc) as well as for doing a full check via "commit-graph verify"
(which will also check the hash, etc).
But the expensive verification checks operate on a commit_graph struct,
which we get by using the normal everyday-reader code! So any problem
we'd find by calling it would have been found before we even got to the
verify_one_commit_graph() function.
Removing it simplifies the code a bit, but also frees us up to move the
"lite" verification steps around within that everyday-reader code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:12:07 +0000 (02:12 -0500)]
midx: check consistency of fanout table
The commit-graph, midx, and pack idx on-disk formats all have oid fanout
tables which are fed to bsearch_hash(). If these tables do not increase
monotonically, then the binary search may not only produce bogus values,
it may cause out of bounds reads.
We fixed this for commit graphs in 4169d89645 (commit-graph: check
consistency of fanout table, 2023-10-09). That commit argued that we did
not need to do the same for midx and pack idx files, because they
already did this check. However, that is wrong. We _do_ check the fanout
table for pack idx files when we load them, but we only do so for midx
files when running "git multi-pack-index verify". So it is possible to
get an out-of-bounds read by running a normal command with a specially
crafted midx file.
Let's fix this using the same solution (and roughly the same test) we
did for the commit-graph in 4169d89645. This replaces the same check
from "multi-pack-index verify", because verify uses the same read
routines, we'd bail on reading the midx much sooner now. So let's make
sure to copy its verbose error message.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Nov 2023 07:09:48 +0000 (02:09 -0500)]
commit-graph: handle overflow in chunk_size checks
We check the size of chunks with fixed records by multiplying the width
of each record by the number of commits in the file. Like:
if (chunk_size != g->num_commits * GRAPH_DATA_WIDTH)
If this multiplication overflows, we may not notice a chunk is too small
(which could later lead to out-of-bound reads).
In the current code this is only possible for the CDAT chunk, but the
reasons are quite subtle. We compute g->num_commits by dividing the size
of the OIDL chunk by the hash length (since it consists of a bunch of
hashes). So we know that any size_t multiplication that uses a value
smaller than the hash length cannot overflow. And the CDAT records are
the only ones that are larger (the others are just 4-byte records). So
it's worth fixing all of these, to make it clear that they're not
subject to overflow (without having to reason about seemingly unrelated
code).
The obvious thing to do is add an st_mult(), like:
if (chunk_size != st_mult(g->num_commits, GRAPH_DATA_WIDTH))
And that certainly works, but it has one downside: if we detect an
overflow, we'll immediately die(). But the commit graph is an optional
file; if we run into other problems loading it, we'll generally return
an error and fall back to accessing the full objects. Using st_mult()
means a malformed file will abort the whole process.
So instead, we can do a division like this:
if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
where there's no possibility of overflow. We do lose a little bit of
precision; due to integer division truncation we'd allow up to an extra
GRAPH_DATA_WIDTH-1 bytes of data in the chunk. That's OK. Our main goal
here is making sure we don't have too _few_ bytes, which would cause an
out-of-bounds read (we could actually replace our "!=" with "<", but I
think it's worth being a little pedantic, as a large mismatch could be a
sign of other problems).
I didn't add a test here. We'd need to generate a very large graph file
in order to get g->num_commits large enough to cause an overflow. And a
later patch in this series will use this same division technique in a
way that is much easier to trigger in the tests.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already support Azure Pipelines and GitHub Workflows in the Git
project, but until now we do not have support for GitLab CI. While it is
arguably not in the interest of the Git project to maintain a ton of
different CI platforms, GitLab has recently ramped up its efforts and
tries to contribute to the Git project more regularly.
Part of a problem we hit at GitLab rather frequently is that our own,
custom CI setup we have is so different to the setup that the Git
project has. More esoteric jobs like "linux-TEST-vars" that also set a
couple of environment variables do not exist in GitLab's custom CI
setup, and maintaining them to keep up with what Git does feels like
wasted time. The result is that we regularly send patch series upstream
that fail to compile or pass tests in GitHub Workflows. We would thus
like to integrate the GitLab CI configuration into the Git project to
help us send better patch series upstream and thus reduce overhead for
the maintainer. Results of these pipeline runs will be made available
(at least) in GitLab's mirror of the Git project at [1].
This commit introduces the integration into our regular CI scripts so
that most of the setup continues to be shared across all of the CI
solutions. Note that as the builds on GitLab CI run as unprivileged
user, we need to pull in both sudo and shadow packages to our Alpine
based job to set this up.
[1]: https://gitlab.com/gitlab-org/git
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The linux-musl CI job executes tests on Alpine Linux, which is based on
musl libc instead of glibc. We're missing some test dependencies though,
which causes us to skip a subset of tests.
Install these test dependencies to increase our test coverage on this
platform. There are still some missing test dependecies, but these do
not have a corresponding package in the Alpine repositories:
- p4 and p4d, both parts of the Perforce version control system.
- cvsps, which generates patch sets for CVS.
- Subversion and the SVN::Core Perl library, the latter of which is
not available in the Alpine repositories. While the tool itself is
available, all Subversion-related tests are skipped without the
SVN::Core Perl library anyway.
The Apache2-based tests require a bit more care though. For one, the
module path is different on Alpine Linux, which requires us to add it to
the list of known module paths to detect it. But second, the WebDAV
module on Alpine Linux is broken because it does not bundle the default
database backend [1]. We thus need to skip the WebDAV-based tests on
Alpine Linux for now.
ci: squelch warnings when testing with unusable Git repo
Our CI jobs that run on Docker also use mostly the same architecture to
build and test Git via the "ci/run-build-and-tests.sh" script. These
scripts also provide some functionality to massage the Git repository
we're supposedly operating in.
In our Docker-based infrastructure we may not even have a Git repository
available though, which leads to warnings when those functions execute.
Make the helpers exit gracefully in case either there is no Git in our
PATH, or when not running in a Git repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both GitHub Actions and Azure Pipelines set up the environment variables
GIT_TEST_OPTS, GIT_PROVE_OPTS and MAKEFLAGS. And while most values are
actually the same, the setup is completely duplicate. With the upcoming
support for GitLab CI this duplication would only extend even further.
Unify the setup of those environment variables so that only the uncommon
parts are separated. While at it, we also perform some additional small
improvements:
- We now always pass `--state=failed,slow,save` via GIT_PROVE_OPTS.
It doesn't hurt on platforms where we don't persist the state, so
this further reduces boilerplate.
- When running on Windows systems we set `--no-chain-lint` and
`--no-bin-wrappers`. Interestingly though, we did so _after_
already having exported the respective environment variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci: split out logic to set up failed test artifacts
We have some logic in place to create a directory with the output from
failed tests, which will then subsequently be uploaded as CI artifacts.
We're about to add support for GitLab CI, which will want to reuse the
logic.
Split the logic into a separate function so that it is reusable.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output of CI jobs tends to be quite long-winded and hard to digest.
To help with this, many CI systems provide the ability to group output
into collapsible sections, and we're also doing this in some of our
scripts.
One notable omission is the script to install Docker dependencies.
Address it to bring more structure to the output for Docker-based jobs.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the grouping setup more generic by always calling `begin_group ()`
and `end_group ()` regardless of whether we have stubbed those functions
or not. This ensures we can more readily add support for additional CI
platforms.
Furthermore, the `group ()` function is made generic so that it is the
same for both GitHub Actions and for other platforms. There is a
semantic conflict here though: GitHub Actions used to call `set +x` in
`group ()` whereas the non-GitHub case unconditionally uses `set -x`.
The latter would get overriden if we kept the `set +x` in the generic
version of `group ()`. To resolve this conflict, we simply drop the `set
+x` in the generic variant of this function. As `begin_group ()` calls
`set -x` anyway this is not much of a change though, as the only
commands that aren't printed anymore now are the ones between the
beginning of `group ()` and the end of `begin_group ()`.
Last, this commit changes `end_group ()` to also accept a parameter that
indicates _which_ group should end. This will be required by a later
commit that introduces support for GitLab CI.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We define a set of grouping functions that are used to group together
output in our CI, where these groups then end up as collapsible sections
in the respective pipeline platform. The way these functions are defined
is not easily extensible though as we have an up front check for the CI
_not_ being GitHub Actions, where we define the non-stub logic in the
else branch.
Reorder the conditional branches such that we explicitly handle GitHub
Actions.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 8 Nov 2023 06:04:41 +0000 (15:04 +0900)]
Merge branch 'ps/leakfixes'
Leakfix.
* ps/leakfixes:
setup: fix leaking repository format
setup: refactor `upgrade_repository_format()` to have common exit
shallow: fix memory leak when registering shallow roots
test-bloom: stop setting up Git directory twice
Junio C Hamano [Wed, 8 Nov 2023 02:04:01 +0000 (11:04 +0900)]
Merge branch 'kn/rev-list-missing-fix'
"git rev-list --missing" did not work for missing commit objects,
which has been corrected.
* kn/rev-list-missing-fix:
rev-list: add commit object support in `--missing` option
rev-list: move `show_commit()` to the bottom
revision: rename bit to `do_not_die_on_missing_objects`
Junio C Hamano [Wed, 8 Nov 2023 02:03:59 +0000 (11:03 +0900)]
Merge branch 'ps/show-ref'
Teach "git show-ref" a mode to check the existence of a ref.
* ps/show-ref:
t: use git-show-ref(1) to check for ref existence
builtin/show-ref: add new mode to check for reference existence
builtin/show-ref: explicitly spell out different modes in synopsis
builtin/show-ref: ensure mutual exclusiveness of subcommands
builtin/show-ref: refactor options for patterns subcommand
builtin/show-ref: stop using global vars for `show_one()`
builtin/show-ref: stop using global variable to count matches
builtin/show-ref: refactor `--exclude-existing` options
builtin/show-ref: fix dead code when passing patterns
builtin/show-ref: fix leaking string buffer
builtin/show-ref: split up different subcommands
builtin/show-ref: convert pattern to a local variable
The codepath to traverse the commit-graph learned to notice that a
commit is missing (e.g., corrupt repository lost an object), even
though it knows something about the commit (like its parents) from
what is in commit-graph.
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
Taylor Blau [Mon, 6 Nov 2023 22:56:33 +0000 (17:56 -0500)]
pack-bitmap: drop --unpacked non-commit objects from results
When performing revision queries with `--objects` and
`--use-bitmap-index`, the output may incorrectly contain objects which
are packed, even when the `--unpacked` option is given. This affects
traversals, but also other querying operations, like `--count`,
`--disk-usage`, etc.
Like in the previous commit, the fix is to exclude those objects from
the result set before they are shown to the user (or, in this case,
before the bitmap containing the result of the traversal is enumerated
and its objects listed).
This is performed by a new function in pack-bitmap.c, called
`filter_packed_objects_from_bitmap()`. Note that we do not have to
inspect individual bits in the result bitmap, since we know that the
first N (where N is the number of objects in the bitmap's pack/MIDX)
bits correspond to objects which packed by definition.
In other words, for an object to have a bitmap position (not in the
extended index), it must appear in either the bitmap's pack or one of
the packs in its MIDX.
This presents an appealing optimization to us, which is that we can
simply memset() the corresponding number of `eword_t`'s to zero,
provided that we handle any objects which spill into the next word (but
don't occupy all 64 bits of the word itself).
We only have to handle objects in the bitmap's extended index. These
objects may (or may not) appear in one or more pack(s). Since these
objects are known to not appear in either the bitmap's MIDX or pack,
they may be stored as loose, appear in other pack(s), or both.
Before returning a bitmap containing the result of the traversal back to
the caller, drop any bits from the extended index which appear in one or
more packs. This implements the correct behavior for rev-list operations
which use the bitmap index to compute their result.
Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 6 Nov 2023 22:56:30 +0000 (17:56 -0500)]
list-objects: drop --unpacked non-commit objects from results
In git-rev-list(1), we describe the `--unpacked` option as:
Only useful with `--objects`; print the object IDs that are not in
packs.
This is true of commits, which we discard via get_commit_action(), but
not of the objects they reach. So if we ask for an --objects traversal
with --unpacked, we may get arbitrarily many objects which are indeed
packed.
I am nearly certain this behavior dates back to the introduction of
`--unpacked` via 12d2a18780 ("git rev-list --unpacked" shows only
unpacked commits, 2005-07-03), but I couldn't get that revision of Git
to compile for me. At least as early as v2.0.0 this has been subtly
broken:
There, only the first, third, and sixth entries are loose, with the
remaining set of objects belonging to at least one pack.
The implications for this are relatively benign: bare 'git repack'
invocations which invoke pack-objects with --unpacked are impacted, and
at worst we'll store a few extra objects that should have been excluded.
Arguably changing this behavior is a backwards-incompatible change,
since it alters the set of objects emitted from rev-list queries with
`--objects` and `--unpacked`. But I argue that this change is still
sensible, since the existing implementation deviates from
clearly-written documentation.
The fix here is straightforward: avoid showing any non-commit objects
which are contained in packs by discarding them within list-objects.c,
before they are shown to the user. Note that similar treatment for
`list-objects.c::show_commit()` is not needed, since that case is
already handled by `revision.c::get_commit_action()`.
Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Todd Zullinger [Fri, 3 Nov 2023 14:17:51 +0000 (10:17 -0400)]
RelNotes: improve wording of credential helper notes
Offer a slightly more verbose description of the issue fixed by 7144dee3ec (credential/libsecret: erase matching creds only, 2023-07-26)
and cb626f8e5c (credential/wincred: erase matching creds only,
2023-07-26).
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 7 Nov 2023 01:26:44 +0000 (10:26 +0900)]
Merge branch 'ms/send-email-validate-fix'
"git send-email" did not have certain pieces of data computed yet
when it tried to validate the outging messages and its recipient
addresses, which has been sorted out.
Junio C Hamano [Tue, 7 Nov 2023 01:26:43 +0000 (10:26 +0900)]
Merge branch 'jc/grep-f-relative-to-cwd'
"cd sub && git grep -f patterns" tried to read "patterns" file at
the top level of the working tree; it has been corrected to read
"sub/patterns" instead.
* jc/grep-f-relative-to-cwd:
grep: -f <path> is relative to $cwd
While populating the `repository_format` structure may cause us to
allocate memory, we do not call `clear_repository_format()` in some
places and thus potentially leak memory. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup: refactor `upgrade_repository_format()` to have common exit
The `upgrade_repository_format()` function has multiple exit paths,
which means that there is no common cleanup of acquired resources.
While this isn't much of a problem right now, we're about to fix a
memory leak that would require us to free the resource in every one of
those exit paths.
Refactor the code to have a common exit path so that the subsequent
memory leak fix becomes easier to implement.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shallow: fix memory leak when registering shallow roots
When registering shallow roots, we unset the list of parents of the
to-be-registered commit if it's already been parsed. This causes us to
leak memory though because we never free this list. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're setting up the Git directory twice in the `test-tool bloom`
helper, once at the beginning of `cmd_bloom()` and once in the local
subcommand implementation `get_bloom_filter_for_commit()`. This can lead
to memory leaks as we'll overwrite variables of `the_repository` with
newly allocated data structures. On top of that it's simply unnecessary.
Fix this by only setting up the Git directory once.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joanna Wang [Fri, 3 Nov 2023 16:34:48 +0000 (16:34 +0000)]
attr: enable attr pathspec magic for git-add and git-stash
Allow users to limit or exclude files based on file attributes
during git-add and git-stash.
For example, the chromium project would like to use
$ git add . ':(exclude,attr:submodule)'
as submodules are managed by an external tool, forbidding end users
to record changes with "git add". Allowing "git add" to often
records changes that users do not want in their commits.
This commit does not change any attr magic implementation. It is
only adding attr as an allowed pathspec in git-add and git-stash,
which was previously blocked by GUARD_PATHSPEC and a pathspec mask
in parse_pathspec()).
However, we fix a bug in prefix_magic() where attr values were
unintentionally removed. This was triggerable when parse_pathspec()
is called with PATHSPEC_PREFIX_ORIGIN as a flag, which was the case
for git-stash (Bug originally filed here [*])
Furthermore, while other commands hit this code path it did not
result in unexpected behavior because this bug only impacts the
pathspec->items->original field which is NOT used to filter
paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands. (See add_pathspecs()
usage and implementation in stash.c)
It is possible that when the attr pathspec feature was first added
in b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations.
Later, to get a more user-friendly error message when attr was used
with git-add, PATHSPEC_ATTR was added as a mask to git-add's
invocation of parse_pathspec() 84d938b732 (add: do not accept
pathspec magic 'attr', 2018-09-18). However, this user-friendly
error message was never added for git-stash.
Jeff King [Fri, 3 Nov 2023 16:20:19 +0000 (12:20 -0400)]
t: avoid perl's pack/unpack "Q" specifier
The perl script introduced by 86b008ee61 (t: add library for munging
chunk-format files, 2023-10-09) uses pack("Q") and unpack("Q") to read
and write 64-bit values ("quadwords" in perl parlance) from the on-disk
chunk files. However, some builds of perl may not support 64-bit
integers at all, and throw an exception here. While some 32-bit
platforms may still support 64-bit integers in perl (such as our linux32
CI environment), others reportedly don't (the NonStop 32-bit builds).
We can work around this by treating the 64-bit values as two 32-bit
values. We can't ever combine them into a single 64-bit value, but in
practice this is OK. These are representing file offsets, and our files
are much smaller than 4GB. So the upper half of the 64-bit value will
always be 0.
We can just introduce a few helper functions which perform the
translation and double-check our assumptions.
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>