Junio C Hamano [Tue, 28 Jan 2025 21:02:22 +0000 (13:02 -0800)]
Merge branch 'sk/unit-tests'
Move a few more unit tests to the clar test framework.
* sk/unit-tests:
t/unit-tests: convert reftable tree test to use clar test framework
t/unit-tests: adapt priority queue test to use clar test framework
t/unit-tests: convert mem-pool test to use clar test framework
t/unit-tests: handle dashes in test suite filenames
Junio C Hamano [Tue, 28 Jan 2025 21:02:22 +0000 (13:02 -0800)]
Merge branch 'jc/show-usage-help'
The help text from "git $cmd -h" appear on the standard output for
some $cmd and the standard error for others. The built-in commands
have been fixed to show them on the standard output consistently.
* jc/show-usage-help:
builtin: send usage() help text to standard output
oddballs: send usage() help text to standard output
builtins: send usage_with_options() help text to standard output
usage: add show_usage_if_asked()
parse-options: add show_usage_with_options_if_asked()
t0012: optionally check that "-h" output goes to stdout
The "instaweb" bound only to local IP address without "--local" and
to all addresses with "--local", which was the other way around, when
using Python's http.server class, which has been corrected.
* ak/instaweb-python-port-binding-fix:
instaweb: fix ip binding for the python http.server
Extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") is involved.
* en/object-name-with-funny-refname-fix:
object-name: be more strict in parsing describe-like output
object-name: fix resolution of object names containing curly braces
Junio C Hamano [Tue, 21 Jan 2025 16:44:54 +0000 (08:44 -0800)]
Merge branch 'ps/the-repository'
More code paths have a repository passed through the callchain,
instead of assuming the primary the_repository object.
* ps/the-repository:
match-trees: stop using `the_repository`
graph: stop using `the_repository`
add-interactive: stop using `the_repository`
tmp-objdir: stop using `the_repository`
resolve-undo: stop using `the_repository`
credential: stop using `the_repository`
mailinfo: stop using `the_repository`
diagnose: stop using `the_repository`
server-info: stop using `the_repository`
send-pack: stop using `the_repository`
serve: stop using `the_repository`
trace: stop using `the_repository`
pager: stop using `the_repository`
progress: stop using `the_repository`
Junio C Hamano [Tue, 21 Jan 2025 16:44:53 +0000 (08:44 -0800)]
Merge branch 'ps/reftable-get-random-fix'
The code to compute "unique" name used git_rand() which can fail or
get stuck; the callsite does not require cryptographic security.
Introduce the "insecure" mode and use it appropriately.
* ps/reftable-get-random-fix:
reftable/stack: accept insecure random bytes
wrapper: allow generating insecure random bytes
Junio C Hamano [Tue, 21 Jan 2025 16:44:52 +0000 (08:44 -0800)]
Merge branch 'jk/lsan-race-ignore-false-positive'
The code to check LSan results has been simplified and made more
robust.
* jk/lsan-race-ignore-false-positive:
test-lib: add a few comments to LSan log checking
test-lib: simplify lsan results check
test-lib: invert return value of check_test_results_san_file_empty
Seyi Kuforiji [Fri, 17 Jan 2025 12:29:26 +0000 (13:29 +0100)]
t/unit-tests: convert reftable tree test to use clar test framework
Adapts reftable tree test script to clar framework by using clar
assertions where necessary.
Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kuforiji [Fri, 17 Jan 2025 12:29:25 +0000 (13:29 +0100)]
t/unit-tests: adapt priority queue test to use clar test framework
Convert the prio-queue test script to clar framework by using clar
assertions where necessary. Test functions are created as a standalone
to test different cases.
update the type of the variable `j` from int to `size_t`, this ensures
compatibility with the type used for result_size, which is also size_t,
preventing a potential warning or error caused by comparisons between
signed and unsigned integers.
Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kuforiji [Fri, 17 Jan 2025 12:29:24 +0000 (13:29 +0100)]
t/unit-tests: convert mem-pool test to use clar test framework
Adapt the mem-pool test script to use clar framework by using clar
assertions where necessary.Test functions are created as a standalone to
test different test cases.
Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kuforiji [Fri, 17 Jan 2025 12:29:23 +0000 (13:29 +0100)]
t/unit-tests: handle dashes in test suite filenames
"generate-clar-decls.sh" script is designed to extract function
signatures that match a specific pattern derived from the unit test
file's name. The script does not know to massage file names with dashes,
which will make it search for functions that look like, for example,
`test_mem-pool_*`. Having dashes in function names is not allowed
though, so these patterns won't ever match a legal function name.
Adapt script to translate dashes (`-`) in test suite filenames to
underscores (`_`) to correctly extract the function signatures and run
the corresponding tests. This will be used by subsequent commits which
follows the same construct.
Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 16 Jan 2025 21:35:53 +0000 (13:35 -0800)]
builtin: send usage() help text to standard output
Using the show_usage_and_exit_if_asked() helper we introduced
earlier, fix callers of usage() that want to show the help text when
explicitly asked by the end-user. The help text now goes to the
standard output stream for them.
These are the bog standard "if we got only '-h', then that is a
request for help" callers. Their
if (argc == 2 && !strcmp(argv[1], "-h"))
usage(message);
With this, the built-ins tested by t0012 all send their help text to
their standard output stream, so the check in t0012 that was half
tightened earlier is now fully tightened to insist on standard error
stream being empty.
Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 16 Jan 2025 21:35:52 +0000 (13:35 -0800)]
oddballs: send usage() help text to standard output
Using the show_usage_if_asked() helper we introduced earlier, fix
callers of usage() that want to show the help text when explicitly
asked by the end-user. The help text now goes to the standard
output stream for them.
The callers in this step are oddballs in that their invocations of
usage() are *not* guarded by
if (argc == 2 && !strcmp(argv[1], "-h")
usage(...);
There are (unnecessarily) being clever ones that do things like
if (argc != 2 || !strcmp(argv[1], "-h")
usage(...);
to say "I know I take only one argument, so argc != 2 is always an
error regardless of what is in argv[]. Ah, by the way, even if argc
is 2, "-h" is a request for usage text, so we do the same".
Some like "git var -h" just do not treat "-h" any specially, and let
it take the same error code paths as a parameter error.
Now we cannot do the same, so these callers are rewrittin to do the
show_usage_and_exit_if_asked() first and then handle the usage error
the way they used to.
Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 16 Jan 2025 21:35:51 +0000 (13:35 -0800)]
builtins: send usage_with_options() help text to standard output
Using the show_usage_with_options_if_asked() helper we introduced
earlier, fix callers of usage_with_options() that want to show the
help text when explicitly asked by the end-user. The help text now
goes to the standard output stream for them.
The test in t7600 for "git merge -h" may want to be retired, as the
same is covered by t0012 already, but it is specifically testing that
the "-h" option gets a response even with a corrupt index file, so
for now let's leave it there.
Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 16 Jan 2025 21:35:50 +0000 (13:35 -0800)]
usage: add show_usage_if_asked()
Some commands call usage() when they are asked to give the help
message with "git cmd -h", but this has the same problem as we
fixed with callers of usage_with_options() for the same purpose.
Introduce a helper function that captures the common pattern
if (argc == 2 && !strcmp(argv[1], "-h"))
usage(usage);
and replaces it with
show_usage_if_asked(argc, argv, usage);
to help correct these code paths.
Note that this helper function still exits with status 129, and
t0012 insists on it. After converting all the mistaken callers of
usage_with_options() to call this new helper, we may want to address
it---the end user is asking us to give the help text, and we are
doing exactly as asked, so there is no reason to exit with non-zero
status.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many commands call usage_with_options() when they are asked to give
the help message, but it sends the help text to the standard error
stream. When the user asked for it with "git cmd -h", the help
message is the primary output from the command, hence we should send
it to the standard output stream, instead.
Introduce a helper function that captures the common pattern
if (argc == 2 && !strcmp(argv[1], "-h"))
usage_with_options(usage, options);
Note that this helper function still exits with status 129, and
t0012 insists on it. After converting all the mistaken callers of
usage_with_options() to call this new helper, we may want to address
it---the end user is asking us to give the help text, and we are
doing exactly as asked, so there is no reason to exit with non-zero
status.
Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 16 Jan 2025 21:35:48 +0000 (13:35 -0800)]
t0012: optionally check that "-h" output goes to stdout
For most commands, "git foo -h" will send the help output to stdout, as
this is what parse-options.c does. But some commands send it to stderr
instead. This is usually because they call usage_with_options(), and
should be switched to show_usage_help_and_exit_if_asked().
Currently t0012 is permissive and allows either behavior. We'd like it
to eventually enforce that help goes to stdout, and teaching it to do so
identifies the commands that need to be changed. But during the
transition period, we don't want to enforce that for most test runs.
So let's introduce a flag that will let most test runs use the
permissive behavior, and people interested in converting commands can
run:
GIT_TEST_HELP_MUST_BE_STDOUT=1 ./t0012-help.sh
to see the failures. Eventually (when all builtins have been converted)
we'll remove this flag entirely and always check the strict behavior.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 17 Jan 2025 02:05:13 +0000 (18:05 -0800)]
gitcli: document that command line trumps config and env
We centrally explain that "--no-whatever" is the way to countermand
the "--whatever" option. Explain that a configured default and the
value specified by an environment variable can be overridden by the
corresponding command line option, too.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 17 Jan 2025 00:35:14 +0000 (16:35 -0800)]
Merge branch 'ps/meson-weak-sha1-build'
meson-based build now supports the unsafe-sha1 build knob.
* ps/meson-weak-sha1-build:
meson: provide a summary of configured backends
meson: wire up unsafe SHA1 backend
meson: add missing dots for build options
meson: simplify conditions for HTTPS and SHA1 dependencies
meson: require SecurityFramework when it's used as SHA1 backend
meson: deduplicate access to SHA1/SHA256 backend options
meson: consistenlty spell 'CommonCrypto'
Junio C Hamano [Fri, 17 Jan 2025 00:35:14 +0000 (16:35 -0800)]
Merge branch 'ps/more-sign-compare'
More -Wsign-compare fixes.
* ps/more-sign-compare:
sign-compare: avoid comparing ptrdiff with an int/unsigned
commit-reach: use `size_t` to track indices when computing merge bases
shallow: fix -Wsign-compare warnings
builtin/log: fix remaining -Wsign-compare warnings
builtin/log: use `size_t` to track indices
commit-reach: use `size_t` to track indices in `get_reachable_subset()`
commit-reach: use `size_t` to track indices in `remove_redundant()`
commit-reach: fix type of `min_commit_date`
commit-reach: fix index used to loop through unsigned integer
prio-queue: fix type of `insertion_ctr`
Junio C Hamano [Fri, 17 Jan 2025 00:35:13 +0000 (16:35 -0800)]
Merge branch 'as/long-option-help-i18n'
Tweak the help text used for the option value placeholders by
parse-options API so that translations can customize the "<>"
placeholder signal (e.g. "--option=<value>").
* as/long-option-help-i18n:
parse-options: localize mark-up of placeholder text in the short help
Junio C Hamano [Fri, 17 Jan 2025 00:35:13 +0000 (16:35 -0800)]
Merge branch 're/submodule-parse-opt'
"git submodule" learned various ways to spell the same option,
e.g. "--branch=B" can be spelled "--branch B" or "-bB".
* re/submodule-parse-opt:
git-submodule.sh: rename some variables
git-submodule.sh: improve variables readability
git-submodule.sh: add some comments
git-submodule.sh: get rid of unused variable
git-submodule.sh: get rid of isnumber
git-submodule.sh: improve parsing of short options
git-submodule.sh: improve parsing of some long options
Sam James [Tue, 14 Jan 2025 14:47:10 +0000 (14:47 +0000)]
meson: fix missing deps for technical articles
We need an explicit `depends: documentation_deps` so that all of our
Documentation targets know they require asciidoc.conf. This shows up
as parallel build failures with it not yet being available.
Other targets look OK already.
Signed-off-by: Sam James <sam@gentoo.org> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Tue, 14 Jan 2025 11:15:23 +0000 (12:15 +0100)]
meson: ensure correct version-def.h is used
To build the libgit-version library, Meson first generates
`version-def.h` in the build directory. Then it compiles `version.c`
into a library. During compilation, Meson tells to include both the
build directory and the project root directory.
However, when the user previously has compiled Git using Make, they will
have a `version-def.h` file in project root directory as well. Because
`version-def.h` is included in `version.c` using the #include directive
with double quotes, some preprocessors will look for the header file in
the same directory as the source file. This will cause compilation of
`version.c` ran by Meson to include `version-def.h` previously made by
Make, which might be out of date.
To explicitly tell the preprocessor which `version-def.h` to use, pass
the absolute path of this file as macro GIT_VERSION_H to the
preprocessor using option `-D` and have `version.c` `#include
GIT_VERSION_H`. To remain working with other build systems than Meson,
include "version-def.h" if that macro is not defined.
Co-authored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 13 Jan 2025 20:55:26 +0000 (12:55 -0800)]
Sync with Git 2.47.2
Git 2.47.2
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmdkT1sACgkQsLXohpav
# 5svdhRAAq0WoZIg+33vYNNVSTm3Ux9RJslmXs3lQuhuUJ61hK/28drSLU29GH7x7
# 3nmmjp1cegnXRVLBAfoYDdzPprNNrQFQEHQEzgG/GDZw0OXn+WTZuNyrrUYoa+sd
# QSLlElRj2qrpHIMOsMIBKBSNB+qjJHOMGdxcBAS768TfnQpGIpc1KJa24TxsVBzC
# ScP4uvrFfPyQrqFUgiUhCeqLnO/6T5i/QAn/8cS5a1+zor5ZHSlw28TZTOxN2odo
# Rulp/FtehiDEzmRowgD3M4fImAPY6Ib6VORCYASqpJFFla30tu2bQqEi6raOMTec
# hg5Ibkmj6fHFONaYvoTMRkYHmtUnNgIPU/CYPwswNk8w1+PPQfJ+TYjBXOQgdTLW
# F0azHBHh7NRmEHVydiF9CqjgNVRzjO4IEZfGqXNFPPMvR6UUzDaIkrpYbwXBFMin
# GNPV3QISeXj9ROjJoCv0nclXETwWemykjZlD6b5krXn5TaJlFb+69qJvXrCLq5WY
# EoevSqKkB9HVK9si7P8Sh1cPGOr3kfiFPmMNKFVI8l0+iDFgBywOomWNS/JEzqu1
# nN142DKdL1W/rkeMUhbX2h11CZNvHKIOy3iaA4MTOing8/eMzyUUQ73Ck7odYs4f
# rZ0tTXKJhxojPvBpTxYe9SxM0bDLREiOv0zX76+sIuhbAQCmk0o=
# =MNNf
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 08:52:43 AM PST
# gpg: using RSA key E1F036B1FEE7221FC778ECEFB0B5E88696AFE6CB
# gpg: Good signature from "Junio C Hamano <gitster@pobox.com>" [ultimate]
# gpg: aka "Junio C Hamano <junio@pobox.com>" [ultimate]
# gpg: aka "Junio C Hamano <jch@google.com>" [ultimate]
* tag 'v2.47.2':
Git 2.47.2
Git 2.46.3
Git 2.45.3
Git 2.44.3
Git 2.43.6
Git 2.42.4
Git 2.41.3
Git 2.40.4
credential: disallow Carriage Returns in the protocol by default
credential: sanitize the user prompt
credential_format(): also encode <host>[:<port>]
t7300: work around platform-specific behaviour with long paths on MinGW
compat/regex: fix argument order to calloc(3)
mingw: drop bogus (and unneeded) declaration of `_pgmptr`
ci: remove 'Upload failed tests' directories' step from linux32 jobs
Elijah Newren [Mon, 13 Jan 2025 17:13:37 +0000 (17:13 +0000)]
object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt:
'<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
Output from `git describe`; i.e. a closest tag, optionally
followed by a dash and a number of commits, followed by a dash, a
'g', and an abbreviated object name.
which means that output of the format
${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}. This is fine. However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid. That is problematic, since it
breaks things like
which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.
A few additional notes:
- This is a slight backward incompatibility break, because we used
to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However,
a backward incompatible break is necessary, because there is no
other way for someone to be more specific and disambiguate that they
want the blob master:path/to/who-gabbed instead of the object abbed.
- There is a possibility that check_refname_format() rules change in
the future. However, we can only realistically loosen the rules
for what that function accepts rather than tighten. If we were to
tighten the rules, some real world repositories may already have
refnames that suddenly become unacceptable and we break those
repositories. As such, any describe-like syntax of the form
${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
changes in this commit will remain valid in the future.
- The fact that check_refname_format() rules could loosen in the
future is probably also an important reason to make this change. If
the rules loosen, there might be additional cases within
${GARBAGE}-g${HASH} that become ambiguous in the future. While
abbreviated hashes can be disambiguated by abbreviating less, it may
well be that these alternative object names have no way of being
disambiguated (much like pathnames cannot be). Accepting all random
${GARBAGE} thus makes it difficult for us to allow future
extensions to object naming.
So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.
Also, add a few tests for git describe using object names of the form
${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
git describe v2.48.0-rc2-161-g6c2274cdbc^0
Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Mon, 13 Jan 2025 17:13:36 +0000 (17:13 +0000)]
object-name: fix resolution of object names containing curly braces
Given a branch name of 'foo{bar', commands like
git cat-file -p foo{bar:README.md
should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2caef9 (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:
fatal: Not a valid object name foo{bar:README.md
Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.
Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
* 'foo@@{...}'
* 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'
Note that we'd prefer not duplicating the special logic for "@^" characters
here, because if get_oid_basic() or interpret_nth_prior_checkout() or
get_oid_basic() or similar gain extra methods of using curly braces,
then the logic in get_oid_with_context_1() would need to be updated as
well. But it's not clear how to refactor all of these to have a simple
common callpoint with the specialized logic.
Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Helped-by: Michael Haggerty <mhagger@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
M Hickford [Fri, 10 Jan 2025 22:54:37 +0000 (22:54 +0000)]
docs: discuss caching personal access tokens
Describe problems storing personal access tokens in git-credential-cache
and suggest alternatives.
Research suggests that many users are confused about this:
> the point of passwords is that (ideally) you memorise them [so]
> they're never stored anywhere in plain text. Yet GitHub's personal
> access token system seems to basically force you to store the token in
> plain text?
https://stackoverflow.com/questions/46645843/where-to-store-my-git-personal-access-token#comment89963004_46645843 Signed-off-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
M Hickford [Fri, 10 Jan 2025 22:54:36 +0000 (22:54 +0000)]
docs: list popular credential helpers
git-credential-store saves credentials unencrypted on disk. It is the
least secure choice of credential helper. Nevertheless, it appears
several times more popular than any other credential helper [1].
Junio C Hamano [Fri, 10 Jan 2025 17:19:33 +0000 (09:19 -0800)]
Merge branch 'ps/build-sign-compare'
Last-minute fix for a regression in "git blame --abbrev=<length>"
when insane <length> is specified; we used to correctly cap it to
the hash output length but broke it during the cycle.
* ps/build-sign-compare:
builtin/blame: fix out-of-bounds write with blank boundary commits
builtin/blame: fix out-of-bounds read with excessive `--abbrev`
"Why would one want to run it in parallel?" I hear you ask. I am glad
you are curious, because a curious story is what it is, indeed.
The `GIT-VERSION-GEN` script is quite a pillar of Git's source code,
with most lines being unchanged for the past 15 years. Until the v2.48.0
release candidate cycle.
Its original purpose was to generate the version string and store it in
the `GIT-VERSION-FILE`.
This paradigm changed quite dramatically when support for building with
Meson was introduced. Most crucially, a38edab7c88b (Makefile: generate
doc versions via GIT-VERSION-GEN, 2024-12-06) changed the way the
documentation is built by using the `GIT-VERSION-GEN` file to write out
the `asciidocor-extensions.rb` and `asciidoc.conf` files with now
hard-coded version strings.
Crucially, the Makefile rule to generate those files needs to be run in
every build because `GIT_VERSION` could have been specified in the
`make` command-line, which would require these files to be modified.
This introduced a surprising race condition!
And this is how that race surfaces: When calling `make -j2 html man`
from the top-level directory (a variant of which is invoked in Git for
Windows' release process), two sub-processes are spawned, a `make -C
Documentation html` one and a `make -C Documentation man` one. Both run
the rule to (re-)generate `asciidoctor-extensions.rb` or
`asciidoc.conf`, invoking `GIT-VERSION-GEN` to do so. That script first
generates a temporary file (appending the `+` character to the
filename), then looks whether it contains something different than the
already existing file (if it exists, that is), and either replaces it if
needed, or removes the temporary file. If one of the two parallel
invocations removes that temporary file before the other can compare it,
or even worse: if one tries to replace the target file just after the
other _started_ writing the temporary file (but did not finish writing
it yet), that race condition now causes bad builds.
This may sound highly theoretical, but due to the design of Git's build
process, Git for Windows is forced to use a (slow) POSIX emulation layer
to run that script and in the blink of an eye it becomes very much not
theoretical at all. See Exhibit A: These GitHub workflow runs failed
because one of the two competing `make` processes tried to remove the
temporary file when the other process had already done so:
While it is undesirable to run this script over and over again,
certainly when this involves above-mentioned slow POSIX emulation layer,
the stage of the release cycle in which we are presently finding
ourselves does not lend itself to a re-design where this script could be
run once, and once only, but instead dictates that a quick and reliable
work-around be implemented that prevents the race condition without
changing the overall architecture of the build process.
This patch does that: By using a filename suffix for the temporary file
which is based on the currently-executing script's process ID, We
guarantee that the two competing invocations cannot overwrite or remove
each others' temporary files.
The filename suffix still ends in `+` to ensure that the temporary
artifacts are matched by the `*+` pattern in `.gitignore` that was added
in f9bbaa384ef (Add intermediate build products to .gitignore,
2009-11-08).
Helped-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/blame: fix out-of-bounds write with blank boundary commits
When passing the `-b` flag to git-blame(1), then any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.
The flag can lead to an out-of-bounds write as though when combined with
`--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ`
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.
An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes.
But when the underlying object ID is SHA1, and if the abbreviated length
exceeds the SHA1 length, it would cause us to print more bytes than
desired, and the result would be misaligned.
Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function it would be
handled by the call to printf(3p) correctly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/blame: fix out-of-bounds read with excessive `--abbrev`
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adam Johnson [Wed, 8 Jan 2025 23:35:23 +0000 (23:35 +0000)]
difftool docs: restore correct position of tool list
2a9dfdf260 (difftool docs: de-duplicate configuration sections, 2022-09-07)
moved the difftool documentation, but missed moving this "include" line that
includes the generated list of diff tools, as referenced in the moved text.
Restore the correct position of the included list.
Signed-off-by: Adam Johnson <me@adamj.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seyi Kuforiji [Thu, 9 Jan 2025 14:09:52 +0000 (15:09 +0100)]
t/unit-tests: convert hash to use clar test framework
Adapt the hash test functions to clar framework by using clar
assertions where necessary. Following the consensus to convert
the unit-tests scripts found in the t/unit-tests folder to clar driven by
Patrick Steinhardt. Test functions are structured as a standalone to
test individual hash string and literal case.
Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-basics: allow for `malloc` to be `#define`d
As indicated by the `#undef malloc` line in `reftable/basics.h`, it is
quite common to use allocators other than the default one by defining
`malloc` constants and friends.
This pattern is used e.g. in Git for Windows, which uses the powerful
and performant `mimalloc` allocator.
Furthermore, in `reftable/basics.c` this `#undef malloc` is
_specifically_ disabled by virtue of defining the
`REFTABLE_ALLOW_BANNED_ALLOCATORS` constant before including
`reftable/basic.h`, to ensure that such a custom allocator is also used
in the reftable code.
However, in 8db127d43f5b (reftable: avoid leaks on realloc error,
2024-12-28) and in 2cca185e8517 (reftable: fix allocation count on
realloc error, 2024-12-28), `reftable_set_alloc()` function calls were
introduced that pass `malloc`, `realloc` and `free` function pointers as
parameters _after_ `reftable/basics.h` ensured that they were no longer
`#define`d. This would override the custom allocator and re-set it to
the default allocator provided by, say, libc or MSVCRT.
This causes problems because those calls happen after the initial
allocator has already been used to initialize an array, which is
subsequently resized using the overridden default `realloc()` allocator.
You cannot mix and match allocators like that, which leads to a
`STATUS_HEAP_CORRUPTION` (C0000374) on Windows, and when running this
unit test through shell and/or `prove` (which only support 7-bit status
codes), it surfaces as exit code 127.
It is actually unnecessary to use those function pointers to
`malloc`/`realloc`/`free`, though: The `reftable` code goes out of its
way to fall back to the initial allocator when passing `NULL` parameters
instead. So let's do that instead of causing heap corruptions.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sam James [Wed, 8 Jan 2025 03:42:37 +0000 (03:42 +0000)]
meson: fix perl dependencies
`generate_perl_command` needs `depends: [git_version_file]` and the uses
in top-level meson.build were fine, but the ones in perl/ weren't, causing
parallel build failures in some cases as GIT-BUILD-OPTIONS wasn't yet
available.
Signed-off-by: Sam James <sam@gentoo.org> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct verb tense, add missing words, avoid double blank lines,
and rephrase things that don’t read well to me like “Turn this linkage
to relative paths”.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Tue, 7 Jan 2025 16:29:15 +0000 (10:29 -0600)]
fsck: reject misconfigured fsck.skipList
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.
Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable library uses randomness in two call paths:
- When reading a stack in case some of the referenced tables
disappears. The randomness is used to delay the next read by a
couple of milliseconds.
- When writing a new table, where the randomness gets appended to the
table name (e.g. "0x000000000001-0x000000000002-0b1d8ddf.ref").
In neither of these cases do we need strong randomness.
Unfortunately though, we have observed test failures caused by the
former case. In t0610 we have a test that spawns a 100 processes at
once, all of which try to write a new table to the stack. And given that
all of the processes will require randomness, it can happen that these
processes make the entropy pool run dry, which will then cause us to
die:
+ test_seq 100
+ printf %s commit\trefs/heads/branch-%s\n 68d032e9edd3481ac96382786ececc37ec28709e 1
+ printf %s commit\trefs/heads/branch-%s\n 68d032e9edd3481ac96382786ececc37ec28709e 2
...
+ git update-ref refs/heads/branch-98 HEAD
+ git update-ref refs/heads/branch-97 HEAD
+ git update-ref refs/heads/branch-99 HEAD
+ git update-ref refs/heads/branch-100 HEAD
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
The report was for NonStop, which uses OpenSSL as the backend for
randomness. In the preceding commit we have adapted that backend to also
return randomness in case the entropy pool is empty and the caller
passes the `CSPRNG_BYTES_INSECURE` flag. Do so to fix the issue.
Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `csprng_bytes()` function generates randomness and writes it into a
caller-provided buffer. It abstracts over a couple of implementations,
where the exact one that is used depends on the platform.
These implementations have different guarantees: while some guarantee to
never fail (arc4random(3)), others may fail. There are two significant
failures to distinguish from one another:
- Systemic failure, where e.g. opening "/dev/urandom" fails or when
OpenSSL doesn't have a provider configured.
- Entropy failure, where the entropy pool is exhausted, and thus the
function cannot guarantee strong cryptographic randomness.
While we cannot do anything about the former, the latter failure can be
acceptable in some situations where we don't care whether or not the
randomness can be predicted.
Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt
into weak cryptographic randomness. The exact behaviour of the flag
depends on the underlying implementation:
- `arc4random_buf()` never returns an error, so it doesn't change.
- `getrandom()` pulls from "/dev/urandom" by default, which never
blocks on modern systems even when the entropy pool is empty.
- `getentropy()` seems to block when there is not enough randomness
available, and there is no way of changing that behaviour.
- `GtlGenRandom()` doesn't mention anything about its specific
failure mode.
- The fallback reads from "/dev/urandom", which also returns bytes in
case the entropy pool is drained in modern Linux systems.
That only leaves OpenSSL with `RAND_bytes()`, which returns an error in
case the returned data wouldn't be cryptographically safe. This function
is replaced with a call to `RAND_pseudo_bytes()`, which can indicate
whether or not the returned data is cryptographically secure via its
return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag
is set, then we ignore the insecurity and return the data regardless.
It is somewhat questionable whether we really need the flag in the first
place, or whether we wouldn't just ignore the potentially-insecure data.
But the risk of doing that is that we might have or grow callsites that
aren't aware of the potential insecureness of the data in places where
it really matters. So using a flag to opt-in to that behaviour feels
like the more secure choice.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 7 Jan 2025 07:18:24 +0000 (02:18 -0500)]
t7407: use test_grep
There are a few grep calls here that can benefit from test_grep, which
produces more user-friendly output when it fails.
One of these calls also passes "-sq", which is curious. The "-q" option
suppresses the matched output. But test output is either already
redirected to /dev/null in non-verbose mode, and in verbose mode it's
better to see the output. The "-s" option suppresses errors opening
files, but we are just grepping in the "expected" file we just
generated, so it should not be needed. Neither of these was really
hurting anything, but they are not a style we'd like to see emulated. So
get rid of them.
(It is also curious to grep in the expected file in the first place, but
that is because we are auto-generating the expectation from a Git
command. So this is double-checking it did what we wanted).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 7 Jan 2025 07:08:31 +0000 (02:08 -0500)]
test-lib: add a few comments to LSan log checking
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 7 Jan 2025 07:07:52 +0000 (02:07 -0500)]
test-lib: simplify lsan results check
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:
- it's one fewer process to run
- we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
checked at the beginning of the function, and is the same glob used
to show the logs in check_test_results_san_file_
- this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
space in it. For example doing:
mkdir "/tmp/foo bar"
TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test
would yield a lot of:
grep: /tmp/foo: No such file or directory
grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory
when there are leaks. We could do the same thing with "xargs
--null", but that isn't portable.
We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.
We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 7 Jan 2025 07:05:01 +0000 (02:05 -0500)]
test-lib: invert return value of check_test_results_san_file_empty
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:
where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D. Ben Knoble [Mon, 6 Jan 2025 21:47:06 +0000 (21:47 +0000)]
completion: repair config completion for Zsh
Commit 1e0ee4087e (completion: add and use
__git_compute_first_level_config_vars_for_section, 2024-02-10) uses an
indirect variable syntax that is only valid for Bash, but the Zsh
completion code relies on the Bash completion code to function. Zsh
supports a different indirect variable expansion using ${(P)var}, but in
`emulate ksh` mode does not support Bash's ${!var}.
This manifests as completing strange config options like
"__git_first_level_config_vars_for_section_remote" as a choice for the
command line
git config set remote.
Using Zsh's C-x ? _complete_debug widget with the cursor at the end of
that command line captures a trace, in which we see (some details
elided):
We perform the test for __git_compute_config_vars correctly, but the
${!this_section} references are not expanded as expected.
Instead, portably expand indirect references through the new
__git_indirect. Contrary to some versions you might find online [1],
this version avoids echo non-portabilities [2] [3] and correctly quotes
the indirect expansion after eval (so that the result is not split or
globbed before being handed to printf).
The following demo program demonstrates how this works:
b=1
indirect() {
eval printf '%s' "\"\$$1\""
}
f() {
# Comment this out to see that it works for globals, too. Or, use
# a value with spaces like '2 3 4' to see how it handles those.
local b=2
local a=b
test -n "$(indirect $a)" && echo nice
}
f
When placed in a file "demo", then both
bash -x demo
and
zsh -xc 'emulate ksh -c ". ./demo"' |& tail
provide traces showing that "$(indirect $a)" produces 2 (or 1, with the
global, or "2 3 4" as a single string, etc.).
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Acked-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: retry linking file into place when occluding file vanishes
Prior to 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30), callers could expect that a successful return from
`finalize_object_file()` means that either the file was moved into
place, or the identical bytes were already present. If neither of those
happens, we'd return an error.
Since that commit, if the destination file disappears between our
link(3p) call and the collision check, we'd return success without
actually checking the contents, and without retrying the link. This
solves the common case that the files were indeed the same, but it means
that we may corrupt the repository if they weren't (this implies a hash
collision, but the whole point of this function is protecting against
hash collisions).
We can't be pessimistic and assume they're different; that hurts the
common case that the mentioned commit was trying to fix. But after
seeing that the destination file went away, we can retry linking again.
Adapt the code to do so when we see that the destination file has racily
vanished. This should generally succeed as we have just observed that
the destination file does not exist anymore, except in the very unlikely
event that it gets recreated by another concurrent process again.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: don't special-case missing source file in collision check
In 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30) we have started to ignore ENOENT when opening either the
source or destination file of the collision check. This was done to
handle races more gracefully in case either of the potentially-colliding
disappears.
The fix is overly broad though: while the destination file may indeed
vanish racily, this shouldn't ever happen for the source file, which is
a temporary object file (either loose or in packfile format) that we
have just created. So if any concurrent process would have removed that
temporary file it would indicate an actual issue.
Stop treating ENOENT specially for the source file so that we always
bubble up this error.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: rename variables in `check_collision()`
Rename variables used in `check_collision()` to clearly identify which
file is the source and which is the destination. This will make the next
step easier to reason about when we start to treat those files different
from one another.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
9e2b7005be (fetch set_head: add warn-if-not-$branch option, 2024-12-05)
tried to expand the advice message for set_head with the new option, but
unfortunately did not manage to add the right incantation. Fix the
advice message with the correct usage of warn-if-not-$branch.
Reported-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7110: replace `test -f` with `test_path_is_*` helpers
`test -f` and `! test -f` do not provide clear error messages when they fail.
To enhance debuggability, use `test_path_is_file` and `test_path_is_missing`,
which instead provide more informative error messages.
Note that `! test -f` checks if a path is not a file, while
`test_path_is_missing` verifies that a path does not exist. In this specific
case the tests are meant to check the absence of the path, making
`test_path_is_missing` a valid replacement.
Signed-off-by: Matteo Bagnolini <matteobagnolini2003@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren [Fri, 3 Jan 2025 11:33:31 +0000 (12:33 +0100)]
gitcli.txt: typeset pathnames as monospace
Commit 1bc1e94091 (doc: option value may be separate for valid reasons,
2024-11-25) added a paragraph discussing tilde-expansion of, e.g.,
~/directory/file.
The tilde character has a special meaning to asciidoc tools. In this
particular case, AsciiDoc matches up the two tildes in "e.g.
~/directory/file or ~u/d/f" and sets the text between them using
subscript. In the manpage, where subscripting is not possible, this
renders as "e.g. /directory/file oru/d/f".
These paths are literal values, which our coding guidelines want typeset
as verbatim using backticks. Do that. One effect of this is indeed that
the asciidoc tools stop interpreting tilde and other special characters.
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>