René Scharfe [Thu, 3 Oct 2024 15:51:01 +0000 (17:51 +0200)]
archive: fix misleading error message
The error message added by 296743a7ca (archive: load index before
pathspec checks, 2024-09-21) is misleading: unpack_trees() is not
touching the working tree at all here, but just loading a tree into
the index. Correct it.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 21 Sep 2024 20:23:39 +0000 (22:23 +0200)]
archive: load index before pathspec checks
git archive checks whether pathspec arguments match anything to avoid
surprises due to typos and later loads the index to get attributes.
This order was OK when these features were introduced by ba053ea96c
(archive: do not read .gitattributes in working directory, 2009-04-18)
and d5f53d6d6f (archive: complain about path specs that don't match
anything, 2009-12-12).
But when attribute matching was added to pathspec in b0db704652
(pathspec: allow querying for attributes, 2017-03-13), the pathspec
checker in git archive did not support it fully, because it lacks the
attributes from the index.
Load the index earlier, before the pathspec check, to support attr
pathspecs.
Reported-by: Ronan Pigott <ronan@rjp.ie> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 22 Jul 2024 21:17:55 +0000 (14:17 -0700)]
Doc: fix Asciidoctor css workaround
The previous step introduced docinfo.html to be used to tweak the
CSS used by the asciidoctor, that by default renders <code> inside
<pre> as a block element, breaking the SYNOPSIS section of a few
pages that adopted a new convention we use since Git 2.45.
But in this project, HTML files are all generated. We do not force
any human to write HTML by hand, which is an unusual and cruel
punishment. "*.html" is in the .gitignore file, and "make clean"
removes them. Having a tracked .html file makes "make clean" make
the tree dirty by removing the tracked docinfo.html file.
Let's do an obvious, minimum and stupid workaround to generate that
file at runtime instead. The mark-up is being rethought in a major
way for the next development cycle, and the CSS workaround we added
in the previous step may have to adjusted, possibly in a large way,
anyway.
It was reported that t1460-refs-migrate.sh fails when using Cygwin with
errors like the following:
error: could not link file '.git/ref_migration.sr9pEF/reftable' to '.git/reftable': Permission denied
As some debugging surfaced, the root cause of this is that some files of
the newly-initialized ref store are still open when the target format is
the "reftable" format, and Cygwin refuses to rename open files.
Fix this issue by closing the new ref store before renaming its files
into place. This is a slight change in behaviour compared to before,
where we kept the new ref store open and then updated the repository's
ref store to point to it.
While we could re-open the new ref store after we have moved files
around, this is ultimately unnecessary. We know that the only user of
`repo_migrate_ref_storage_format()` is the git-refs(1) command, and it
won't access the ref store after it has been migrated anyway. So
reinitializing the ref store would be a waste of time. Regardless of
that it is still sensible to leave the repository in a consistent state.
But instead of reinitializing the ref store, we can simply unset the
repo's ref store altogether and let `get_main_ref_store()` lazily
initialize the new ref store as required.
Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 76880f0510c (doc: git-clone: apply new documentation formatting
guidelines, 2024-03-29), the synopsis of `git clone`'s manual page is
rendered differently than before; Its parent commit did the same for
`git init`.
The result looks quite nice. When rendered with AsciiDoc, that is. When
rendered using AsciiDoctor and displayed in a graphical web browser such
as Firefox, Chrome, Edge, etc, the result is quite unpleasant to my eye,
reading something like this:
The reason is that AsciiDoctor's default style sheet contains this (see
https://github.com/asciidoctor/asciidoctor/blob/854923b15533/src/stylesheets/asciidoctor.css#L519-L521
for context):
pre > code {
display: block;
}
It is this `display: block` that forces the parts that are enclosed in
`<code>` tags (such as the `git clone` or the `--template=` part) to be
rendered on their own line.
Side note: This seems not to affect console web browsers like `lynx` or
`w3m`, most likely because most style sheet directions cannot be
respected in text terminals and therefore they seem to punt on style
sheets altogether.
To fix this, let's apply the method recommended by AsciiDoctor in
https://docs.asciidoctor.org/asciidoctor/latest/html-backend/default-stylesheet/#customize-docinfo
to partially override AsciiDoctor's default style sheet so that the
`<code>` sections of the synopsis are no longer each rendered on their
own, individual lines.
This fixes https://github.com/git-for-windows/git/issues/5063.
Even on the Git home page, where AsciiDoctor's default stylesheet is
_not_ used, this change resulted in some unpleasant rendering where not
only the font is changed for the `<code>` sections of the synopsis, but
padding and a different background color make the visual impression
quite uneven. This has been addressed in the meantime, via
https://github.com/git/git-scm.com/commit/a492d0565512.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: git-clone fix discrepancy between asciidoc and asciidoctor
Asciidoc.py does not have the concept of generalized roles, whereas
asciidoctor interprets [foo]`blah` as blah with role foo in the
synopsis, making in effect foo disappear in the output. Note that
square brackets not directly followed by an inline markup do not
define a role, which is why we do not have the issue on other parts of
the documentation.
In order to get a consistant result across asciidoctor and
asciidoc.py, the hack is to use the {empty} entity
to split the bracket part from the inline format part.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts b7d6f23a171 (midx-write.c: use `--stdin-packs` when
repacking, 2024-04-01) and then marks the test created in the previous
change as passing.
The fundamental issue with the reverted change is that the focus on
pack-files separates the object selection from how the multi-pack-index
selects a single pack-file for an object ID with multiple copies among
the tracked pack-files.
The change was made with the intention of improving delta compression in
the resulting pack-file, but that can be resolved with the existing
object list mechanism. There are other potential pitfalls of doing an
object walk at this time if the repository is a blobless partial clone,
and that will require additional testing on top of the one that changes
here.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.45.0 included the change b7d6f23a171 (midx-write.c: use
`--stdin-packs` when repacking, 2024-04-01) which caused the 'git
multi-pack-index repack' command to use 'git pack-objects --stdin-packs'
instead of listing the objects to repack. While this change was
motivated by efficient cross-process communication and the ability to
improve delta compression, it breaks a fundamental function of the
'incremental-repack' task that is enabled by default in Scalar clones or
Git repositories that run 'git maintenance start'.
The 'incremental-repack' task performs a two-step process of the
'expire' and 'repack' subcommands of the 'git multi-pack-index' builtin.
The 'expire' command removes any pack-files listed in the
multi-pack-index but without any referenced objects. The 'repack' task
then finds a batch of pack-files to repack and sends their objects to
'git pack-objects'. Both the pack-files chosen for the batch and the
objects chosen to repack are based on the ones that the multi-pack-index
references. Objects that appear in a pack-file but have a duplicate copy
in a newer pack-file are not considered in this case. Since the
multi-pack-index references only the newest copy of an object, this
allows the next 'incremental-repack' task to remove the pack-files in
the next 'expire' task. This delay is intentional due to how Windows
handles may block deletion of files with open read handles.
However, the mentioned commit changed this behavior to divorce the set
of objects referenced by the multi-pack-index and instead use a set of
"included" and "excluded" pack-files in the 'git pack-objects' builtin.
When a pack-file is selected as "included", only the objects it contains
but are not in any "excluded" pack-files are considered for repacking.
This has led to client repositories failing to remove old pack-files as
they still have some referenced objects. This grows over time until the
point that Git is trying to repack the same pack-files over and over.
For now, create a test case that demonstrates the expected behavior, but
also fails in its final line. The setup here it attempting to recreate a
typical situation for a repository that uses a blobless partial clone.
There would be a large initial pack-file from the clone that is never
selected in the 'repack' batch. There are other pack-files that have a
combination of new objects from incremental fetches and possibly blobs
that are not connected to those incremental fetches; these blobs could
be filled in from commands like 'git checkout' or 'git blame'. The
pack-files also have some overlap on purpose so test-1 has some
duplicates in test-2 and test-2 has some duplicates in test-3.
At the end of the test, the test-2 pack-file still exists though it
should have been expired. This test will pass when reverting the
offending commit.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 17 Jul 2024 17:47:27 +0000 (10:47 -0700)]
Merge branch 'js/var-git-shell-path'
"git var GIT_SHELL_PATH" should report the path to the shell used
to spawn external commands, but it didn't do so on Windows, which
has been corrected.
* js/var-git-shell-path:
var(win32): do report the GIT_SHELL_PATH that is actually used
run-command: declare the `git_shell_path()` function globally
run-command(win32): resolve the path to the Unix shell early
mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
win32: override `fspathcmp()` with a directory separator-aware version
strvec: declare the `strvec_push_nodup()` function globally
run-command: refactor getting the Unix shell path into its own function
Junio C Hamano [Wed, 17 Jul 2024 17:47:25 +0000 (10:47 -0700)]
Merge branch 'jc/http-cookiefile'
The http.cookieFile and http.saveCookies configuration variables
have a few values that need to be avoided, which are now ignored
with warning messages.
Junio C Hamano [Wed, 17 Jul 2024 17:47:24 +0000 (10:47 -0700)]
Merge branch 'jk/test-body-in-here-doc'
The test framework learned to take the test body not as a single
string but as a here-document.
* jk/test-body-in-here-doc:
t/.gitattributes: ignore whitespace in chainlint expect files
t: convert some here-doc test bodies
test-lib: allow test snippets as here-docs
chainlint.pl: add tests for test body in heredoc
chainlint.pl: recognize test bodies defined via heredoc
chainlint.pl: check line numbers in expected output
chainlint.pl: force CRLF conversion when opening input files
chainlint.pl: do not spawn more threads than we have scripts
chainlint.pl: only start threads if jobs > 1
chainlint.pl: add test_expect_success call to test snippets
Taylor Blau [Wed, 17 Jul 2024 13:17:31 +0000 (09:17 -0400)]
Documentation: fix default value for core.maxTreeDepth
When `core.maxTreeDepth` was originally introduced via be20128bfa (add
core.maxTreeDepth config, 2023-08-31), its default value was 4096.
There have since been a couple of updates to its default value that were
not reflected in the documentation for `core.maxTreeDepth`:
- 4d5693ba05 (lower core.maxTreeDepth default to 2048, 2023-08-31)
- b64d78ad02 (max_tree_depth: lower it for MSVC to avoid stack
overflows, 2023-11-01)
Commit 4d5693ba05 lowers the default to 2048 for platforms with smaller
stack sizes, and commit b64d78ad02 lowers the default even further when
Git is compiled with MSVC.
Neither of these changes were reflected in the documentation, which I
noticed while merging newer releases back into GitHub's private fork
(which contained the original implementation of `core.maxTreeDepth`).
Update the documentation to reflect what the platform-specific default
values are.
Noticed-by: Keith W. Campbell <keithc@ca.ibm.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren [Wed, 17 Jul 2024 10:54:28 +0000 (12:54 +0200)]
Documentation/gitpacking: make sample configs listing blocks
This document contains a few sample config snippets. At least with
Asciidoctor, the section headers are rendered *more* indented than the
variables that follow:
[bitmapPseudoMerge "all"]
pattern = "refs/"
...
To address this, wrap these listings in AsciiDoc listing blocks. Remove
the indentation from the section headings. This is similar to how we
handle such sample config elsewhere, e.g., in config.txt.
While we're here, fix the nearby "wiht" typo.
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 17 Jul 2024 07:00:50 +0000 (03:00 -0400)]
t4153: stop redirecting input from /dev/zero
Commit 852a171018 (am: let command-line options override saved options,
2015-08-04) redirected a few "git am" invocations from /dev/zero, even
though it did not expect "am" to read the input. This was necessary at
the time because those tests used test_terminal, and as described in 18d8c26930 (test_terminal: redirect child process' stdin to a pty,
2015-08-04):
Note that due to the way the code is structured, the child's stdin
pseudo-tty will be closed when we finish reading from our stdin. This
means that in the common case, where our stdin is attached to /dev/null,
the child's stdin pseudo-tty will be closed immediately. Some operations
like isatty(), which git-am uses, require the file descriptor to be
open, and hence if the success of the command depends on such functions,
test_terminal's stdin should be redirected to a source with large amount
of data to ensure that the child's stdin is not closed, e.g.
test_terminal git am --3way </dev/zero
But we later dropped the use of test_terminal in 53ce2e3f0a (am: add
explicit "--retry" option, 2024-06-06). That commit dropped one of the
redirections from /dev/zero but not the other.
In theory the remaining one should not cause any problems, but it turns
out that at least one platform (NonStop) does not have /dev/zero at all.
We never noticed before because it also did not pass the TTY prereq,
meaning these tests were not run at all there until 53ce2e3f0a.
So let's drop the useless /dev/zero mention. There are others in the
test suite, but they are run only for tests marked with EXPENSIVE (so
not typically by default).
Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 16 Jul 2024 18:18:58 +0000 (11:18 -0700)]
Merge branch 'bc/gitfaq-more'
A handful of entries are added to the GitFAQ document.
* bc/gitfaq-more:
doc: mention that proxies must be completely transparent
gitfaq: add entry about syncing working trees
gitfaq: give advice on using eol attribute in gitattributes
gitfaq: add documentation on proxies
Address-looking strings found on the trailer are now placed on the Cc: list after running through sanitize_address by "git send-email".
* cb/send-email-sanitize-trailer-addresses:
git-send-email: use sanitized address when reading mbox body
Junio C Hamano [Tue, 16 Jul 2024 18:18:55 +0000 (11:18 -0700)]
Merge branch 'en/ort-inner-merge-error-fix'
The "ort" merge backend saw one bugfix for a crash that happens
when inner merge gets killed, and assorted code clean-ups.
* en/ort-inner-merge-error-fix:
merge-ort: fix missing early return
merge-ort: convert more error() cases to path_msg()
merge-ort: upon merge abort, only show messages causing the abort
merge-ort: loosen commented requirements
merge-ort: clearer propagation of failure-to-function from merge_submodule
merge-ort: fix type of local 'clean' var in handle_content_merge ()
merge-ort: maintain expected invariant for priv member
merge-ort: extract handling of priv member into reusable function
Christian Hesse [Tue, 16 Jul 2024 09:55:44 +0000 (11:55 +0200)]
refs: correct the version numbers in a comment
The paragraph talks about a change made in c8f815c2 (refs: remove
functions without ref store, 2024-05-07), which is v2.46.0-rc0~119^2
and will be published as part of v2.46, not v2.45.
Signed-off-by: Christian Hesse <mail@eworm.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 15 Jul 2024 17:11:43 +0000 (10:11 -0700)]
Merge branch 'cp/unit-test-reftable-record'
A test in reftable library has been rewritten using the unit test
framework.
* cp/unit-test-reftable-record:
t-reftable-record: add tests for reftable_log_record_compare_key()
t-reftable-record: add tests for reftable_ref_record_compare_name()
t-reftable-record: add index tests for reftable_record_is_deletion()
t-reftable-record: add obj tests for reftable_record_is_deletion()
t-reftable-record: add log tests for reftable_record_is_deletion()
t-reftable-record: add ref tests for reftable_record_is_deletion()
t-reftable-record: add comparison tests for obj records
t-reftable-record: add comparison tests for index records
t-reftable-record: add comparison tests for ref records
t-reftable-record: add reftable_record_cmp() tests for log records
t: move reftable/record_test.c to the unit testing framework
Junio C Hamano [Mon, 15 Jul 2024 17:11:41 +0000 (10:11 -0700)]
Merge branch 'jk/tests-without-dns'
Test suite has been taught not to unnecessarily rely on DNS failing
a bogus external name.
* jk/tests-without-dns:
t/lib-bundle-uri: use local fake bundle URLs
t5551: do not confirm that bogus url cannot be used
t5553: use local url for invalid fetch
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any chang, ("git describe --dirty" correctly did
so), which has been corrected.
* as/describe-broken-refresh-index-fix:
describe: refresh the index when 'broken' flag is used
var(win32): do report the GIT_SHELL_PATH that is actually used
On Windows, Unix-like paths like `/bin/sh` make very little sense. In
the best case, they simply don't work, in the worst case they are
misinterpreted as absolute paths that are relative to the drive
associated with the current directory.
To that end, Git does not actually use the path `/bin/sh` that is
recorded e.g. when `run_command()` is called with a Unix shell
command-line. Instead, as of 776297548e (Do not use SHELL_PATH from
build system in prepare_shell_cmd on Windows, 2012-04-17), it
re-interprets `/bin/sh` as "look up `sh` on the `PATH` and use the
result instead".
This is the logic users expect to be followed when running `git var
GIT_SHELL_PATH`.
However, when 1e65721227 (var: add support for listing the shell,
2023-06-27) introduced support for `git var GIT_SHELL_PATH`, Windows was
not special-cased as above, which is why it outputs `/bin/sh` even
though that disagrees with what Git actually uses.
Let's fix this by using the exact same logic as `prepare_shell_cmd()`,
adjusting the Windows-specific `git var GIT_SHELL_PATH` test case to
verify that it actually finds a working executable.
Reported-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command(win32): resolve the path to the Unix shell early
In 776297548e (Do not use SHELL_PATH from build system in
prepare_shell_cmd on Windows, 2012-04-17), the hard-coded path to the
Unix shell was replaced by passing `sh` instead when executing Unix
shell scripts in Git.
This was done because the hard-coded path to the Unix shell is incorrect
on Windows because it not only is a Unix-style absolute path instead of
a Windows one, but Git uses the runtime prefix feature on Windows, i.e.
the correct path cannot be hard-coded.
Naturally, the `sh` argument will be resolved to the full path of said
executable eventually.
To help fixing the bug where `git var GIT_SHELL_PATH` currently does not
reflect that logic, but shows that incorrect hard-coded Unix-style
absolute path, let's resolve the full path to the `sh` executable early
in the `git_shell_path()` function so that we can use it in `git var`,
too, and be sure that the output is equivalent to what `run_command()`
does when it is asked to execute a command-line using a Unix shell.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
Whether the full path to the MSYS2 Bash is specified using backslashes
or forward slashes, in either case the command-line arguments need to be
quoted in the MSYS2-specific manner instead of using regular Win32
command-line quoting rules.
In preparation for `prepare_shell_cmd()` to use the full path to
`sh.exe` (with forward slashes for consistency), let's teach the
`is_msys2_sh()` function about this; Otherwise 5580.4 'clone with
backslashed path' would fail once `prepare_shell_cmd()` uses the full
path instead of merely `sh`.
This patch relies on the just-introduced fix where `fspathcmp()` handles
backslashes and forward slashes as equivalent on Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
win32: override `fspathcmp()` with a directory separator-aware version
On Windows, the backslash is the directory separator, even if the
forward slash can be used, too, at least since Windows NT.
This means that the paths `a/b` and `a\b` are equivalent, and
`fspathcmp()` needs to be made aware of that fact.
Note that we have to override both `fspathcmp()` and `fspathncmp()`, and
the former cannot be a mere pre-processor constant that transforms calls
to `fspathcmp(a, b)` into `fspathncmp(a, b, (size_t)-1)` because the
function `report_collided_checkout()` in `unpack-trees.c` wants to
assign `list.cmp = fspathcmp`.
Also note that `fspatheq()` does _not_ need to be overridden because it
calls `fspathcmp()` internally.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strvec: declare the `strvec_push_nodup()` function globally
This function differs from `strvec_push()` in that it takes ownership of
the allocated string that is passed as second argument.
This is useful when appending elements to the string array that have
been freshly allocated and serve no further other purpose after that.
Without declaring this function globally, call sites would allocate the
memory, only to have `strvec_push()` duplicate the string, and then the
first copy would need to be released. Having this function globally
avoids that kind of unnecessary work.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the `oidtree` test helper was turned into a unit test, a new
`lib-oid` source file was added as dependency. This was only done in the
Makefile so far, but also needs to be done in the CMake definition.
This is a companion of ed548408723d (t/: migrate helper/test-oidtree.c
to unit-tests/t-oidtree.c, 2024-06-08).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/push: call set_refspecs after validating remote
When an end-user runs "git push" with an empty string for the remote
repository name, e.g.
$ git push '' main
"git push" fails with a BUG(). Even though this is a nonsense request
that we want to fail, we shouldn't hit a BUG(). Instead we want to give
a sensible error message, e.g., 'bad repository'".
This is because since 9badf97c42 (remote: allow resetting url list,
2024-06-14), we reset the remote URL if the provided URL is empty. When
a user of 'remotes_remote_get' tries to fetch a remote with an empty
repo name, the function initializes the remote via 'make_remote'. But
the remote is still not a valid remote, since the URL is empty, so it
tries to add the URL alias using 'add_url_alias'. This in-turn will call
'add_url', but since the URL is empty we call 'strvec_clear' on the
`remote->url`. Back in 'remotes_remote_get', we again check if the
remote is valid, which fails, so we return 'NULL' for the 'struct
remote *' value.
The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked with empty repo names earlier since we would get a
remote, albeit with an empty URL. With the new changes, we get a 'NULL'
remote value, this causes the check for remote to fail and raises the
BUG in 'set_refspecs'.
Do a simple fix by doing remote validation first. Also add a test to
validate the bug fix. With this, we can also now directly pass remote to
'set_refspecs' instead of it trying to lazily obtain it.
Helped-by: Jeff King <peff@peff.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Piotr Szlazak [Thu, 11 Jul 2024 08:36:48 +0000 (08:36 +0000)]
doc: update http.cookieFile with in-memory cookie processing
Documentation only mentions how to read cookies from the given file
and how to save them to the file using http.saveCookies.
But underlying libcURL allows the HTTP cookies used only in memory;
cookies from the server will be accepted and sent back in successive
requests within same connection, by using an empty string as the
filename. Document this.
Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
As we currently describe in t/README, it can happen that:
Some tests run "git" (or "test-tool" etc.) without properly checking
the exit code, or git will invoke itself and fail to ferry the
abort() exit code to the original caller.
Therefore, GIT_TEST_SANITIZE_LEAK_LOG=true is needed to be set to
capture all memory leaks triggered by our tests.
It seems unnecessary to force users to remember this option, as
forgetting it could lead to missed memory leaks.
We could solve the problem by making it "true" by default, but that
might suggest we think "false" makes sense, which isn't the case.
Therefore, the best approach is to remove the option entirely while
maintaining the capability to detect memory leaks in blind spots of our
tests.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:47:04 +0000 (04:47 -0400)]
t/.gitattributes: ignore whitespace in chainlint expect files
The ".expect" files in t/chainlint/ are snippets of expected output from
the chainlint script, and do not necessarily conform to our usual code
style. Especially with the recent change to retain line numbers, blank
lines in the input script end up with trailing whitespace as we print
"3 " for line 3, for example. The point of these files is to match the
output verbatim, so let's not complain about the trailing spaces.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:39:31 +0000 (04:39 -0400)]
t: convert some here-doc test bodies
The t1404 script checks a lot of output from Git which contains single
quotes. Because the test snippets are themselves wrapped in the same
single-quotes, we have to resort to using $SQ to match them. This is
error-prone and makes the tests harder to read.
Instead, let's use the new here-doc feature added in the previous
commit, which lets us write anything in the test body we want (except
the here-doc end marker on a line by itself, of course).
Note that we do use "\" in our marker to avoid interpolation (which is
the whole point). But we don't use "<<-", as we want to preserve
whitespace in the snippet (and running with "-v" before and after shows
that we produce the exact same output, except with the ugly $SQ
references fixed).
I just converted every test here, even though only some of them use
$SQ. But it would be equally correct to mix-and-match styles if we don't
mind the inconsistency.
I've also converted a few tests in t0600 which were moved from t1404 (I
had written this patch before they were moved, but it seemed worth
porting over the changes rather than losing them).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This sometimes makes the snippets awkward to write, because you can't
easily use single quotes within them. We sometimes work around this with
$SQ, or by loosening regexes to use "." instead of a literal quote, or
by using double quotes when we'd prefer to use single-quotes (and just
adding extra backslash-escapes to avoid interpolation).
This commit adds another option: feeding the snippet via the function's
stdin. This doesn't conflict with anything the snippet would want to do,
because we always redirect its stdin from /dev/null anyway (which we'll
continue to do).
A few notes on the implementation:
- it would be nice to push this down into test_run_, but we can't, as
test_expect_success and test_expect_failure want to see the actual
script content to report it for verbose-mode. A helper function
limits the amount of duplication in those callers here.
- The helper function is a little awkward to call, as you feed it the
name of the variable you want to set. The more natural thing in
shell would be command substitution like:
body=$(body_or_stdin "$2")
but that loses trailing whitespace. There are tricks around this,
like:
but we'd prefer to keep such tricks in the helper, not in each
caller.
- I implemented the helper using a sequence of "read" calls. Together
with "-r" and unsetting the IFS, this preserves incoming whitespace.
An alternative is to use "cat" (which then requires the gross "."
trick above). But this saves us a process, which is probably a good
thing. The "read" builtin does use more read() syscalls than
necessary (one per byte), but that is almost certainly a win over a
separate process.
Both are probably slower than passing a single-quoted string, but
the difference is lost in the noise for a script that I converted as
an experiment.
- I handle test_expect_success and test_expect_failure here. If we
like this style, we could easily extend it to other spots (e.g.,
lazy_prereq bodies) on top of this patch.
- even though we are using "local", we have to be careful about our
variable names. Within test_expect_success, any variable we declare
with local will be seen as local by the test snippets themselves (so
it wouldn't persist between tests like normal variables would).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
syntax, where TEST_BODY should be checked in the usual way. Let's make
sure this works by adding a few tests. The "here-doc-body" file tests
the basic syntax, including an embedded here-doc which we should still
be able to recognize.
Likewise the "here-doc-body-indent" checks the same thing, but using the
"<<-" operator. We wouldn't expect this to be used normally, but we
would not want to accidentally miss a body that uses it. The
"pathological" variant checks the opposite: we don't get confused by an
indented tag within the here-doc body.
The "here-doc-double" tests the handling of two here-doc tags on the
same line. This is not something we'd expect anybody to do in practice,
but the code was written defensively to handle this, so let's make sure
it works.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Wed, 10 Jul 2024 08:38:31 +0000 (04:38 -0400)]
chainlint.pl: recognize test bodies defined via heredoc
In order to check tests for semantic problems, chainlint.pl scans test
scripts, looking for tests defined as:
test_expect_success [prereq] title '
body
'
where `body` is a single string which is then treated as a standalone
chunk of code and "linted" to detect semantic issues. (The same happens
for `test_expect_failure` definitions.)
The introduction of test definitions in which the test body is instead
presented via a heredoc rather than as a single string creates a blind
spot in the linting process since such invocations are not recognized by
chainlint.pl.
Prepare for this new style by also recognizing tests defined as:
test_expect_success [prereq] title - <<\EOT
body
EOT
A minor complication is that chainlint.pl has never considered heredoc
bodies significant since it doesn't scan them for semantic problems,
thus it has always simply thrown them away. However, with the new
`test_expect_success` calling sequence, heredoc bodies become
meaningful, thus need to be captured.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:37:55 +0000 (04:37 -0400)]
chainlint.pl: check line numbers in expected output
While working on chainlint.pl recently, we introduced some bugs that
showed incorrect line numbers in the output. But it was hard to notice,
since we sanitize the output by removing all of the line numbers! It
would be nice to retain these so we can catch any regressions.
The main reason we sanitize is for maintainability: we concatenate all
of the test snippets into a single file, so it's hard for each ".expect"
file to know at which offset its test input will be found. We can handle
that by storing the per-test line numbers in the ".expect" files, and
then dynamically offsetting them as we build the concatenated test and
expect files together.
The changes to the ".expect" files look like tedious boilerplate, but it
actually makes adding new tests easier. You can now just run:
to save the output of the script minus the comment headers (after
checking that it is correct, of course). Whereas before you had to strip
the line numbers. The conversions here were done mechanically using
something like the script above, and then spot-checked manually.
It would be possible to do all of this in shell via the Makefile, but it
gets a bit complicated (and requires a lot of extra processes). Instead,
I've written a short perl script that generates the concatenated files
(we already depend on perl, since chainlint.pl uses it). Incidentally,
this improves a few other things:
- we incorrectly used $(CHAINLINTTMP_SQ) inside a double-quoted
string. So if your test directory required quoting, like:
make "TEST_OUTPUT_DIRECTORY=/tmp/h'orrible"
we'd fail the chainlint tests.
- the shell in the Makefile didn't handle &&-chaining correctly in its
loops (though in practice the "sed" and "cat" invocations are not
likely to fail).
- likewise, the sed invocation to strip numbers was hiding the exit
code of chainlint.pl itself. In practice this isn't a big deal;
since there are linter violations in the test files, we expect it to
exit non-zero. But we could later use exit codes to distinguish
serious errors from expected ones.
- we now use a constant number of processes, instead of scaling with
the number of test scripts. So it should be a little faster (on my
machine, "make check-chainlint" goes from 133ms to 73ms).
There are some alternatives to this approach, but I think this is still
a good intermediate step:
1. We could invoke chainlint.pl individually on each test file, and
compare it to the expected output (and possibly using "make" to
avoid repeating already-done checks). This is a much bigger change
(and we'd have to figure out what to do with the "# LINT" lines in
the inputs). But in this case we'd still want the "expect" files to
be annotated with line numbers. So most of what's in this patch
would be needed anyway.
2. Likewise, we could run a single chainlint.pl and feed it all of the
scripts (with "--jobs=1" to get deterministic output). But we'd
still need to annotate the scripts as we did here, and we'd still
need to either assemble the "expect" file, or break apart the
script output to compare to each individual ".expect" file.
So we may pursue those in the long run, but this patch gives us more
robust tests without too much extra work or moving in a useless
direction.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:37:30 +0000 (04:37 -0400)]
chainlint.pl: force CRLF conversion when opening input files
The lexer in chainlint.pl can't handle CRLF line endings; it complains
about an internal error in scan_token() if we see one. For example, in
our Windows CI environment:
This doesn't break "make check-chainlint" (yet), because we assemble a
concatenated input by passing the contents of each file through "sed".
And the "sed" we use will strip out the CRLFs. But the next patch is
going to rework this a bit, which does break check-chainlint on Windows.
Plus it's probably nicer to folks on Windows who might work on chainlint
itself and write new tests.
In theory we could fix the parser to handle this, but it's not really
worth the trouble. We should be able to ask the input layer to translate
the line endings for us. In fact, I'd expect this to happen by default,
as perl's documentation claims Win32 uses the ":unix:crlf" PERLIO layer
by default ("unix" here just refers to using read/write syscalls, and
then "crlf" layers the translation on top). However, this doesn't seem
to be the case in our Windows CI environment. I didn't dig into the
exact reason, but it is perhaps because we are using an msys build of
perl rather than a "true" Win32 build.
At any rate, it is easy-ish to just ask explicitly for the conversion.
In the above example, setting PERLIO=crlf in the environment is enough
to make it work. Curiously, though, this doesn't work when invoking
chainlint via "make". Again, I didn't dig into it, but it may have to do
with msys programs calling Windows programs or vice versa.
We can make it work consistently by just explicitly asking for CRLF
translation when we open the files. This will even work on non-Windows
platforms, though we wouldn't really expect to find CRLF files there.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:35:57 +0000 (04:35 -0400)]
chainlint.pl: do not spawn more threads than we have scripts
The chainlint.pl script spawns worker threads to check many scripts in
parallel. This is good if you feed it a lot of scripts. But if you give
it few (or one), then the overhead of spawning the threads dominates. We
can easily notice that we have fewer scripts than threads and scale back
as appropriate.
This patch reduces the time to run:
time for i in chainlint/*.test; do
perl chainlint.pl $i
done >/dev/null
on my system from ~4.1s to ~1.1s, where I have 8+8 cores.
As with the previous patch, this isn't the usual way we run chainlint
(we feed many scripts at once, which is why it supports threading in the
first place). So this won't make a big difference in the real world, but
it may help us out in the future, and it makes experimenting with and
debugging the chainlint tests a bit more pleasant.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:35:13 +0000 (04:35 -0400)]
chainlint.pl: only start threads if jobs > 1
If the system supports threads, chainlint.pl will always spawn worker
threads to do the real work. But when --jobs=1, this is pointless, since
we could just do the work in the main thread. And spawning even a single
thread has a high overhead. For example, on my Linux system, running:
for i in chainlint/*.test; do
perl chainlint.pl --jobs=1 $i
done >/dev/null
takes ~1.7s without this patch, and ~1.1s after. We don't usually spawn
a bunch of individual chainlint.pl processes (instead we feed several
scripts at once, and the parallelism outweighs the setup cost). But it's
something we've considered doing, and since we already have fallback
code for systems without thread support, it's pretty easy to make this
work.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 10 Jul 2024 08:34:48 +0000 (04:34 -0400)]
chainlint.pl: add test_expect_success call to test snippets
The chainlint tests are a series of individual files, each holding a
test body. The "make check-chainlint" target assembles them into a
single file, adding a "test_expect_success" function call around each.
Let's instead include that function call in the files themselves. This
is a little more boilerplate, but has several advantages:
1. You can now run chainlint manually on snippets with just "perl
chainlint.perl chainlint/foo.test". This can make developing and
debugging a little easier.
2. Many of the tests implicitly relied on the syntax of the lines
added by the Makefile (in particular the use of single-quotes).
This assumption is much easier to see when the single-quotes are
alongside the test body.
3. We had no way to test how the chainlint program handled
various test_expect_success lines themselves. Now we'll be able to
check variations.
The change to the .test files was done mechanically, using the same
test names they would have been assigned by the Makefile (this is
important to match the expected output). The Makefile has the minimal
change to drop the extra lines; there are more cleanups possible but a
future patch in this series will rewrite this substantially anyway.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 9 Jul 2024 23:03:48 +0000 (16:03 -0700)]
http.c: cookie file tightening
The http.cookiefile configuration variable is used to call
curl_easy_setopt() to set CURLOPT_COOKIEFILE and if http.savecookies
is set, the same value is used for CURLOPT_COOKIEJAR. The former is
used only to read cookies at startup, the latter is used to write
cookies at the end.
The manual pages https://curl.se/libcurl/c/CURLOPT_COOKIEFILE.html
and https://curl.se/libcurl/c/CURLOPT_COOKIEJAR.html talk about two
interesting special values.
* "" (an empty string) given to CURLOPT_COOKIEFILE means not to
read cookies from any file upon startup.
* It is not specified what "" (an empty string) given to
CURLOPT_COOKIEJAR does; presumably open a file whose name is an
empty string and write cookies to it? In any case, that is not
what we want to see happen, ever.
* "-" (a dash) given to CURLOPT_COOKIEFILE makes cURL read cookies
from the standard input, and given to CURLOPT_COOKIEJAR makes
cURL write cookies to the standard output. Neither of which we
want ever to happen.
So, let's make sure we avoid these nonsense cases. Specifically,
when http.cookies is set to "-", ignore it with a warning, and when
it is set to "" and http.savecookies is set, ignore http.savecookies
with a warning.
brian m. carlson [Wed, 10 Jul 2024 00:01:55 +0000 (00:01 +0000)]
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: mention that proxies must be completely transparent
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry. We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes. Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.
The technique that many users are using is their preferred cloud syncing
service, which is a bad idea. Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly. They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.
We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools. Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky. Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that. While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitfaq: give advice on using eol attribute in gitattributes
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute. As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.
Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.
In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between. The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>