Junio C Hamano [Sun, 22 Jan 2023 01:22:00 +0000 (17:22 -0800)]
Merge branch 'tl/ls-tree-code-clean-up'
Code clean-up.
* tl/ls-tree-code-clean-up:
t3104: remove shift code in 'test_ls_tree_format'
ls-tree: cleanup the redundant SPACE
ls-tree: make "line_termination" less generic
ls-tree: fold "show_tree_data" into "cb" struct
ls-tree: use a "struct options"
ls-tree: don't use "show_tree_data" for "fast" callbacks
Junio C Hamano [Sun, 22 Jan 2023 01:22:00 +0000 (17:22 -0800)]
Merge branch 'pb/doc-orig-head'
Document ORIG_HEAD a bit more.
* pb/doc-orig-head:
git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
revisions.txt: be explicit about commands writing 'ORIG_HEAD'
git-merge.txt: mention 'ORIG_HEAD' in the Description
git-reset.txt: mention 'ORIG_HEAD' in the Description
git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
Junio C Hamano [Sun, 22 Jan 2023 01:21:59 +0000 (17:21 -0800)]
Merge branch 'ar/test-cleanup'
Test clean-up.
* ar/test-cleanup:
t7527: use test_when_finished in 'case insensitive+preserving'
t6422: drop commented out code
t6003: uncomment test '--max-age=c3, --topo-order'
Junio C Hamano [Sun, 22 Jan 2023 01:21:58 +0000 (17:21 -0800)]
Merge branch 'rs/dup-array'
Code cleaning.
* rs/dup-array:
use DUP_ARRAY
add DUP_ARRAY
do full type check in BARF_UNLESS_COPYABLE
factor out BARF_UNLESS_COPYABLE
mingw: make argv2 in try_shell_exec() non-const
Junio C Hamano [Sun, 22 Jan 2023 01:21:58 +0000 (17:21 -0800)]
Merge branch 'jx/t1301-updates'
Test updates.
* jx/t1301-updates:
t1301: do not change $CWD in "shared=all" test case
t1301: use test_when_finished for cleanup
t1301: fix wrong template dir for git-init
Junio C Hamano [Fri, 20 Jan 2023 23:36:21 +0000 (15:36 -0800)]
Merge branch 'jk/curl-avoid-deprecated-api'
Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.
Windows and 32-bit Linux are among the affected platforms.
In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)
The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.
Let's address this by passing a pointer to a variable of the expected
data type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 17 Jan 2023 03:04:48 +0000 (22:04 -0500)]
http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 17 Jan 2023 03:04:44 +0000 (22:04 -0500)]
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 17 Jan 2023 03:04:38 +0000 (22:04 -0500)]
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.
Windows and 32-bit Linux are among the affected platforms.
In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)
The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.
Let's address this by passing a pointer to a variable of the expected
data type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 16 Jan 2023 20:07:47 +0000 (12:07 -0800)]
Merge branch 'ds/omit-trailing-hash-in-index'
Introduce an optional configuration to allow the trailing hash that
protects the index file from bit flipping.
* ds/omit-trailing-hash-in-index:
features: feature.manyFiles implies fast index writes
test-lib-functions: add helper for trailing hash
read-cache: add index.skipHash config option
hashfile: allow skipping the hash function
Junio C Hamano [Mon, 16 Jan 2023 20:07:47 +0000 (12:07 -0800)]
Merge branch 'ws/single-file-cone'
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.
* ws/single-file-cone:
dir: check for single file cone patterns
Junio C Hamano [Mon, 16 Jan 2023 20:07:46 +0000 (12:07 -0800)]
Merge branch 'jk/ext-diff-with-relative'
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.
* jk/ext-diff-with-relative:
diff: drop "name" parameter from prepare_temp_file()
diff: clean up external-diff argv setup
diff: use filespec path to set up tempfiles for ext-diff
Junio C Hamano [Mon, 16 Jan 2023 20:07:45 +0000 (12:07 -0800)]
Merge branch 'es/t1509-root-fixes'
Test fixes.
* es/t1509-root-fixes:
t1509: facilitate repeated script invocations
t1509: make "setup" test more robust
t1509: fix failing "root work tree" test due to owner-check
Teng Long [Thu, 12 Jan 2023 09:11:35 +0000 (17:11 +0800)]
t3104: remove shift code in 'test_ls_tree_format'
In t3104-ls-tree-format.sh, There is a legacy 'shift 2' code
and the relevant code block no longer depends on it anymore,
so let's remove it for a small cleanup.
Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "ls-tree" command isn't capable of ending "lines" with anything
except '\n' or '\0', and in the latter case we can avoid calling
write_name_quoted_relative() entirely. Let's do that, less for
optimization and more for clarity, the write_name_quoted_relative()
API itself does much the same thing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the the preceding two commits the only user of the
"show_tree_data" struct needed it along with the "options" member,
let's instead fold all of that into a "show_tree_data" struct that
we'll use only for that callback.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a first step towards being able to turn this code into an API some
day let's change the "static" options in builtin/ls-tree.c into a
"struct ls_tree_options" that can be constructed dynamically without
the help of parse_options().
Because we're now using non-static variables for this we'll need to
clear_pathspec() at the end of cmd_ls_tree(), least various tests
start failing under SANITIZE=leak. The memory leak was already there
before, now it's just being brought to the surface.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-tree: don't use "show_tree_data" for "fast" callbacks
As noted in [1] the code that made it in as part of 9c4d58ff2c3 (ls-tree: split up "fast path" callbacks, 2022-03-23) was
a "maybe a good idea, maybe not" RFC-quality patch. I hadn't looked
very carefully at the resulting patterns.
The implementation shared the "struct show_tree_data data", which was
introduced in e81517155e0 (ls-tree: introduce struct "show_tree_data",
2022-03-23) both for use in 455923e0a15 (ls-tree: introduce "--format"
option, 2022-03-23), and because the "fat" callback hadn't been split
up as 9c4d58ff2c3 did.
Now that that's been done we can see that most of what
show_tree_common() was doing could be done lazily by the callbacks
themselves, who in the pre-image were often using an odd mis-match of
their own arguments and those same arguments stuck into the "data"
structure. Let's also have the callers initialize the "type", rather
than grabbing it from the "data" structure afterwards.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Teng Long <dyronteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
John Cai [Fri, 13 Jan 2023 16:15:24 +0000 (16:15 +0000)]
docs: link generating patch sections
Currently, in the git-log documentation, the reference to generating
patches does not match the section title. This can make the section
"Generating patch text with -p" hard to find, since typically readers of
the documentation will copy and paste to search the page.
Let's make this more convenient for readers by linking it directly to
the section.
Since git-log pulls in diff-generate-patch.txt, we can provide a direct
link to the section. Otherwise, change the verbiage to match exactly
what the section title is, to at least make searching for it an easier
task.
Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Thu, 12 Jan 2023 16:50:01 +0000 (16:50 +0000)]
rebase: cleanup "--exec" option handling
When handling "--exec" rebase collects the commands into a struct
string_list, then prepends "exec " to each command creating a multi line
string and finally splits that string back into a list of commands. This
is an artifact of the scripted rebase and the need to support "rebase
--preserve-merges". Now that "--preserve-merges" no-longer exists we can
cleanup the way the argument is handled. There is no need to add the
"exec " prefix to the commands as that is added by todo_list_to_strbuf().
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrei Rybak [Wed, 11 Jan 2023 23:32:42 +0000 (00:32 +0100)]
t7527: use test_when_finished in 'case insensitive+preserving'
Most tests in t7527-builtin-fsmonitor.sh that start a daemon, use the
helper function test_when_finished with stop_daemon_delete_repo.
Function stop_daemon_delete_repo explicitly stops the daemon. Calling
it via test_when_finished is needed for tests that don't check daemon's
automatic shutdown logic [1] and it is needed to avoid daemons being
left running in case of breakage of the logic of automatic shutdown of
the daemon.
Unlike these tests, test 'case insensitive+preserving' added in [2] has
a call to function test_when_finished commented out. It was commented
out in all versions of the patch [2] during development [3]. This seems
to not be intentional, because neither commit message in [2], nor the
comment above the test mention this line being commented out. Compare
it, for example, to "# unicode_debug=true" which is explicitly described
by a documentation comment above it.
Uncomment test_when_finished for stop_daemon_delete_repo in test 'case
insensitive+preserving' to ensure that daemons are not left running in
cases when automatic shutdown logic of daemon itself is broken.
[1] See documentation in "fsmonitor--daemon.h" for details.
[2] caa9c37ec0 (t7527: test FSMonitor on case insensitive+preserving
file system, 2022-05-26)
[3] See mailing list thread
https://lore.kernel.org/git/41f8cbc2ae45cb86e299eb230ad3cb0319256c37.1653601644.git.gitgitgadget@gmail.com/T/#t
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrei Rybak [Wed, 11 Jan 2023 23:32:40 +0000 (00:32 +0100)]
t6003: uncomment test '--max-age=c3, --topo-order'
Test '--max-age=c3, --topo-order' in t6003-rev-list-topo-order.sh has
been commented out as failing since its introduction in [1]. However,
the test is successful at least since commit [2] -- bisecting further is
harder because of incompatibility of such old Git code with modern
header file <openssl/bn.h> [3].
Uncomment this test to gain test coverage.
[1] f573571a21 ([PATCH] Add t/t6003 with some --topo-order tests,
2005-07-07)
[2] 765ac8ec46 (Rip out merge-order and make "git log <paths>..." work
again., 2006-02-28)
[3] BIGNUM used in git's `epoch.c` which was removed in [2] changed
significantly between OpenSSL 1.0.2 and OpenSSL 1.1.0
See also https://stackoverflow.com/a/42295243/1083697 and
https://lore.kernel.org/git/Y71qiCs+oAS2OegH@coredump.intra.peff.net/
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrei Rybak [Tue, 10 Jan 2023 09:32:51 +0000 (10:32 +0100)]
git-bisect-lk2009: update nist report link
Commit d656218a83 (docs/bisect-lk2009: update nist report link,
2017-04-20) replaced a dead link to news release on nist.gov. However,
this might be confusing to the reader (like myself) because the article
git-bisect-lk2009.txt quotes from the news release but the exact quote
cannot be found in the full report. In addition to that, the link added
in 2017 is also dead in 2023.
Replace the reference to nist.gov with an version of the original NIST
news release archived to the Wayback Machine. Include also an updated
link to a live version of the full report.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrei Rybak [Tue, 10 Jan 2023 09:32:50 +0000 (10:32 +0100)]
git-bisect-lk2009: update java code conventions link
A reference to Java Code Conventions in git-bisect-lk2009.txt uses an
outdated URL that redirects to table of contents for the conventions.
The actual claim about "80%" that this reference backs up is on the
first page of the conventions:
ISO 8601 permits "reduced precision" time representations to omit the
seconds value or both the minutes and the seconds values. The
abbreviate times could look like 17:45 or 1745 to omit the seconds,
or simply as 17 to omit both the minutes and the seconds.
parse_date_basic accepts the 17:45 format but it rejects the other two.
Change it to accept 4-digit and 2-digit time values when they follow a
recognized date and a 'T'.
Note: ISO 8601 also allows reduced precision date strings such as
"2022-12" and "2022". This patch does not attempt to address these.
Reported-by: Pat LaVarre <plavarre@purestorage.com> Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 12 Jan 2023 16:39:36 +0000 (11:39 -0500)]
t/interop: report which vanilla git command failed
The interop test library sets up wrappers "git.a" and "git.b" to
represent the two versions to be tested. It also wraps vanilla "git" to
report an error, with the goal of catching tests which accidentally fail
to use one of the version-specific wrappers (which could invalidate the
tests in a very subtle way).
But when it catches an invocation of vanilla git, it doesn't give any
details, which makes it very hard to debug exactly which invocation is
responsible (especially if it's buried in a function invocation, etc).
Let's report the arguments passed to git, which helps narrow it down.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Mon, 9 Jan 2023 19:45:08 +0000 (19:45 +0000)]
githooks: discuss Git operations in foreign repositories
Hook authors are periodically caught off-guard by difficult-to-diagnose
errors when their hook invokes Git commands in a repository other than
the local one. In particular, Git environment variables, such as GIT_DIR
and GIT_WORK_TREE, which reference the local repository cause the Git
commands to operate on the local repository rather than on the
repository which the author intended. This is true whether the
environment variables have been set manually by the user or
automatically by Git itself. The same problem crops up when a hook
invokes Git commands in a different worktree of the same repository, as
well.
Recommended best-practice[1,2,3,4,5,6] for avoiding this problem is for
the hook to ensure that Git variables are unset before invoking Git
commands in foreign repositories or other worktrees:
unset $(git rev-parse --local-env-vars)
However, this advice is not documented anywhere. Rectify this
shortcoming by mentioning it in githooks.txt documentation.
Yutaro Ohno [Mon, 9 Jan 2023 10:47:17 +0000 (10:47 +0000)]
doc: add "git switch -c" as another option on detached HEAD
In the "DETACHED HEAD" section in the git-checkout doc, it suggests
using "git checkout -b <branch-name>" to create a new branch on the
detached head.
On the other hand, when you checkout a commit that is not at the tip of
any named branch (e.g., when you checkout a tag), git suggests using
"git switch -c <branch-name>".
Add "git switch -c" as another option and mitigate this inconsistency.
Signed-off-by: Yutaro Ohno <yutaro.ono.418@gmail.com> Acked-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 10 Jan 2023 13:15:21 +0000 (13:15 +0000)]
git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
'ORIG_HEAD' is written at the start of the rebase, but is not guaranteed
to still point to the original branch tip at the end of the rebase.
Indeed, using other commands that write 'ORIG_HEAD' during the rebase,
like splitting a commit using 'git reset HEAD^', will lead to 'ORIG_HEAD'
being overwritten. This causes confusion for some users [1].
Add a note about that in the 'Description' section, and mention the more
robust alternative of using the branch's reflog.
Reported-by: Erik Cervin Edin <erik@cervined.in> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 10 Jan 2023 13:15:20 +0000 (13:15 +0000)]
revisions.txt: be explicit about commands writing 'ORIG_HEAD'
When mentioning 'ORIG_HEAD', be explicit about which command write that
pseudo-ref, namely 'git am', 'git merge', 'git rebase' and 'git reset'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 10 Jan 2023 13:15:19 +0000 (13:15 +0000)]
git-merge.txt: mention 'ORIG_HEAD' in the Description
The fact that 'git merge' writes 'ORIG_HEAD' before performing the merge
is missing from the documentation of the command.
Mention it in the 'Description' section.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 10 Jan 2023 13:15:18 +0000 (13:15 +0000)]
git-reset.txt: mention 'ORIG_HEAD' in the Description
The fact that 'git reset' writes 'ORIG_HEAD' before changing HEAD is
mentioned in an example, but is missing from the 'Description' section.
Mention it in the discussion of the "'git reset' [<mode>] [<commit>]"
form of the command.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 10 Jan 2023 13:15:17 +0000 (13:15 +0000)]
git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
Commit 67ac1e1d57 (cherry-pick/revert: add support for
-X/--strategy-option, 2010-12-10) added an example to the documentation
of 'git cherry-pick'. This example mentions how to abort a failed
cherry-pick and retry with an additional merge strategy option.
The command used in the example to abort the cherry-pick is 'git reset
--merge ORIG_HEAD', but cherry-pick does not write 'ORIG_HEAD' before
starting its operation. So this command would checkout a commit
unrelated to what was at HEAD when the user invoked cherry-pick.
Use 'git cherry-pick --abort' instead.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 12 Jan 2023 16:06:49 +0000 (11:06 -0500)]
object-file: fix indent-with-space
Commit b25562e63f (object-file: inline calls to read_object(),
2023-01-07) accidentally indented a conditional block with spaces
instead of a tab.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 1 Jan 2023 21:14:12 +0000 (22:14 +0100)]
add DUP_ARRAY
Add a macro for allocating and populating a shallow copy of an array.
It is intended to replace a sequence like this:
ALLOC_ARRAY(dst, n);
COPY_ARRAY(dst, src, n);
With the less repetitve:
DUP_ARRAY(dst, src, n);
It checks whether the types of source and destination are compatible to
ensure the copy can be used safely.
An easier alternative would be to only consider the source and return
a void pointer, that could be used like this:
dst = ARRAY_DUP(src, n);
That would be more versatile, as it could be used in declarations as
well. Making it type-safe would require the use of typeof_unqual from
C23, though.
So use the safe and compatible variant for now.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 1 Jan 2023 21:11:20 +0000 (22:11 +0100)]
do full type check in BARF_UNLESS_COPYABLE
Use __builtin_types_compatible_p to perform a full type check if
possible. Otherwise fall back to the old size comparison, but add a
non-evaluated assignment to catch more type mismatches. It doesn't flag
copies between arrays with different signedness, but that's as close to
a full type check as it gets without the builtin, as far as I can see.
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 1 Jan 2023 21:08:53 +0000 (22:08 +0100)]
factor out BARF_UNLESS_COPYABLE
Move the common basic element type check of COPY_ARRAY and MOVE_ARRAY to
a new macro. This reduces code duplication and simplifies adding more
elaborate checks.
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 8 Jan 2023 10:10:59 +0000 (11:10 +0100)]
mingw: make argv2 in try_shell_exec() non-const
Prepare for a stricter type check in COPY_ARRAY by removing the const
qualifier of argv2, like we already do to placate Visual Studio. We
have to add it back using explicit casts when actually using the
variable, unfortunately, because GCC (rightly) refuses to add it
implicitly. Similar casts are already used in mingw_execv().
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sun, 8 Jan 2023 04:25:19 +0000 (13:25 +0900)]
Merge branch 'cw/ci-whitespace'
CI updates. We probably want a clean-up to move the long shell
script embedded in yaml file into a separate file, but that can
come later.
* cw/ci-whitespace:
ci (check-whitespace): move to actions/checkout@v3
ci (check-whitespace): add links to job output
ci (check-whitespace): suggest fixes for errors
Junio C Hamano [Sun, 8 Jan 2023 04:25:19 +0000 (13:25 +0900)]
Merge branch 'js/drop-mingw-test-cmp'
Use `git diff --no-index` as a test_cmp on Windows.
We'd probably need to revisit "do we really want to, and have to,
lose CRLF vs LF?" later, at which time we may be able to further
clean this up by replacing "git diff --no-index" with "diff -u".
* js/drop-mingw-test-cmp:
tests(mingw): avoid very slow `mingw_test_cmp`
Jeff King [Sat, 7 Jan 2023 13:50:53 +0000 (08:50 -0500)]
packfile: inline custom read_object()
When the pack code was split into its own file[1], it got a copy of the
static read_object() function. But there's only one caller here, so we
could just inline it. And it's worth doing so, as the name read_object()
invites comparisons to the public read_object_file(), but the two don't
behave quite the same.
[1] The move happened over several commits, but the relevant one here is f1d8130be0 (pack: move clear_delta_base_cache(), packed_object_info(),
unpack_entry(), 2017-08-18).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only caller of read_object_file_extended() is the thin wrapper of
repo_read_object_file(). Instead of wrapping, let's just rename the
inner function and let people call it directly. This cleans up the
namespace and reduces confusion.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 7 Jan 2023 13:50:19 +0000 (08:50 -0500)]
read_object_file_extended(): drop lookup_replace option
Our sole caller always passes in "1", so we can just drop the parameter
entirely. Anybody who doesn't want this behavior could easily call
oid_object_info_extended() themselves, as we're just a thin wrapper
around it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 7 Jan 2023 13:49:15 +0000 (08:49 -0500)]
streaming: inline call to read_object_file_extended()
The open_istream_incore() function is the only direct user of
read_object_file_extended(), and the only caller which unsets the
lookup_replace flag. Since read_object_file_extended() is now just a
thin wrapper around oid_object_info_extended(), let's inline the call.
That will let us simplify read_object_file_extended() in the next patch.
The inlined version here is a few more lines because of the query setup,
but it's much more flexible, since we can pass (or omit) any flags we
want.
Note the updated comment in the istream struct definition. It was
already slightly wrong (we never called read_object(); it has been
read_object_file_extended() since day one), but should now be accurate.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 7 Jan 2023 13:48:55 +0000 (08:48 -0500)]
object-file: inline calls to read_object()
Since read_object() is these days just a thin wrapper around
oid_object_info_extended(), and since it only has two callers, let's
just inline those calls. This has a few positive outcomes:
- it's a net reduction in source code lines
- even though the callers end up with a few extra lines, they're now
more flexible and can use object_info flags directly. So no more
need to convert die_if_corrupt between parameter/flag, and we can
ask for lookup replacement with a flag rather than doing it
ourselves.
- there's one fewer function in an already crowded namespace (e.g.,
the difference between read_object() and read_object_file() was not
immediately obvious; now we only have one of them).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 7 Jan 2023 13:26:44 +0000 (08:26 -0500)]
convert trivial uses of strncmp() to skip_prefix()
As with the previous patch, using skip_prefix() is more readable and
less error-prone than a raw strncmp(), because it avoids a
manually-computed length. These cases differ from the previous patch
that uses starts_with() because they care about the value after the
matched prefix.
We can convert these to use skip_prefix() by introducing an extra
variable to hold the out-pointer.
Note in the case in ws.c that to get rid of the magic number "9"
completely, we also switch out "len" for recomputing the pointer
difference. These are equivalent because "len" is always "ep - string".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sat, 7 Jan 2023 13:26:18 +0000 (08:26 -0500)]
convert trivial uses of strncmp() to starts_with()
It's more readable to use starts_with() instead of strncmp() to match a
prefix, as the latter requires a manually-computed length, and has the
funny "matching is zero" return value common to cmp functions. This
patch converts several cases which were found with:
git grep 'strncmp(.*, [0-9]*)'
But note that it doesn't convert all such cases. There are several where
the magic length number is repeated elsewhere in the code, like:
/* handle "buf" which isn't NUL-terminated and might be too small */
if (len >= 3 && !strncmp(buf, "foo", 3))
or:
/* exact match for "foo", but within a larger string */
if (end - buf == 3 && !strncmp(buf, "foo", 3))
While it would not produce the wrong outcome to use starts_with() in
these cases, we'd still be left with one instance of "3". We're better
to leave them for now, as the repeated "3" makes it clear that the two
are linked (there may be other refactorings that handle both, but
they're out of scope for this patch).
A few things to note while reading the patch:
- all cases but one are trying to match, and so lose the extra "!".
The case in the first hunk of urlmatch.c is not-matching, and hence
gains a "!".
- the case in remote-fd.c is matching the beginning of "connect foo",
but we never look at str+8 to parse the "foo" part (which would make
this a candidate for skip_prefix(), not starts_with()). This seems
at first glance like a bug, but is a limitation of how remote-fd
works.
- the second hunk in urlmatch.c shows some cases adjacent to other
strncmp() calls that are left. These are of the "exact match within
a larger string" type, as described above.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrei Rybak [Sat, 7 Jan 2023 13:56:55 +0000 (14:56 +0100)]
*: fix typos which duplicate a word
Fix typos in code comments which repeat various words. Most of the
cases are simple in that they repeat a word that usually cannot be
repeated in a grammatically correct sentence. Just remove the
incorrectly duplicated word in these cases and rewrap text, if needed.
A tricky case is usage of "that that", which is sometimes grammatically
correct. However, an instance of this in "t7527-builtin-fsmonitor.sh"
doesn't need two words "that", because there is only one daemon being
discussed, so replace the second "that" with "the".
Reword code comment "entries exist on on-disk index" in function
update_one in file cache-tree.c, by replacing incorrect preposition "on"
with "in".
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 6 Jan 2023 16:31:56 +0000 (16:31 +0000)]
features: feature.manyFiles implies fast index writes
The recent addition of the index.skipHash config option allows index
writes to speed up by skipping the hash computation for the trailing
checksum. This is particularly critical for repositories with many files
at HEAD, so add this config option to two cases where users in that
scenario may opt-in to such behavior:
1. The feature.manyFiles config option enables some options that are
helpful for repositories with many files at HEAD.
2. 'scalar register' and 'scalar reconfigure' set config options that
optimize for large repositories.
In both of these cases, set index.skipHash=true to gain this
speedup. Add tests that demonstrate the proper way that
index.skipHash=true can override feature.manyFiles=true.
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 6 Jan 2023 16:31:55 +0000 (16:31 +0000)]
test-lib-functions: add helper for trailing hash
It can be helpful to check that a file format with a trailing hash has a
specific hash in the final bytes of a written file. This is made more
apparent by recent changes that allow skipping the hash algorithm and
writing a null hash at the end of the file instead.
Add a new test_trailing_hash helper and use it in t1600 to verify that
index.skipHash=true really does skip the hash computation, since
'git fsck' does not actually verify the hash. This confirms that when
the config is disabled explicitly in a super project but enabled in a
submodule, then the use of repo_config_get_bool() loads config from the
correct repository in the case of 'git add'. There are other cases where
istate->repo is NULL and thus this config is loaded instead from
the_repository, but that's due to many different code paths initializing
index_state structs in their own way.
Keep the 'git fsck' call to ensure that any potential future change to
check the index hash does not cause an error in this case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 6 Jan 2023 16:31:54 +0000 (16:31 +0000)]
read-cache: add index.skipHash config option
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.
One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.
This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
reasons:
1. Writes block users. Writing the index takes place in many user-
blocking foreground operations. The speed improvement directly
impacts their use. Other file formats are typically written in the
background (commit-graph, multi-pack-index) or are super-critical to
correctness (pack-files).
2. Index files are short lived. It is rare that a user leaves an index
for a long time with many staged changes. Outside of staged changes,
the index can be completely destroyed and rewritten with minimal
impact to the user.
Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.
We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.
While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.
The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was split out to just
the index in a33fc72fe91 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.
Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.
As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.
Benchmark 1: with hash
Time (mean ± σ): 46.3 ms ± 13.8 ms [User: 34.3 ms, System: 11.9 ms]
Range (min … max): 34.3 ms … 79.1 ms 82 runs
Benchmark 2: without hash
Time (mean ± σ): 26.0 ms ± 7.9 ms [User: 11.8 ms, System: 14.2 ms]
Range (min … max): 16.3 ms … 42.0 ms 69 runs
Summary
'without hash' ran
1.78 ± 0.76 times faster than 'with hash'
These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.
Test this new config option, both at a command-line level and within a
submodule. The confirmation is currently limited to confirm that 'git
fsck' does not complain about the index. Future updates will make this
test more robust.
It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 6 Jan 2023 16:31:53 +0000 (16:31 +0000)]
hashfile: allow skipping the hash function
The hashfile API is useful for generating files that include a trailing
hash of the file's contents up to that point. Using such a hash is
helpful for verifying the file for corruption-at-rest, such as a faulty
drive causing flipped bits.
Git's index file includes this trailing hash, so it uses a 'struct
hashfile' to handle the I/O to the file. This was very convenient to
allow using the hashfile methods during these operations.
However, hashing the file contents during write comes at a performance
penalty. It's slower to hash the bytes on their way to the disk than
without that step. This problem is made worse by the replacement of
hardware-accelerated SHA1 computations with the software-based sha1dc
computation.
This write cost is significant, and the checksum capability is likely
not worth that cost for such a short-lived file. The index is rewritten
frequently and the only time the checksum is checked is during 'git
fsck'. Thus, it would be helpful to allow a user to opt-out of the hash
computation.
We first need to allow Git to opt-out of the hash computation in the
hashfile API. The buffered writes of the API are still helpful, so it
makes sense to make the change here.
Introduce a new 'skip_hash' option to 'struct hashfile'. When set, the
update_fn and final_fn members of the_hash_algo are skipped. When
finalizing the hashfile, the trailing hash is replaced with the null
hash.
This use of a trailing null hash would be desireable in either case,
since we do not want to special case a file format to have a different
length depending on whether it was hashed or not. When the final bytes
of a file are all zero, we can infer that it was written without
hashing, and thus that verification is not available as a check for file
consistency. This also means that we could easily toggle hashing for any
file format we desire.
A version of this patch has existed in the microsoft/git fork since
2017 [1] (the linked commit was rebased in 2018, but the original dates
back to January 2017). Here, the change to make the index use this fast
path is delayed until a later change.
Co-authored-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 6 Jan 2023 11:05:00 +0000 (06:05 -0500)]
diff: drop "name" parameter from prepare_temp_file()
The prepare_temp_file() function takes a diff_filespec as well as a
filename. But it is almost certainly an error to pass in a name that
isn't the filespec's "path" parameter, since that is the only thing that
reliably tells us how to find the content (and indeed, this was the
source of a recently-fixed bug).
So let's drop the redundant "name" parameter and just use one->path
throughout the function. This simplifies the interface a little bit, and
makes it impossible for calling code to get it wrong.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 6 Jan 2023 11:04:18 +0000 (06:04 -0500)]
diff: clean up external-diff argv setup
Since the previous commit, setting up the tempfile for an external diff
uses df->path from the diff_filespec, rather than the logical name. This
means add_external_diff_name() does not need to take a "name" parameter
at all, and we can drop it. And that in turn lets us simplify the
conditional for handling renames (when the "other" name is non-NULL).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 6 Jan 2023 11:03:57 +0000 (06:03 -0500)]
diff: use filespec path to set up tempfiles for ext-diff
When we're going to run an external diff, we have to make the contents
of the pre- and post-images available either by dumping them to a
tempfile, or by pointing at a valid file in the worktree. The logic of
this is all handled by prepare_temp_file(), and we just pass in the
filename and the diff_filespec.
But there's a gotcha here. The "filename" we have is a logical filename
and not necessarily a path on disk or in the repository. This matters in
at least one case: when using "--relative", we may have a name like
"foo", even though the file content is found at "subdir/foo". As a
result, we look for the wrong path, fail to find "foo", and claim that
the file has been deleted (passing "/dev/null" to the external diff,
rather than the correct worktree path).
We can fix this by passing the pathname from the diff_filespec, which
should always be a full repository path (and that's what we want even if
reusing a worktree file, since we're always operating from the top-level
of the working tree).
The breakage seems to go all the way back to cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12). As far as I can tell, before then "name" would always have
been the same as the filespec's "path".
There are two related cases I looked at that aren't buggy:
1. the only other caller of prepare_temp_file() is run_textconv(). But
it always passes the filespec's path field, so it's OK.
2. I wondered if file renames/copies might cause similar confusion.
But they don't, because run_external_diff() receives two names in
that case: "name" and "other", which correspond to the two sides of
the diff. And we did correctly pass "other" when handling the
post-image side. Barring the use of "--relative", that would always
match "two->path", the path of the second filespec (and the rename
destination).
So the only bug is just the interaction with external diff drivers and
--relative.
Reported-by: Carl Baldwin <carl@ecbaldwin.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 6 Jan 2023 08:48:23 +0000 (03:48 -0500)]
test-bundle-uri: drop unused variables
Commit 70b9c10373 (bundle-uri client: add helper for testing server,
2022-12-22) added a cmd_ls_remote() function which contains "uploadpack"
and "server_options" variables. Neither of these variables is ever
modified after being initialized, so the code to handle non-NULL and
non-empty values is impossible to reach.
While in theory we might add command-line parsing to set these, let's
drop the dead code for now in the name of cleanliness. It's easy enough
to add it back later if need be.
Noticed by Coverity.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 5 Jan 2023 06:07:22 +0000 (15:07 +0900)]
Merge branch 'ab/no-more-git-global-super-prefix'
Stop using "git --super-prefix" and narrow the scope of its use to
the submodule--helper.
* ab/no-more-git-global-super-prefix:
read-tree: add "--super-prefix" option, eliminate global
submodule--helper: convert "{update,clone}" to their own "--super-prefix"
submodule--helper: convert "status" to its own "--super-prefix"
submodule--helper: convert "sync" to its own "--super-prefix"
submodule--helper: convert "foreach" to its own "--super-prefix"
submodule--helper: don't use global --super-prefix in "absorbgitdirs"
submodule.c & submodule--helper: pass along "super_prefix" param
read-tree + fetch tests: test failing "--super-prefix" interaction
submodule absorbgitdirs tests: add missing "Migrating git..." tests