Junio C Hamano [Tue, 13 Sep 2022 18:38:24 +0000 (11:38 -0700)]
Merge branch 'sg/parse-options-subcommand'
The codepath for the OPT_SUBCOMMAND facility has been cleaned up.
* sg/parse-options-subcommand:
notes, remote: show unknown subcommands between `'
notes: simplify default operation mode arguments check
test-parse-options.c: fix style of comparison with zero
test-parse-options.c: don't use for loop initial declaration
t0040-parse-options: remove leftover debugging
Junio C Hamano [Tue, 13 Sep 2022 18:38:24 +0000 (11:38 -0700)]
Merge branch 'jk/rev-list-verify-objects-fix'
"git rev-list --verify-objects" ought to inspect the contents of
objects and notice corrupted ones, but it didn't when the commit
graph is in use, which has been corrected.
* jk/rev-list-verify-objects-fix:
rev-list: disable commit graph with --verify-objects
lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
Junio C Hamano [Tue, 13 Sep 2022 18:38:23 +0000 (11:38 -0700)]
Merge branch 'jk/upload-pack-skip-hash-check'
The server side that responds to "git fetch" and "git clone"
request has been optimized by allowing it to send objects in its
object store without recomputing and validating the object names.
* jk/upload-pack-skip-hash-check:
t1060: check partial clone of misnamed blob
parse_object(): check commit-graph when skip_hash set
upload-pack: skip parse-object re-hashing of "want" objects
parse_object(): allow skipping hash check
Junio C Hamano [Tue, 13 Sep 2022 18:38:23 +0000 (11:38 -0700)]
Merge branch 'rs/diff-no-index-cleanup'
"git diff --no-index A B" managed its the pathnames of its two
input files rather haphazardly, sometimes leaking them. The
command line argument processing has been straightened out to clean
it up.
* rs/diff-no-index-cleanup:
diff-no-index: simplify argv index calculation
diff-no-index: release prefixed filenames
diff-no-index: release strbuf on queue error
Junio C Hamano [Fri, 9 Sep 2022 19:02:25 +0000 (12:02 -0700)]
Merge branch 'jc/format-patch-force-in-body-from'
"git format-patch --from=<ident>" can be told to add an in-body
"From:" line even for commits that are authored by the given
<ident> with "--force-in-body-from"option.
* jc/format-patch-force-in-body-from:
format-patch: learn format.forceInBodyFrom configuration variable
format-patch: allow forcing the use of in-body From: header
pretty: separate out the logic to decide the use of in-body from
Junio C Hamano [Fri, 9 Sep 2022 19:02:24 +0000 (12:02 -0700)]
Merge branch 'js/add-p-diff-parsing-fix'
Those who use diff-so-fancy as the diff-filter noticed a regression
or two in the code that parses the diff output in the built-in
version of "add -p", which has been corrected.
* js/add-p-diff-parsing-fix:
add -p: ignore dirty submodules
add -p: gracefully handle unparseable hunk headers in colored diffs
add -p: detect more mismatches between plain vs colored diffs
Jeff King [Wed, 7 Sep 2022 21:02:20 +0000 (17:02 -0400)]
t1060: check partial clone of misnamed blob
A recent commit (upload-pack: skip parse-object re-hashing of "want"
objects, 2022-09-06) loosened the behavior of upload-pack so that it
does not verify the sha1 of objects it receives directly via "want"
requests. The existing corruption tests in t1060 aren't affected by
this: the corruptions are blobs reachable from commits, and the client
requests the commits.
The more interesting case here is a partial clone, where the client will
directly ask for the corrupted blob when it does an on-demand fetch of
the filtered object. And that is not covered at all, so let's add a
test.
It's important here that we use the "misnamed" corruption and not
"bit-error". The latter is sufficiently corrupted that upload-pack
cannot even figure out the type of the object, so it bails identically
both before and after the recent change. But with "misnamed", with the
hash-checks enabled it sees the problem (though the error messages are a
bit confusing because of the inability to create a "struct object" to
store the flags):
After the change to skip the hash check, the server side happily sends
the bogus object, but the client correctly realizes that it did not get
the necessary data:
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (1/1), 49 bytes | 49.00 KiB/s, done.
fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
error: [...]/misnamed did not send all necessary objects
which is exactly what we expect to happen.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Wed, 7 Sep 2022 11:45:42 +0000 (13:45 +0200)]
diff-no-index: simplify argv index calculation
Since 16bb3d714d (diff --no-index: use parse_options() instead of
diff_opt_parse(), 2019-03-24) argc must be 2 if we reach the loop, i.e.
argc - 2 == 0. Remove that inconsequential term.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Wed, 7 Sep 2022 11:36:53 +0000 (13:36 +0200)]
diff-no-index: release strbuf on queue error
The strbuf is small and we are about to exit, so we could leave its
cleanup to the OS. If we release it explicitly at all, however, then we
should do it on early exit as well. Move the strbuf_release call to a
new cleanup section at the end and make sure all execution paths go
through it.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 6 Sep 2022 23:06:25 +0000 (19:06 -0400)]
parse_object(): check commit-graph when skip_hash set
If the caller told us that they don't care about us checking the object
hash, then we're free to implement any optimizations that get us the
parsed value more quickly. An obvious one is to check the commit graph
before loading an object from disk. And in fact, both of the callers who
pass in this flag are already doing so before they call parse_object()!
So we can simplify those callers, as well as any possible future ones,
by moving the logic into parse_object().
There are two subtle things to note in the diff, but neither has any
impact in practice:
- it seems least-surprising here to do the graph lookup on the
git-replace'd oid, rather than the original. This is in theory a
change of behavior from the earlier code, as neither caller did a
replace lookup itself. But in practice it doesn't matter, as we
disable the commit graph entirely if there are any replace refs.
- the caller in get_reference() passes the skip_hash flag only if
revs->verify_objects isn't set, whereas it would look in the commit
graph unconditionally. In practice this should not matter as we
should disable the commit graph entirely when using verify_objects
(and that was done recently in another patch).
So this should be a pure cleanup with no behavior change.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 6 Sep 2022 23:05:42 +0000 (19:05 -0400)]
upload-pack: skip parse-object re-hashing of "want" objects
Imagine we have a history with commit C pointing to a large blob B.
If a client asks us for C, we can generally serve both objects to them
without accessing the uncompressed contents of B. In upload-pack, we
figure out which commits we have and what the client has, and feed those
tips to pack-objects. In pack-objects, we traverse the commits and trees
(or use bitmaps!) to find the set of objects needed, but we never open
up B. When we serve it to the client, we can often pass the compressed
bytes directly from the on-disk packfile over the wire.
But if a client asks us directly for B, perhaps because they are doing
an on-demand fetch to fill in the missing blob of a partial clone, we
end up much slower. Upload-pack calls parse_object() on the oid we
receive, which opens up the object and re-checks its hash (even though
if it were a commit, we might skip this parse entirely in favor of the
commit graph!). And then we feed the oid directly to pack-objects, which
again calls parse_object() and opens the object. And then finally, when
we write out the result, we may send bytes straight from disk, but only
after having unnecessarily uncompressed and computed the sha1 of the
object twice!
This patch teaches both code paths to use the new SKIP_HASH_CHECK flag
for parse_object(). You can see the speed-up in p5600, which does a
blob:none clone followed by a checkout. The savings for git.git are
modest:
Test HEAD^ HEAD
----------------------------------------------------------------------
5600.3: checkout of result 2.23(4.19+0.24) 1.72(3.79+0.18) -22.9%
But the savings scale with the number of bytes. So on a repository like
linux.git with more files, we see more improvement (in both absolute and
relative numbers):
Test HEAD^ HEAD
----------------------------------------------------------------------------
5600.3: checkout of result 51.62(77.26+2.76) 34.86(61.41+2.63) -32.5%
And here's an even more extreme case. This is the android gradle-plugin
repository, whose tip checkout has ~3.7GB of files:
Test HEAD^ HEAD
--------------------------------------------------------------------------
5600.3: checkout of result 79.51(90.84+5.55) 40.28(51.88+5.67) -49.3%
Keep in mind that these timings are of the whole checkout operation. So
they count the client indexing the pack and actually writing out the
files. If we want to see just the server's view, we can hack up the
GIT_TRACE_PACKET output from those operations and replay it via
upload-pack. For the gradle example, that gives me:
Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input
Time (mean ± σ): 50.884 s ± 0.239 s [User: 51.450 s, System: 1.726 s]
Range (min … max): 50.608 s … 51.025 s 3 runs
Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input
Time (mean ± σ): 9.728 s ± 0.112 s [User: 10.466 s, System: 1.535 s]
Range (min … max): 9.618 s … 9.842 s 3 runs
Summary
'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran
5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input'
So a server would see an 80% reduction in CPU serving the initial
checkout of a partial clone for this repository. Or possibly even more
depending on the packing; most of the time spent in the faster one were
objects we had to open during the write phase.
In both cases skipping the extra hashing on the server should be pretty
safe. The client doesn't trust the server anyway, so it will re-hash all
of the objects via index-pack. There is one thing to note, though: the
change in get_reference() affects not just pack-objects, but rev-list,
git-log, etc. We could use a flag to limit to index-pack here, but we
may already skip hash checks in this instance. For commits, we'd skip
anything we load via the commit-graph. And while before this commit we
would check a blob fed directly to rev-list on the command-line, we'd
skip checking that same blob if we found it by traversing a tree.
The exception for both is if --verify-objects is used. In that case,
we'll skip this optimization, and the new test makes sure we do this
correctly.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 6 Sep 2022 23:01:34 +0000 (19:01 -0400)]
parse_object(): allow skipping hash check
The parse_object() function checks the object hash of any object it
parses. This is a nice feature, as it means we may catch bit corruption
during normal use, rather than waiting for specific fsck operations.
But it also can be slow. It's particularly noticeable for blobs, where
except for the hash check, we could return without loading the object
contents at all. Now one may wonder what is the point of calling
parse_object() on a blob in the first place then, but usually it's not
intentional: we were fed an oid from somewhere, don't know the type, and
want an object struct. For commits and trees, the parsing is usually
helpful; we're about to look at the contents anyway. But this is less
true for blobs, where we may be collecting them as part of a
reachability traversal, etc, and don't actually care what's in them. And
blobs, of course, tend to be larger.
We don't want to just throw out the hash-checks for blobs, though. We do
depend on them in some circumstances (e.g., rev-list --verify-objects
uses parse_object() to check them). It's only the callers that know
how they're going to use the result. And so we can help them by
providing a special flag to skip the hash check.
We could just apply this to blobs, as they're going to be the main
source of performance improvement. But if a caller doesn't care about
checking the hash, we might as well skip it for other object types, too.
Even though we can't avoid reading the object contents, we can still
skip the actual hash computation.
If this seems like it is making Git a little bit less safe against
corruption, it may be. But it's part of a series of tradeoffs we're
already making. For instance, "rev-list --objects" does not open the
contents of blobs it prints. And when a commit graph is present, we skip
opening most commits entirely. The important thing will be to use this
flag in cases where it's safe to skip the check. For instance, when
serving a pack for a fetch, we know the client will fully index the
objects and do a connectivity check itself. There's little to be gained
from the server side re-hashing a blob itself. And indeed, most of the
time we don't! The revision machinery won't open up a blob reached by
traversal, but only one requested directly with a "want" line. So
applied properly, this new feature shouldn't make anything less safe in
practice.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 5 Sep 2022 18:50:07 +0000 (20:50 +0200)]
notes, remote: show unknown subcommands between `'
Update the "unknown subcommand" error message in 'git notes' and 'git
remote' to wrap the offending argument between `', to make it
consistent with the "unknown switch/option/subcommand" error messages
in parse-options.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git notes' has a default operation mode, but when invoked without a
subcommand it doesn't accept any arguments (although the 'list'
subcommand implementing the default operation mode does accept
arguments). The condition checking this ended up a bit awkward, so
let's make it clearer.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 6 Sep 2022 21:04:35 +0000 (17:04 -0400)]
rev-list: disable commit graph with --verify-objects
Since the point of --verify-objects is to actually load and checksum the
bytes of each object, optimizing out reads using the commit graph runs
contrary to our goal.
The most targeted way to implement this would be for the revision
traversal code to check revs->verify_objects and avoid using the commit
graph. But it's difficult to be sure we've hit all of the correct spots.
For instance, I started this patch by writing the first of the included
test cases, where the corrupted commit is directly on rev-list's command
line. And that is easy to fix by teaching get_reference() to check
revs->verify_objects before calling lookup_commit_in_graph().
But that doesn't cover the second test case: when we traverse to a
corrupted commit, we'd parse the parent in process_parents(). So we'd
need to check there, too. And it keeps going. In handle_commit() we
sometimes parses commits, too, though I couldn't figure out a way to
trigger it that did not already parse via get_reference() or tag
peeling. And try_to_simplify_commit() has its own parse call, and so on.
So it seems like the safest thing is to just disable the commit graph
for the whole process when we see the --verify-objects option. We can do
that either in builtin/rev-list.c, where we use the option, or in
revision.c, where we parse it. There are some subtleties:
- putting it in rev-list.c is less surprising in some ways, because
there we know we are just doing a single traversal. In a command
which does multiple traversals in a single process, it's rather
unexpected to globally disable the commit graph.
- putting it in revision.c is less surprising in some ways, because
the caller does not have to remember to disable the graph
themselves. But this is already tricky! The verify_objects flag in
rev_info doesn't do anything by itself. The caller has to provide an
object callback which does the right thing.
- for that reason, in practice nobody but rev-list uses this option in
the first place. So the distinction is probably not important either
way. Arguably it should just be an option of rev-list, and not the
general revision machinery; right now you can run "git log
--verify-objects", but it does not actually do anything useful.
- checking for a parsed revs.verify_objects flag in rev-list.c is too
late. By that time we've already passed the arguments to
setup_revisions(), which will have parsed the commits using the
graph.
So this commit disables the graph as soon as we see the option in
revision.c. That's a pretty broad hammer, but it does what we want, and
in practice nobody but rev-list is using this flag anyway.
The tests cover both the "tip" and "parent" cases. Obviously our hammer
hits them both in this case, but it's good to check both in case
somebody later tries the more focused approach.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 6 Sep 2022 21:02:13 +0000 (17:02 -0400)]
lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
We exit early from lookup_commit_in_graph() if the commit_graph pointer
is NULL, under the assumption that we don't have a graph to look at. But
the graph pointer is lazy-loaded; if no other code happens to have
called prepare_commit_graph(), we'll incorrectly assume that one isn't
available at all.
This has a pretty small performance impact in practice, because the
fallback will generally be to call parse_object() instead. That ends up
in parse_commit_buffer(), which loads the graph data itself. So the
first commit we see won't use the graph, but subsequent ones will. Since
using the graph is just an optimization there's generally no
user-visible difference, but if you instrument rev-list like so:
diff --git a/revision.c b/revision.c
index ee702e498a..63c488ffb6 100644
--- a/revision.c
+++ b/revision.c
@@ -381,6 +381,9 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
* parsing commit data from disk.
*/
commit = lookup_commit_in_graph(revs->repo, oid);
+ warning("%s %s in commit graph",
+ commit ? "found" : "did not find",
+ name);
if (commit)
object = &commit->object;
else
warning: did not find origin/master in commit graph
warning: found origin/next in commit graph
After this patch, you'll see that we find both:
warning: found origin/master in commit graph
warning: found origin/next in commit graph
Even though the performance implication is small here, there are two
important reasons to do this:
- it's downright confusing if you are hunting a bug triggered by the
use of the commit graph. It may or may not trigger depending on the
number and ordering of tips you ask for.
- prepare_commit_graph() has other policy logic, too. In particular,
if we've loaded a commit graph and then disabled the graph via
disable_commit_graph(), that should take precedence.
I'm not sure if this can trigger bad behavior in practice. The only
caller there is upload-pack's deepen_by_rev_list(), which should be
avoiding the commit graph for its traversal tips, but probably
wasn't before this patch. Whether you could come up with a case
where that mattered is unclear. Still, this is obviously the right
thing to be doing.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 6 Sep 2022 01:33:40 +0000 (18:33 -0700)]
Merge branch 'es/t4301-sed-portability-fix'
Test clean-up.
* es/t4301-sed-portability-fix:
t4301: emit blank line in more idiomatic fashion
t4301: fix broken &&-chains and add missing loop termination
t4301: account for behavior differences between sed implementations
Junio C Hamano [Tue, 6 Sep 2022 01:33:40 +0000 (18:33 -0700)]
Merge branch 'rs/tempfile-cleanup-race-fix'
The clean-up of temporary files created via mks_tempfile_dt() was
racy and attempted to unlink() the leading directory when signals
are involved, which has been corrected.
Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry *' if one
exists in the list of tree(s) being unpacked in 'unpack_callback()'.
Currently, 'is_sparse_directory_entry()' is called with the first
'name_entry' in the 'names' list of entries on 'unpack_callback()'. However,
this entry may be empty even when other elements of 'names' are not (such as
when switching from an orphan branch back to a "normal" branch). As a
result, 'is_sparse_directory_entry()' could incorrectly indicate that a
sparse directory is *not* actually sparse because the name of the index
entry does not match the (empty) 'name_entry' path.
Fix the issue by using the existing 'name_entry *p' value in
'unpack_callback()', which points to the first non-empty entry in 'names'.
Because 'p' is 'const', also update 'is_sparse_directory_entry()'s
'name_entry *' argument to be 'const'.
Finally, add a regression test case.
Reported-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix bad config API usage added in a452128a36c (submodule--helper:
introduce add-config subcommand, 2021-08-06). After
git_config_get_string() returns successfully we know the "char **dest"
will be non-NULL.
A coccinelle patch that transforms this turns up a couple of other
such issues, one in fetch-pack.c, and another in upload-pack.c:
submodule--helper: libify even more "die" paths for module_update()
As noted in a preceding commit the get_default_remote_submodule() and
remote_submodule_branch() functions would invoke die(), and thus leave
update_submodule() only partially lib-ified. We've addressed the
former of those in a preceding commit, let's now address the latter.
In addition to lib-ifying the function this fixes a potential (but
obscure) segfault introduced by a logic error in 1012a5cbc3f (submodule--helper run-update-procedure: learn --remote,
2022-03-04):
We were assuming that remote_submodule_branch() would always return
non-NULL, but if the submodule_from_path() call in that function fails
we'll return NULL. See its introduction in 92bbe7ccf1f (submodule--helper: add remote-branch helper,
2016-08-03). I.e. we'd previously have segfaulted in the xstrfmt()
call in update_submodule() seen in the context.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: libify more "die" paths for module_update()
As noted in a preceding commit the get_default_remote_submodule() and
remote_submodule_branch() functions would invoke die(), and thus leave
update_submodule() only partially lib-ified. Let's address the former
of those cases.
Change the functions to return an int exit code (non-zero on failure),
while leaving the get_default_remote() function for the callers that
still want the die() semantics.
This change addresses 1/2 of the "die" issue in these two lines in
update_submodule():
We can safely remove the "!default_remote" case from sync_submodule(),
because our get_default_remote_submodule() function now returns a
die_message() on failure, so we can have it and other callers check if
the exit code should be non-zero instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix code added in ce125d431aa (submodule: extract path to submodule
gitdir func, 2021-09-15) and a77c3fcb5ec (submodule--helper: get
remote names from any repository, 2022-03-04) which failed to check
the return values of repo_init() and repo_submodule_init(). If we
failed to initialize the repository or submodule we could segfault
when trying to access the invalid repository structs.
Let's also check that these were the only such logic errors in the
codebase by making use of the "warn_unused_result" attribute. This is
valid as of GCC 3.4.0 (and clang will catch it via its faking of
__GNUC__ ).
As the comment being added to git-compat-util.h we're piggy-backing on
the LAST_ARG_MUST_BE_NULL version check out of lazyness. See 9fe3edc47f1 (Add the LAST_ARG_MUST_BE_NULL macro, 2013-07-18) for its
addition. The marginal benefit of covering gcc 3.4.0..4.0.0 is
near-zero (or zero) at this point. It mostly matters that we catch
this somewhere.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the libification of codepaths that previously relied on
"must_die_on_failure". In these cases we've always been early aborting
by calling die(), but as we know that these codepaths will properly
handle return codes of 128 to mean an early abort let's have them use
die_message() instead.
This still isn't a complete migration away from die() for these
codepaths, in particular this code in update_submodule() will still call die() in some cases:
When "git submodule update" runs it might call "checkout", "merge",
"rebase", or a custom command. Ever since run_update_command() was
added in c51f8f94e5b (submodule--helper: run update procedures from C,
2021-08-24) we'd either exit immediately if the
"submodule.<name>.update" method failed, or in the case of "checkout"
continue trying to update other submodules.
This code used to use the magical "2" return code, but in 55b3f12cb54 (submodule update: use die_message(), 2022-03-15) it was
made to exit(128), which in preceding commits has been changed to
return that 128 code to the top-level.
Let's "libify" this code even more by not having it arbitrarily
override the return code. In practice this doesn't change anything as
the code "git checkout" would return on any normal failure is "1", but
we'll now in principle properly abort the operation if "git checkout"
were to exit with 128.
It would make sense to follow-up this change with a change to allow
the "submodule.<name>.update = !..." (SM_UPDATE_COMMAND) method the
same liberties as "checkout", and perhaps to do the same with a failed
"merge" or "rebase". But let's leave that for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preceding commits the codepaths around update_submodules() were
changed from using exit() or die() to ferrying up a
"must_die_on_failure" in the cases where we'd exit(), and in most
cases where we'd die().
We needed to do this this to ensure that we'd early exit or otherwise
abort the update_submodules() processing before it was completed.
Now that those preceding changes have shown that we've converted those
paths, we can remove the remaining "ret == 128" special-cases, leaving
the only such special-case in update_submodules(). I.e. we now know
after having gone through the various codepaths that we were only
returning 128 if we meant to early abort.
In update_submodules() we'll for now set any non-zero non-128 exit
codes to "1", but will start ferrying up the exit code as-is in a
subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Libify the determine_submodule_update_strategy() by having it invoke
die_message() rather than die(), and returning the code die_message()
returns on failure.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: don't exit() on failure, return
Change code downstream of module_update() to short-circuit and return
to the top-level on failure, rather than calling exit().
To do so we need to diligently check whether we "must_die_on_failure",
which is a pattern started in c51f8f94e5b (submodule--helper: run
update procedures from C, 2021-08-24), but which hadn't been completed
to the point where we could avoid calling exit() here.
This introduces no functional changes, but makes it easier to both
call these routines as a library in the future, and to eventually
avoid leaking memory.
This and similar control flow in submodule--helper.c could be made
simpler by properly "libifying" it, i.e. to have it consistently
return -1 on failures, and to early return on any non-success.
But let's leave that larger project for now, and (mostly) emulate what
were doing with the "exit(128)" before this change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: use "code" in run_update_command()
Apply some DRY principles in run_update_command() and don't have two
"switch" statements over "ud->update_strategy.type" determine the same
thing.
First we were setting "must_die_on_failure = 1" in all cases except
"SM_UPDATE_CHECKOUT" (and we'd BUG(...) out on the rest). This code
was added in c51f8f94e5b (submodule--helper: run update procedures
from C, 2021-08-24).
Then we'd duplicate same "switch" logic when we were using the
"must_die_on_failure" variable.
Let's instead have the "case" branches in that inner "switch"
determine whether or not the "update must continue" by picking an exit
code.
This also mostly avoids hardcoding the "128" exit code, instead we can
make use of the return value of the die_message() function, which
we've been calling here since 55b3f12cb54 (submodule update: use
die_message(), 2022-03-15). We're still hardcoding it to determine if
we "exit()", but subsequent commit(s) will address that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string()
Change the submodule_strategy_to_string() function added in 3604242f080 (submodule: port init from shell to C, 2016-04-15) to
really return a "const char *". In the "SM_UPDATE_COMMAND" case it
would return a strbuf_detach().
Furthermore, this function would return NULL on SM_UPDATE_UNSPECIFIED,
so it wasn't safe to xstrdup() its return value in the general case,
or to use it in a sprintf() format as the code removed in the
preceding commit did.
But its callers would never call it with either SM_UPDATE_UNSPECIFIED
or SM_UPDATE_COMMAND. Let's have its behavior reflect how its only
user expects it to behave, and BUG() out on the rest.
By doing this we can also stop needlessly xstrdup()-ing and free()-ing
the memory for the config we're setting. We can instead always use
constant strings. We can also use the *_tmp() variant of
git_config_get_string().
Let's also rename this submodule_strategy_to_string() function to
submodule_update_type_to_string(). Now that it's only tasked with
returning a string version of the "enum submodule_update_type type".
Before it would look at the "command" field in "struct
submodule_update_strategy".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: don't call submodule_strategy_to_string() in BUG()
Don't call submodule_strategy_to_string() in a BUG() message. These
calls added in c51f8f94e5b (submodule--helper: run update procedures
from C, 2021-08-24) don't need the extra information
submodule_strategy_to_string() gives us, as we'll never reach the
SM_UPDATE_COMMAND case here.
That case is the only one where we'd get any information beyond the
straightforward number-to-string mapping.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: add missing braces to "else" arm
Add missing braces to an "else" arm in init_submodule(), this
stylistic change makes this code conform to the CodingGuidelines, and
makes a subsequent commit smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: return "ret", not "1" from update_submodule()
Amend the update_submodule() function to return the failing "ret" on
error, instead of overriding it with "1".
This code was added in b3c5f5cb048 (submodule: move core cmd_update()
logic to C, 2022-03-15), and this change ends up not making a
difference as this function is only called in update_submodules(). If
we return non-zero here we'll always in turn return "1" in
module_update().
But if we didn't do that and returned any other non-zero exit code in
update_submodules() we'd fail the test that's being amended
here. We're still testing the status quo here.
This change makes subsequent refactoring of update_submodule() easier,
as we'll no longer need to worry about clobbering the "ret" we get
from the run_command().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "res" variable added in b3c5f5cb048 (submodule: move core
cmd_update() logic to C, 2022-03-15) to "ret", which is the convention
in the rest of this file.
Eventual follow-up commits will change the code in update_submodule()
to a "goto cleanup" pattern, let's have the post image look consistent
with the rest. For update_submodules() let's also use a "ret" for
consistency, that use was also added in b3c5f5cb048. We'll be
modifying that codepath in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: don't redundantly check "else if (res)"
The "res" variable must be true at this point in update_submodule(),
as just a few lines above this we've unconditionally:
if (!res)
return 0;
So we don't need to guard the "return 1" with an "else if (res)", we
can return unconditionally at this point. See b3c5f5cb048 (submodule:
move core cmd_update() logic to C, 2022-03-15) for the initial
introduction of this code, this check of "res" has always been
redundant.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Glen Choo [Wed, 31 Aug 2022 23:17:59 +0000 (01:17 +0200)]
submodule--helper: refactor "errmsg_str" to be a "struct strbuf"
Refactor code added in e83e3333b57 (submodule: port submodule
subcommand 'summary' from shell to C, 2020-08-13) so that "errmsg" and
"errmsg_str" are folded into one. The distinction between the empty
string and NULL is something that's tested for by
e.g. "t/t7401-submodule-summary.sh".
This is in preparation for fixing a memory leak the "struct strbuf" in
the pre-image.
Let's also pass a "const char *" to print_submodule_summary(), as it
should not be modifying the "errmsg".
Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: add "const" to passed "struct update_data"
Add a "const" to the "struct update_data" passed to
run_update_procedure(), which it in turn passes along (peeled) to
is_tip_reachable() and fetch_in_submodule()).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: add "const" to passed "module_clone_data"
Add "const" to the "struct module_clone_data" that we pass to
clone_submodule(), which makes the ownership clear, and stops us from
clobbering the "clone_data->path".
We still need to add to the "reference" member, which is a "struct
string_list". Let's do this by having clone_submodule() create its
own, and copy the contents over, allowing us to pass it as a
separate parameter.
This new "struct string_list" still leaks memory, just as the "struct
module_clone_data" did before. let's not fix that for now, to fix that
we'll need to add some "goto cleanup" to the relevant code. That will
eventually be done in follow-up commits, this change makes it easier
to fix the memory leak.
The scope of the new "reference" variable in add_submodule() could be
narrowed to the "else" block, but as we'll eventually free it with a
"goto cleanup" let's declare it at the start of the function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: move "sb" in clone_submodule() to its own scope
Refactor the only remaining use of a "struct strbuf sb" in
clone_submodule() to live in its own scope. This makes the code
clearer by limiting its lifetime.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: use xstrfmt() in clone_submodule()
Use xstrfmt() in clone_submodule() instead of a "struct strbuf" in two
cases where we weren't getting anything out of using the "struct
strbuf".
This changes code that was was added along with other uses of "struct
strbuf" in this function in ee8838d1577 (submodule: rewrite
`module_clone` shell function in C, 2015-09-08).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: replace memset() with { 0 }-initialization
Use the less verbose { 0 }-initialization syntax rather than memset()
in builtin/submodule--helper.c, this doesn't make a difference in
terms of behavior, but as we're about to modify adjacent code makes
this more consistent, and lets us avoid worrying about when the
memset() happens v.s. a "goto cleanup".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper style: add \n\n after variable declarations
Since the preceding commit fixed style issues with \n\n among the
declared variables let's fix the minor stylistic issues with those
variables not being consistently followed by a \n\n.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper style: don't separate declared variables with \n\n
The usual style in the codebase is to separate declared variables with
a single newline, not two, let's adjust this code to conform to
that. This makes the eventual addition of various "int ret" variables
more consistent.
In doing this the comment added in 2964d6e5e1e (submodule: port
subcommand 'set-branch' from shell to C, 2020-06-02) might become
ambiguous to some, although it should be clear what it's referring to,
let's move it above the 'OPT_NOOP_NOARG('q', "quiet")' to make that
clearer.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: move "resolve-relative-url-test" to a test-tool
As its name suggests the "resolve-relative-url-test" has never been
used outside of the test suite, see 63e95beb085 (submodule: port
resolve_relative_url from shell to C, 2016-04-15) for its original
addition.
Perhaps it would make sense to drop this code entirely, as we feel
that we've got enough indirect test coverage, but let's leave that
question to a possible follow-up change. For now let's keep the test
coverage this gives us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: move "check-name" to a test-tool
Move the "check-name" helper to a test-tool, since a6226fd772b (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) it has only been used by this test, not git-submodule.sh.
As noted with its introduction in 0383bbb9015 (submodule-config:
verify submodule names as paths, 2018-04-30) the intent of
t7450-bad-git-dotfiles.sh has always been to unit test the
check_submodule_name() function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule--helper: move "is-active" to a test-tool
Create a new "test-tool submodule" and move the "is-active" subcommand
over to it. It was added in 5c2bd8b77ae (submodule--helper: add
is-active subcommand, 2017-03-16), since a452128a36c (submodule--helper: introduce add-config subcommand,
2021-08-06) it hasn't been used by git-submodule.sh.
Since we're creating a command dispatch similar to test-tool.c itself
let's split out the "struct test_cmd" into a new test-tool-utils.h,
which both this new code and test-tool.c itself can use.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
No test has used this "--url" parameter since the test code that made
use of it was removed in 32bc548329d (submodule-config: remove support
for overlaying repository config, 2017-08-03).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "submodule--helper list" sub-command, which hasn't been
used by git-submodule.sh since 2964d6e5e1e (submodule: port subcommand
'set-branch' from shell to C, 2020-06-02).
There was a test added in 2b56bb7a87a (submodule helper list: respect
correct path prefix, 2016-02-24) which relied on it, but the right
thing to do here is to delete that test as well.
That test was regression testing the "list" subcommand itself. We're
not getting anything useful from the "list | cut -f2" invocation that
we couldn't get from "foreach 'echo $sm_path'".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule tests: test for "add <repository> <abs-path>"
Add a missing test for ""add <repository> <path>" where "<path>" is an
absolute path. This tests code added in [1] and later turned into an
"else" branch in clone_submodule() in [2] that's never been tested.
This needs to be skipped on WINDOWS because all of $PWD, $(pwd) and
the "$(pwd -P)" we get via "$submodurl" would fail in CI with e.g.:
fatal: could not create directory 'D:/a/git/git/t/trash
directory.t7400-submodule-basic/.git/modules/D:/a/git/git/t/trash
directory.t7400-submodule-basic/add-abs'
I.e. we can't handle these sorts of paths in this context on that
platform.
I'm not sure where we run into the edges of "$PWD" behavior on
Windows (see [1] for a previous loose end on the topic), but for the
purposes of this test it's sufficient that we test this on other
platforms.
1. ee8838d1577 (submodule: rewrite `module_clone` shell function in C,
2015-09-08)
2. f8eaa0ba98b (submodule--helper, module_clone: always operate on
absolute paths, 2016-03-31)
Test what exit code and output we emit on "git submodule -h", how we
handle "--" when no subcommand is specified, and how the top-level
"--recursive" option is handled.
For "-h" this doesn't make sense, but let's test for it so that any
subsequent eventual behavior change will become clear.
For "--" this follows up on 68cabbfda36 (submodule: document default
behavior, 2019-02-15) and tests that "status" doesn't support
the "--" delimiter. There's no intrinsically good reason not to
support that. We behave this way due to edge cases in
git-submodule.sh's implementation, but as with "-h" let's assert our
current long-standing behavior for now.
For "--recursive" the exclusion of it from the top-level appears to
have been an omission in 15fc56a8536 (git submodule foreach: Add
--recursive to recurse into nested submodules, 2009-08-19), there
doesn't seem to be a reason not to support it alongside "--quiet" and
"--cached", but let's likewise assert our existing behavior for now.
I.e. as long as "status" is optional it would make sense to support
all of its options when it's omitted, but we only do that with
"--quiet" and "--cached", and curiously omit "--recursive".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.
* en/merge-unstash-only-on-clean-merge:
merge: only apply autostash when appropriate
Junio C Hamano [Thu, 1 Sep 2022 20:40:18 +0000 (13:40 -0700)]
Merge branch 'sg/parse-options-subcommand'
Introduce the "subcommand" mode to parse-options API and update the
command line parser of Git commands with subcommands.
* sg/parse-options-subcommand: (23 commits)
remote: run "remote rm" argv through parse_options()
maintenance: add parse-options boilerplate for subcommands
pass subcommand "prefix" arguments to parse_options()
builtin/worktree.c: let parse-options parse subcommands
builtin/stash.c: let parse-options parse subcommands
builtin/sparse-checkout.c: let parse-options parse subcommands
builtin/remote.c: let parse-options parse subcommands
builtin/reflog.c: let parse-options parse subcommands
builtin/notes.c: let parse-options parse subcommands
builtin/multi-pack-index.c: let parse-options parse subcommands
builtin/hook.c: let parse-options parse subcommands
builtin/gc.c: let parse-options parse 'git maintenance's subcommands
builtin/commit-graph.c: let parse-options parse subcommands
builtin/bundle.c: let parse-options parse subcommands
parse-options: add support for parsing subcommands
parse-options: drop leading space from '--git-completion-helper' output
parse-options: clarify the limitations of PARSE_OPT_NODASH
parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
api-parse-options.txt: fix description of OPT_CMDMODE
t0040-parse-options: test parse_options() with various 'parse_opt_flags'
...
Junio C Hamano [Thu, 1 Sep 2022 20:40:17 +0000 (13:40 -0700)]
Merge branch 'ds/bundle-uri-clone'
Implement "git clone --bundle-uri".
* ds/bundle-uri-clone:
clone: warn on failure to repo_init()
clone: --bundle-uri cannot be combined with --depth
bundle-uri: add support for http(s):// and file://
clone: add --bundle-uri option
bundle-uri: create basic file-copy logic
remote-curl: add 'get' capability
Thanks to always running `diff-index` and `diff-files` with the
`--numstat` option (the latter with `--ignore-submodules=dirty`) before
even generating any real diff to parse, the Perl version of `git add -p`
simply ignored dirty submodules and does not even offer them up for
staging.
However, the built-in variant did not use that flag because it tries to
run only one `diff` command, skipping the unneeded
`diff-index`/`diff-files` invocation of the Perl variant and therefore
only faithfully recapitulates what the Perl code does once it _does_
generate and parse the real diff.
This causes a problem when running the built-in `add -p` with
`diff-so-fancy` because that diff colorizer always inserts an empty line
before the diff header to ensure that it produces 4 lines as expected by
`git add -p` (the equivalent of the non-colorized `diff`, `index`, `---`
and `+++` lines). But `git diff-files` does not produce any `index` line
for dirty submodules.
The underlying problem is not even the discrepancy in lines, but that
`git add -p` presents diffs for dirty submodules: there is nothing that
_can_ be staged for those.
Let's fix that bug, and teach the built-in `add -p` to ignore dirty
submodules, too. This _incidentally_ also fixes the `diff-so-fancy`
problem ;-)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add -p: gracefully handle unparseable hunk headers in colored diffs
In
https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com,
Phillipe Blain reported that the built-in `git add -p` command fails
when asked to use [`diff-so-fancy`][diff-so-fancy] to colorize the diff.
The reason is that this tool produces colored diffs with a hunk header
that does not contain any parseable `@@ ... @@` line range information,
and therefore we cannot detect any part in that header that comes after
the line range.
As proposed by Phillip Wood, let's take that for a clear indicator that
we should show the hunk headers verbatim. This is what the Perl version
of the interactive `add` command did, too.
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add -p: detect more mismatches between plain vs colored diffs
When parsing the colored version of a diff, the interactive `add`
command really relies on the colored version having the same number of
lines as the plain (uncolored) version. That is an invariant.
We already have code to verify correctly when the colored diff has less
lines than the plain diff. Modulo an off-by-one bug: If the last diff
line has no matching colored one, the code pretends to succeed, still.
To make matters worse, when we adjusted the test in 1e4ffc765db (t3701:
adjust difffilter test, 2020-01-14), we did not catch this because `add
-p` fails for a _different_ reason: it does not find any colored hunk
header that contains a parseable line range.
If we change the test case so that the line range _can_ be parsed, the
bug is exposed.
Let's address all of the above by
- fixing the off-by-one,
- adjusting the test case to allow `add -p` to parse the line range
- making the test case more stringent by verifying that the expected
error message is shown
Also adjust a misleading code comment about the now-fixed code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit $(C_OBJ) added in c373991375a (Makefile: list
generated object files in OBJECTS, 2010-01-26) became synonymous with
$(OBJECTS). Let's avoid the indirection and use the $(OBJECTS)
variable directly instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.
Since it was added we've moved to using sha1collisiondetection by
default, and anyone wanting hard-rolled non-DC SHA-1 implementation
can use OpenSSL's via the OPENSSL_SHA1 knob.
The PPC_SHA1 originally originally targeted 32 bit PPC, and later the
64 bit PPC 970 (a.k.a. Apple PowerPC G5). See 926172c5e48 (block-sha1:
improve code on large-register-set machines, 2009-08-10) for a
reference about the performance on G5 (a comment in block-sha1/sha1.c
being removed here).
I can't get it to do anything but segfault on both the BE and LE POWER
machines in the GCC compile farm[1]. Anyone who's concerned about
performance on PPC these days is likely to be using the IBM POWER
processors.
There have been proposals to entirely remove non-sha1collisiondetection
implementations from the tree[2]. I think per [3] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).
But that really doesn't apply to PPC_SHA1 in particular, which seems
to have outlived its usefulness.
As this gets rid of the only in-tree *.S assembly file we can remove
the small bits of logic from the Makefile needed to build objects
from *.S (as opposed to *.c)
The code being removed here was also throwing warnings with the
"-pedantic" flag, it could have been fixed as 544d93bc3b4 (block-sha1:
remove use of obsolete x86 assembly, 2022-03-10) did for block-sha1/*,
but as noted above let's remove it instead.
1. https://cfarm.tetaneutral.net/machines/list/
Tested on gcc{110,112,135,203}, a mixture of POWER [789] ppc64 and
ppc64le. All segfault in anything needing object
hashing (e.g. t/t1007-hash-object.sh) when compiled with
PPC_SHA1=Y.
2. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
3. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/
Acked-by: brian m. carlson" <sandals@crustytoothpaste.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 30 Aug 2022 20:40:56 +0000 (16:40 -0400)]
test-crontab: minor memory and error handling fixes
Since ee69e7884e (gc: use temporary file for editing crontab,
2022-08-28), we now insist that "argc == 3" (and otherwise return an
error). Coverity notes that this causes some dead code:
if (argc == 3)
fclose(from);
else
fclose(to);
as we will never trigger the else. This also causes a memory leak, since
we'll never close "to".
Now that all paths require 2 arguments, we can just reorganize the
function to check argc up front, and tweak the cleanup to do the right
thing for all cases.
While we're here, we can also notice some minor problems:
- we return a negative int via error() from what is essentially a
main() function; we should return a positive non-zero value for
error. Or better yet, we can just use usage(), which gives a better
message.
- while writing the usage message, we can note the one in the comment
was made out of date by ee69e7884e. But it also had a typo already,
calling the subcommand "cron" and not "crontab"
- we didn't check for an error from fopen(), meaning we would segfault
if the to-be-read file was missing. We can use xfopen() to catch
this.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 30 Aug 2022 19:46:01 +0000 (15:46 -0400)]
tempfile: update comment describing state transitions
Back when 1a9d15db25 (tempfile: a new module for handling temporary
files, 2015-08-10) added this comment, tempfile structs were held in
memory for the life of a process, and there were various guarantees
about which fields were valid in which states.
Since 422a21c6a0 (tempfile: remove deactivated list entries, 2017-09-05)
and 076aa2cbda (tempfile: auto-allocate tempfiles on heap, 2017-09-05),
the flow is quite different: objects come and go from the list, and
inactive ones are deallocated. And the previous commit removed the
"active" flag from the struct entirely.
Let's bring the comment up to date with the current code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 30 Aug 2022 19:45:06 +0000 (15:45 -0400)]
tempfile: drop active flag
Our tempfile struct contains an "active" flag. Long ago, this flag was
important: tempfile structs were always allocated for the lifetime of
the program and added to a global linked list, and the active flag was
what told us whether a struct's tempfile needed to be cleaned up on
exit.
But since 422a21c6a0 (tempfile: remove deactivated list entries,
2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we actually remove items from the list, and the active flag
is generally always set to true for any allocated struct. We set it to
true in all of the creation functions, and in the normal code flow it
becomes false only in deactivate_tempfile(), which then immediately
frees the struct.
So the flag isn't performing that role anymore, and in fact makes things
more confusing. Dscho noted that delete_tempfile() is a noop for an
inactive struct. Since 076aa2cbda taught it to free the struct when
deactivating, we'd leak any struct whose active flag is unset. But in
practice it's not a leak, because again, we'll free when we unset the
flag, and never see the allocated-but-inactive state.
Can we just get rid of the flag? The answer is yes, but it requires
looking at a few other spots:
1. I said above that the flag only becomes false before we deallocate,
but there's one exception: when we call remove_tempfiles() from a
signal or atexit handler, we unset the active flag as we remove
each file. This isn't important for delete_tempfile(), as nobody
would call it anymore, since we're exiting.
It does in theory provide us some protection against racily
double-removing a tempfile. If we receive a second signal while we
are already in the cleanup routines, we'll start the cleanup loop
again, and may visit the same tempfile. But this race already
exists, because calling unlink() and unsetting the active flag
aren't atomic! And it's OK in practice, because unlink() is
idempotent (barring the unlikely event that some other process
chooses our exact temp filename in that instant).
So dropping the active flag widens the race a bit, but it was
already there, and is fairly harmless in practice. If we really
care about addressing it, the right thing is probably to block
further signals while we're doing our cleanup (which we could
actually do atomically).
2. The active flag is declared as "volatile sig_atomic_t". The idea is
that it's the final bit that gets set to tell the cleanup routines
that the tempfile is ready to be used (or not used), and it's safe
to receive a signal racing with regular code which adds or removes
a tempfile from the list.
In practice, I don't think this is buying us anything. The presence
on the linked list is really what tells the cleanup routines to
look at the struct. That is already marked as "volatile". It's not
a sig_atomic_t, so it's possible that we could see a sheared write
there as an entry is added or removed. But that is true of the
current code, too! Before we can even look at the "active" flag,
we'd have to follow a link to the struct itself. If we see a
sheared write in the pointer to the struct, then we'll look at
garbage memory anyway, and there's not much we can do.
This patch removes the active flag entirely, using presence on the
global linked list as an indicator that a tempfile ought to be cleaned
up. We are already careful to add to the list as the final step in
activating. On deactivation, we'll make sure to remove from the list as
the first step, before freeing any fields. The use of the volatile
keyword should mean that those things happen in the expected order.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Christian Couder [Tue, 30 Aug 2022 10:50:46 +0000 (12:50 +0200)]
Documentation: clarify whitespace rules for trailers
Commit e4319562bc (trailer: be stricter in parsing separators, 2016-11-02)
restricted whitespaces allowed by `git interpret-trailers` in the "token"
part of the trailers it reads. This commit didn't update the related
documentation in Documentation/git-interpret-trailers.txt though.
Also commit 60ef86a162 (trailer: support values folded to multiple lines,
2016-10-21) updated the documentation, but didn't make it clear how many
whitespace characters are allowed at the beginning of new lines in folded
values.
Let's fix both of these issues by rewriting the paragraph describing
what whitespaces are allowed by `git interpret-trailers` in the trailers
it reads.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 29 Aug 2022 21:55:15 +0000 (14:55 -0700)]
Merge branch 'es/fix-chained-tests'
Fix broken "&&-" chains and failures in early iterations of a loop.
* es/fix-chained-tests:
t5329: notice a failure within a loop
t: detect and signal failure within loop
t1092: fix buggy sparse "blame" test
t2407: fix broken &&-chains in compound statement
Junio C Hamano [Mon, 29 Aug 2022 21:55:14 +0000 (14:55 -0700)]
Merge branch 'sg/xcalloc-cocci-fix'
xcalloc(), imitating calloc(), takes "number of elements of the
array", and "size of a single element", in this order. A call that
does not follow this ordering has been corrected.
* sg/xcalloc-cocci-fix:
promisor-remote: fix xcalloc() argument order
Junio C Hamano [Mon, 29 Aug 2022 21:55:13 +0000 (14:55 -0700)]
Merge branch 'mg/sequencer-untranslate-reflog'
The sequencer machinery translated messages left in the reflog by
mistake, which has been corrected.
* mg/sequencer-untranslate-reflog:
sequencer: do not translate command names
sequencer: do not translate parameters to error_resolve_conflict()
sequencer: do not translate reflog messages
Junio C Hamano [Mon, 29 Aug 2022 21:55:12 +0000 (14:55 -0700)]
Merge branch 'jk/unused-fixes'
Code clean-up to remove unused function parameters.
* jk/unused-fixes:
xdiff: drop unused mmfile parameters from xdl_do_patience_diff()
reflog: assert PARSE_OPT_NONEG in parse-options callbacks
reftable: drop unused parameter from reader_seek_linear()
verify_one_sparse(): drop unused parameters
match_pathname(): drop unused "flags" parameter
log-tree: drop unused commit param in remerge_diff()
xdiff: drop unused mmfile parameters from xdl_do_histogram_diff()