Junio C Hamano [Wed, 2 Aug 2023 16:37:23 +0000 (09:37 -0700)]
Merge branch 'ah/sequencer-rewrite-todo-fix'
When the user edits "rebase -i" todo file so that it starts with a
"fixup", which would make it invalid, the command truncated the
rest of the file before giving an error and returning the control
back to the user. Stop truncating to make it easier to correct
such a malformed todo file.
* ah/sequencer-rewrite-todo-fix:
sequencer: finish parsing the todo list despite an invalid first line
Junio C Hamano [Wed, 2 Aug 2023 16:37:23 +0000 (09:37 -0700)]
Merge branch 'ah/autoconf-fixes'
"./configure --with-expat=no" did not work as a way to refuse use
of the expat library on a system with the library installed, which
has been corrected.
* ah/autoconf-fixes:
configure.ac: always save NO_ICONV to config.status
configure.ac: don't overwrite NO_CURL option
configure.ac: don't overwrite NO_EXPAT option
Junio C Hamano [Wed, 2 Aug 2023 16:37:23 +0000 (09:37 -0700)]
Merge branch 'jc/tree-walk-drop-base-offset'
Code simplification.
* jc/tree-walk-drop-base-offset:
tree-walk: drop unused base_offset from do_match()
tree-walk: lose base_offset that is never used in tree_entry_interesting
Junio C Hamano [Tue, 25 Jul 2023 19:05:24 +0000 (12:05 -0700)]
Merge branch 'jk/nested-points-at'
"git tag --list --points-at X" showed tags that directly refers to
object X, but did not list a tag that points at such a tag, which
has been corrected.
* jk/nested-points-at:
ref-filter: simplify return type of match_points_at
ref-filter: avoid parsing non-tags in match_points_at()
ref-filter: avoid parsing tagged objects in match_points_at()
ref-filter: handle nested tags in --points-at option
Junio C Hamano [Tue, 25 Jul 2023 19:05:24 +0000 (12:05 -0700)]
Merge branch 'jk/unused-parameter'
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.
* jk/unused-parameter:
t/helper: mark unused callback void data parameters
tag: mark unused parameters in each_tag_name_fn callbacks
rev-parse: mark unused parameter in for_each_abbrev callback
replace: mark unused parameter in each_mergetag_fn callback
replace: mark unused parameter in ref callback
merge-tree: mark unused parameter in traverse callback
fsck: mark unused parameters in various fsck callbacks
revisions: drop unused "opt" parameter in "tweak" callbacks
count-objects: mark unused parameter in alternates callback
am: mark unused keep_cr parameters
http-push: mark unused parameter in xml callback
http: mark unused parameters in curl callbacks
do_for_each_ref_helper(): mark unused repository parameter
test-ref-store: drop unimplemented reflog-expire command
Junio C Hamano [Tue, 25 Jul 2023 19:05:23 +0000 (12:05 -0700)]
Merge branch 'mh/mingw-case-sensitive-build'
Names of MinGW header files are spelled in mixed case in some
source files, but the build host can be using case sensitive
filesystem with header files with their name spelled in all
lowercase.
* mh/mingw-case-sensitive-build:
mingw: use lowercase includes for some Windows headers
Various offset computation in the code that accesses the packfiles
and other data in the object layer has been hardened against
arithmetic overflow, especially on 32-bit systems.
* tb/object-access-overflow-protection:
commit-graph.c: prevent overflow in `verify_commit_graph()`
commit-graph.c: prevent overflow in `write_commit_graph()`
commit-graph.c: prevent overflow in `merge_commit_graph()`
commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
commit-graph.c: prevent overflow in `load_tree_for_commit()`
commit-graph.c: prevent overflow in `fill_commit_in_graph()`
commit-graph.c: prevent overflow in `fill_commit_graph_info()`
commit-graph.c: prevent overflow in `load_oid_from_graph()`
commit-graph.c: prevent overflow in add_graph_to_chain()
commit-graph.c: prevent overflow in `write_commit_graph_file()`
pack-bitmap.c: ensure that eindex lookups don't overflow
midx.c: prevent overflow in `fill_included_packs_batch()`
midx.c: prevent overflow in `write_midx_internal()`
midx.c: store `nr`, `alloc` variables as `size_t`'s
midx.c: prevent overflow in `nth_midxed_offset()`
midx.c: prevent overflow in `nth_midxed_object_oid()`
midx.c: use `size_t`'s for fanout nr and alloc
packfile.c: use checked arithmetic in `nth_packed_object_offset()`
packfile.c: prevent overflow in `load_idx()`
packfile.c: prevent overflow in `nth_packed_object_id()`
Junio C Hamano [Tue, 25 Jul 2023 19:05:23 +0000 (12:05 -0700)]
Merge branch 'ah/advise-force-pushing'
Help newbies by suggesting that there are cases where force-pushing
is a valid and sensible thing to update a branch at a remote
repository, rather than reconciling with merge/rebase.
* ah/advise-force-pushing:
push: don't imply that integration is always required before pushing
remote: don't imply that integration is always required before pushing
wt-status: don't show divergence advice when committing
Alex Henrie [Sat, 22 Jul 2023 21:28:25 +0000 (15:28 -0600)]
sequencer: finish parsing the todo list despite an invalid first line
Before the todo list is edited it is rewritten to shorten the OIDs of
the commits being picked and to append advice about editing the list.
The exact advice depends on whether the todo list is being edited for
the first time or not. After the todo list has been edited it is
rewritten to lengthen the OIDs of the commits being picked and to remove
the advice. If the edited list cannot be parsed then this last step is
skipped.
Prior to db81e50724 (rebase-interactive: use todo_list_write_to_file()
in edit_todo_list(), 2019-03-05) if the existing todo list could not be
parsed then the initial rewrite was skipped as well. This had the
unfortunate consequence that if the list could not be parsed after the
initial edit the advice given to the user was wrong when they re-edited
the list. This change relied on todo_list_parse_insn_buffer() returning
the whole todo list even when it cannot be parsed. Unfortunately if the
list starts with a "fixup" command then it will be truncated and the
remaining lines are lost. Fix this by continuing to parse after an
initial "fixup" commit as we do when we see any other invalid line.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[jc: removed an apparently unneeded subshell around the test body] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 21 Jul 2023 20:47:26 +0000 (13:47 -0700)]
Merge branch 'tb/refs-exclusion-and-packed-refs'
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.
* tb/refs-exclusion-and-packed-refs:
ls-refs.c: avoid enumerating hidden refs where possible
upload-pack.c: avoid enumerating hidden refs where possible
builtin/receive-pack.c: avoid enumerating hidden references
refs.h: implement `hidden_refs_to_excludes()`
refs.h: let `for_each_namespaced_ref()` take excluded patterns
revision.h: store hidden refs in a `strvec`
refs/packed-backend.c: add trace2 counters for jump list
refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
refs/packed-backend.c: refactor `find_reference_location()`
refs: plumb `exclude_patterns` argument throughout
builtin/for-each-ref.c: add `--exclude` option
ref-filter.c: parameterize match functions over patterns
ref-filter: add `ref_filter_clear()`
ref-filter: clear reachable list pointers after freeing
ref-filter.h: provide `REF_FILTER_INIT`
refs.c: rename `ref_filter`
René Scharfe [Fri, 21 Jul 2023 12:41:53 +0000 (14:41 +0200)]
pack-objects: fix --no-quiet
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted the option --no-quiet, but it
does the same as --quiet. That's because it's defined using OPT_SET_INT
with a value of 0, which sets 0 when negated, too.
Make --no-quiet equivalent to --progress and ignore it if --all-progress
was given.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Fri, 21 Jul 2023 12:41:56 +0000 (14:41 +0200)]
pack-objects: fix --no-keep-true-parents
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted --no-keep-true-parents, but
this option does the same as --keep-true-parents. That's because it's
defined using OPT_SET_INT with a value of 0, which sets 0 when negated
as well.
Turn --no-keep-true-parents into the opposite of --keep-true-parents by
using OPT_BOOL and storing the option's status directly in a variable
named "grafts_keep_true_parents" instead of in negative form in
"grafts_replace_parents".
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Fri, 21 Jul 2023 13:41:33 +0000 (15:41 +0200)]
describe: fix --no-exact-match
Since 2c33f75754 (Teach git-describe --exact-match to avoid expensive
tag searches, 2008-02-24) git describe accepts --no-exact-match, but it
does the same as --exact-match, an alias for --candidates=0. That's
because it's defined using OPT_SET_INT with a value of 0, which sets 0
when negated as well.
Let --no-exact-match set the number of candidates to the default value
instead. Users that need a more specific lack of exactitude can specify
their preferred value using --candidates, as before.
The "--no-exact-match" option was not covered in the tests, so let's
add a few. Also add a case where --exact-match option is used on a
commit that cannot be described without distance from tags and make
sure the command fails.
Signed-off-by: René Scharfe <l.s.r@web.de>
[jc: added trivial tests] Signed-off-by: Junio C Hamano <gitster@pobox.com>
wrapper: use trace2 counters to collect fsync stats
As mentioned in the thread starting at [1], trace2 counters should be
used to count events instead of ad-hoc static variables.
Convert the two fsync static variables to trace2 counters, reducing the
coupling between wrapper.c and the trace2 subsystem. Adjust t/t5351 to
match the trace2 counter output format.
The counters are not per-thread because the ones being replaced also
were not.
"git reset --no-mixed" behaved exactly like "git reset --mixed",
which was nonsense.
If there were only two kinds, e.g. "mixed" vs "separate", it might
have made sense to make "git reset --no-mixed" behave identically to
"git reset --separate" and vice-versa, but because we have many
types of reset, let's just forbid "--no-mixed" and negated form of
other types.
Junio C Hamano [Wed, 19 Jul 2023 16:32:36 +0000 (09:32 -0700)]
show-branch: reject --[no-](topo|date)-order
"git show-branch --no-topo-order" behaved exactly the same way as
"git show-branch --topo-order" did, which was nonsense. This was
because we choose between topo- and date- by setting a variable to
either REV_SORT_IN_GRAPH_ORDER or REV_SORT_BY_COMMIT_DATE with
OPT_SET_INT() and REV_SORT_IN_GRAPH_ORDER happens to be 0. The
OPT_SET_INT() macro assigns 0 to the target variable in respose to
the negated form of its option.
"--no-date-order" by luck behaves identically to "--topo-order"
exactly for the same reason, and it sort-of makes sense right now,
but the "sort-of makes sense" will quickly break down once we add a
third way to sort. Not-A may be B when there are only two choices
between A and B, but once your choices become among A, B, and C,
not-A does not mean B.
Just mark these two ordering options to reject negation, and add a
test, which was missing. "git show-branch --no-reflog" is also
unnegatable, so throw in a test for that while we are at it.
When the trace2 counter mechanism was added in 81071626ba (trace2: add
global counter mechanism, 2022-10-24), the name of the file where new
counters are added was misspelled in a comment.
Use the correct file name.
Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andreas Herrmann [Wed, 19 Jul 2023 14:29:56 +0000 (16:29 +0200)]
configure.ac: don't overwrite NO_CURL option
Even if 'configure --with-curl=no' was run, curl support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.
Signed-off-by: Andreas Herrmann <aherrmann@suse.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andreas Herrmann [Wed, 19 Jul 2023 14:29:54 +0000 (16:29 +0200)]
configure.ac: don't overwrite NO_EXPAT option
Even if 'configure --with-expat=no' was run, expat support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.
Signed-off-by: Andreas Herrmann <aherrmann@suse.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 19 Jul 2023 11:37:39 +0000 (04:37 -0700)]
show-branch: --no-sparse should give dense output
"git show-branch --no-sparse" behaved exactly the same way as "git
show-branch --sparse", which did not make any sense. This was
because it used a variable "dense" initialized to 1 by default to
give "non sparse" behaviour, and OPT_SET_INT() to set the varilable
to 0 in response to the "--sparse" option. Unfortunately,
OPT_SET_INT() sets 0 to the given variable when the option is
negated.
Flip the polarity of the variable "dense" by renaming it to "sparse"
and initializing it to 0, and have OPT_SET_INT() set the variable to
1 when "--sparse" is given. This way, "--no-sparse" would set 0 to
the variable and would give us the "dense" behaviour.
Junio C Hamano [Tue, 18 Jul 2023 21:45:59 +0000 (14:45 -0700)]
fetch: reject --no-ipv[46]
Now we have introduced OPT_IPVERSION(), tweak its implementation so
that "git clone", "git fetch", and "git push" reject the negated
form of "Use only IP version N" options.
Junio C Hamano [Tue, 18 Jul 2023 21:34:33 +0000 (14:34 -0700)]
parse-options: introduce OPT_IPVERSION()
The command line option parsing for "git clone", "git fetch", and
"git push" have duplicated implementations of parsing "--ipv4" and
"--ipv6" options, by having two OPT_SET_INT() for "ipv4" and "ipv6".
Introduce a new OPT_IPVERSION() macro and use it in these three
commands.
Junio C Hamano [Tue, 18 Jul 2023 18:27:49 +0000 (11:27 -0700)]
branch: reject "--no-all" and "--no-remotes" early
As the command line parser for "git branch --all" forgets to use
PARSE_OPT_NONEG, it accepted "git branch --no-all", and then passed
a nonsense value to the underlying machinery, leading to a fatal
error "filter_refs: invalid type". The "--remotes" option had
exactly the same issue.
Catch the unsupported options early in the option parser.
Junio C Hamano [Tue, 18 Jul 2023 18:22:40 +0000 (11:22 -0700)]
am: simplify parsing of "--[no-]keep-cr"
Command line options "--keep-cr" and its negation trigger
OPT_SET_INT_F(PARSE_OPT_NONEG) to set a variable to 1 and 0
respectively. Using OPT_SET_INT() to implement the positive variant
that sets the variable to 1 without specifying PARSE_OPT_NONEG gives
us the negative variant to set it to 0 for free.
Junio C Hamano [Tue, 18 Jul 2023 17:47:39 +0000 (10:47 -0700)]
gitignore.txt: mark up explanation of patterns consistently
In the "PATTERN FORMAT" section, all the other pattern elements are
shown as `monospace` literals inside "double quoted" strings. Do
the same for the explanation of a slash to make it consistent.
René Scharfe [Tue, 18 Jul 2023 15:44:13 +0000 (17:44 +0200)]
ls-tree: fix --no-full-name
Since 61fdbcf98b (ls-tree: migrate to parse-options, 2009-11-13) git
ls-tree has accepted the option --no-full-name, but it does the same
as --full-name, contrary to convention. That's because it's defined
using OPT_SET_INT with a value of 0, where the negative variant sets
0 as well.
Turn --no-full-name into the opposite of --full-name by using OPT_BOOL
instead and storing the option's status directly in a variable named
"full_name" instead of in negated form in "chomp_prefix".
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 18 Jul 2023 14:28:53 +0000 (07:28 -0700)]
Merge branch 'tb/repack-cleanup'
The recent change to "git repack" made it react less nicely when a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.
* tb/repack-cleanup:
builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
builtin/repack.c: only repack `.pack`s that exist
Jeff King [Sun, 2 Jul 2023 22:38:29 +0000 (18:38 -0400)]
ref-filter: simplify return type of match_points_at
We return the oid that matched, but the sole caller only cares whether
we matched anything at all. This is mostly academic, since there's only
one caller, but the lifetime of the returned pointer is not immediately
clear. Sometimes it points to an oid in a tag struct, which should live
forever. And sometimes to the oid passed in, which only lives as long as
the each_ref_fn callback we're called from.
Simplify this to a boolean return which is more direct and obvious. As a
bonus, this lets us avoid the weird pattern of overwriting our "oid"
parameter in the loop (since we now only refer to the tagged oid one
time, and can just inline the call to get it).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sun, 2 Jul 2023 22:37:47 +0000 (18:37 -0400)]
ref-filter: avoid parsing non-tags in match_points_at()
When handling --points-at, we have to try to peel each ref to see if
it's a tag that points at a requested oid. We start this process by
calling parse_object() on the oid pointed to by each ref.
The cost of parsing each object adds up, especially in an output that
doesn't otherwise need to open the objects at all. Ideally we'd use
peel_iterated_oid() here, which uses the cached information in the
packed-refs file. But we can't, because our --points-at must match not
only the fully peeled value, but any interim values (so if tag A points
to tag B which points to commit C, we should match --points-at=B, but
peel_iterated_oid() will only tell us about C).
So the best we can do (absent changes to the packed-refs peel traits) is
to avoid parsing non-tags. The obvious way to do that is to call
oid_object_info() to check the type before parsing. But there are a few
gotchas there, like checking if the object has already been parsed.
Instead we can just tell parse_object() that we are OK skipping the hash
check, which lets it turn on several optimizations. Commits can be
loaded via the commit graph (so it's both fast and we have the benefit
of the parsed data if we need it later at the output stage). Blobs are
not loaded at all. Trees are still loaded, but it's rather rare to have
a ref point directly to a tree (and since this is just an optimization,
kicking in 99% of the time is OK).
Even though we're paying for an extra lookup, the cost to avoid parsing
the non-tags is a net benefit. In my git.git repository with 941 tags
and 1440 other refs pointing to commits, this significantly cuts the
runtime:
Benchmark 1: ./git.old for-each-ref --points-at=HEAD
Time (mean ± σ): 26.8 ms ± 0.5 ms [User: 24.5 ms, System: 2.2 ms]
Range (min … max): 25.9 ms … 29.2 ms 107 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD
Time (mean ± σ): 9.1 ms ± 0.3 ms [User: 6.8 ms, System: 2.2 ms]
Range (min … max): 8.6 ms … 10.2 ms 308 runs
Summary
'./git.new for-each-ref --points-at=HEAD' ran
2.96 ± 0.10 times faster than './git.old for-each-ref --points-at=HEAD'
In a repository that is mostly annotated tags, we'd expect less
improvement (we might still skip a few object loads, but that's balanced
by the extra lookups). In my clone of linux.git, which has 782 tags and
3 branches, the run-time is about the same (it's actually ~1% faster on
average after this patch, but that's within the run-to-run noise).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sun, 2 Jul 2023 22:35:40 +0000 (18:35 -0400)]
ref-filter: avoid parsing tagged objects in match_points_at()
When we peel tags to check if they match a --points-at oid, we
recursively parse the tagged object to see if it is also a tag. But
since the tag itself tells us the type of the object it points to (and
even gives us the appropriate object struct via its "tagged" member), we
can use that directly.
We do still have to make sure to call parse_tag() before looking at each
tag. This is redundant for the outermost tag (since we did call
parse_object() to find its type), but that's OK; parse_tag() is smart
enough to make this a noop when the tag has already been parsed.
In my clone of linux.git, with 782 tags (and only 3 non-tags), this
yields a significant speedup (bringing us back where we were before the
commit before this one started recursively dereferencing tags):
Benchmark 1: ./git.old for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 20.3 ms ± 0.5 ms [User: 11.1 ms, System: 9.1 ms]
Range (min … max): 19.6 ms … 21.5 ms 141 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 11.4 ms ± 0.2 ms [User: 6.3 ms, System: 5.0 ms]
Range (min … max): 11.0 ms … 12.2 ms 250 runs
Summary
'./git.new for-each-ref --points-at=HEAD --format="%(refname)"' ran
1.79 ± 0.05 times faster than './git.old for-each-ref --points-at=HEAD --format="%(refname)"'
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jan Klötzke [Sat, 1 Jul 2023 20:57:02 +0000 (22:57 +0200)]
ref-filter: handle nested tags in --points-at option
Tags are dereferenced until reaching a different object type to handle
nested tags, e.g. on checkout. In contrast, "git tag --points-at=..."
fails to list such nested tags because only one level of indirection is
obtained in filter_refs(). Implement the recursive dereferencing for the
"--points-at" option when filtering refs to unify the behaviour.
Signed-off-by: Jan Klötzke <jan@kloetzke.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 17 Jul 2023 18:30:42 +0000 (11:30 -0700)]
Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
Junio C Hamano [Mon, 17 Jul 2023 18:30:41 +0000 (11:30 -0700)]
Merge branch 'pw/diff-no-index-from-named-pipes'
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.
* pw/diff-no-index-from-named-pipes:
diff --no-index: support reading from named pipes
t4054: test diff --no-index with stdin
diff --no-index: die on error reading stdin
diff --no-index: refuse to compare stdin to a directory
René Scharfe [Sun, 16 Jul 2023 08:52:33 +0000 (10:52 +0200)]
strbuf: use skip_prefix() in strbuf_addftime()
Use the now common skip_prefix() cascade instead of a case statement to
parse the strftime(3) format in strbuf_addftime(). skip_prefix() parses
the "fmt" pointer and advances it appropriately, making additional
pointer arithmetic unnecessary. The resulting code is more compact and
consistent with most other strbuf_expand_step() loops.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 16 Jul 2023 08:17:35 +0000 (10:17 +0200)]
t6300: fix setup with GPGSSH but without GPG
In a test introduced by 26c9c03f0a (ref-filter: add new "signature"
atom, 2023-06-04) the file named "file" is added by a setup step that
requires GPG and modified by a second setup step that requires GPGSSH.
Systems lacking the first prerequisite skip the initial setup step and
then "git commit -a" in the second one doesn't find the modified file.
Add it explicitly.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"imap-send" codepaths got cleaned up to get rid of unused
parameters.
* jk/imap-send-unused-variable-cleanup:
imap-send: drop unused fields from imap_cmd_cb
imap-send: drop unused parameter from imap_cmd_cb callback
imap-send: use server conf argument in setup_curl()
D. Ben Knoble [Fri, 14 Jul 2023 13:30:10 +0000 (13:30 +0000)]
t4002: fix "diff can read from stdin" syntax
I noticed this test was producing output like
```
t4002-diff-basic.sh: test_expect_successdiff can read from stdin: not found
```
which is rather odd. Investigation shows an error of shell syntax:
foo'abc' is the same as fooabc to the shell. Perhaps obviously, this is
not a valid command for the test.
I am surprised this doesn't count as an error in the test, but that
accounts for it going unnoticed.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:13 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `merge_commit_graph()`
When merging two commit graphs, ensure that we don't attempt to merge
two graphs which, when combined, have more total commits than the 32-bit
unsigned maximum.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:11 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
In a similar spirit as previous commits, ensure that we don't overflow
when choosing how to split and merge different layers of the
commit-graph.
In particular, avoid a potential overflow between `size_mult` and
`num_commits`, as well as a potential overflow between the number of
commits currently in the merged graph, and the number of commits in the
graph about to be merged.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:08 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `load_tree_for_commit()`
In a similar spirit as previous commits, ensure that we don't overflow
when computing an offset into the commit_data chunk when the (relative)
graph position exceeds 2^32-1/GRAPH_DATA_WIDTH.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:05 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `fill_commit_in_graph()`
In a similar spirit as previous commits, ensure that we don't overflow
when the lex_index of the commit we are trying to fill out exceeds
2^32-1/(g->hash_len+16).
The other hunk touched in this patch is not susceptible to overflow,
since an explicit cast is made to a 64-bit unsigned value. For clarity
and consistency with the rest of the commits in this series, avoid a
tricky to reason about cast, and use `st_mult()` directly.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:03 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `fill_commit_graph_info()`
In a similar spirit as previous commits, ensure that we don't overflow
in a few spots within `fill_commit_graph_info()`:
- First, when computing an offset into the commit data chunk, which
can occur when the `lex_index` of the item we're looking up exceeds
2^32-1/GRAPH_DATA_WIDTH.
- A similar issue when computing the generation date offset for
commits with `lex_index` greater than 2^32-1/4. Note that in
practice this will never overflow, since the left-hand operand is
from calling `sizeof(...)` and is thus already a `size_t`. But wrap
that in an `st_mult()` to make it clear that we intend to perform
this computation using 64-bit operands.
- Finally, a nearly identical issue as above when computing an offset
into the `generation_data_overflow` chunk.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:38:00 +0000 (19:38 -0400)]
commit-graph.c: prevent overflow in `load_oid_from_graph()`
In a similar spirit as previous commits, ensure that we don't overflow
when trying to compute an offset into the `chunk_oid_lookup` table when
the `lex_index` of the item we're trying to look up exceeds
`2^32-1/g->hash_len`.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:57 +0000 (19:37 -0400)]
commit-graph.c: prevent overflow in add_graph_to_chain()
The commit-graph uses a fanout table with 4-byte entries to store the
number of commits at each shard of the commit-graph. So it is OK to have
a commit graph with as many as 2^32-1 stored commits. But we risk
overflowing any computation which may exceed the 32-bit (unsigned)
maximum when those computations are (incorrectly) performed using 32-bit
operands.
There are a couple of spots in `add_graph_to_chain()` where we could
potentially overflow the result:
- First, when comparing the list of existing entries in the
commit-graph chain. It is unlikely that this should ever overflow,
since it would require having roughly 2^32-1/g->hash_len
commit-graphs in the chain. But let's guard that computation with a
`st_mult()` just to be safe.
- Second, when computing the number of commits in the graph added to
the front of the chain. This value is also a 32-bit unsigned, but we
should make sure that it does not grow beyond the maximum value.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:54 +0000 (19:37 -0400)]
commit-graph.c: prevent overflow in `write_commit_graph_file()`
When writing a commit-graph, we use the chunk-format API to write out
each individual chunk of the commit-graph. Each chunk of the
commit-graph is tracked via a call to `add_chunk()`, along with the
expected size of that chunk.
Similar to an earlier commit which handled the identical issue in the
MIDX machinery, guard against overflow when dealing with a commit-graph
with a large number of entries to avoid corrupting the contents of the
commit-graph itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:52 +0000 (19:37 -0400)]
pack-bitmap.c: ensure that eindex lookups don't overflow
When a bitmap is used to answer some reachability query, it creates a
pseudo-bitmap called the "extended index" on top of any existing bitmaps
to store objects that are relevant to the query, but not mentioned in
the bitmap.
When looking up the ith object in the extended index in a bitmap, it is
common to write something like:
bitmap_get(result, i + bitmap_num_objects(bitmap_git))
, indicating that we want the ith object following all other objects
mentioned in the bitmap_git.
Since the type of `i` and the return type of `bitmap_num_objects()` are
both `uint32_t`s, But if there are either a large number of objects in
the bitmap, or a large number of objects in the extended index (or
both), this addition can overflow when the sum is greater than 2^32-1.
Having that large of a bitmap position is entirely acceptable, but we
need to ensure that the computed bitmap position for that object is
performed using 64-bits and doesn't overflow.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:49 +0000 (19:37 -0400)]
midx.c: prevent overflow in `fill_included_packs_batch()`
In a similar spirit as in previous commits, avoid an integer overflow
when computing the expected size of a MIDX.
(Note that this is also OK as-is, since `p->pack_size` is an `off_t`, so
this computation should already be done as 64-bit integers. But again,
let's use `st_mult()` to make this fact clear).
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:46 +0000 (19:37 -0400)]
midx.c: prevent overflow in `write_midx_internal()`
When writing a MIDX, we use the chunk-format API to write out each
individual chunk of the MIDX. Each chunk of the MIDX is tracked via a
call to `add_chunk()`, along with the expected size of that chunk.
Guard against overflow when dealing with a MIDX with a large number of
entries (and consequently, large chunks within the MIDX file itself) to
avoid corrupting the contents of the MIDX itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:44 +0000 (19:37 -0400)]
midx.c: store `nr`, `alloc` variables as `size_t`'s
In the `write_midx_context` structure, we use two `uint32_t`'s to track
the length and allocated size of the packs, and one `uint32_t` to track
the number of objects in the MIDX.
In practice, having these be 32-bit unsigned values shouldn't cause any
problems since we are unlikely to have that many objects or packs in any
real-world repository. But these values should be `size_t`'s, so change
their type to reflect that.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:41 +0000 (19:37 -0400)]
midx.c: prevent overflow in `nth_midxed_offset()`
In a similar spirit as previous patches, avoid an overflow when looking
up object offsets in the MIDX's large offset table by guarding the
computation via `st_mult()`.
This instance is also OK as-is, since the left operand is the result of
`sizeof(...)`, which is already a `size_t`. But use `st_mult()` instead
here to make it explicit that this computation is to be performed using
64-bit unsigned integers.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:38 +0000 (19:37 -0400)]
midx.c: prevent overflow in `nth_midxed_object_oid()`
In a similar spirit as previous commits, avoid overflow when looking up
an object's OID in a MIDX when its position is greater than
`2^32-1/m->hash_len`.
As usual, it is perfectly OK for a MIDX to have as many as 2^32-1
objects (since we use 32-bit fields to count the number of objects at
each fanout layer). But if we have more than `2^32-1/m->hash_len` number
of objects, we will incorrectly perform the computation using 32-bit
integers, overflowing the result.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:36 +0000 (19:37 -0400)]
midx.c: use `size_t`'s for fanout nr and alloc
The `midx_fanout` struct is used to keep track of a set of OIDs
corresponding to each layer of the MIDX's fanout table. It stores an
array of entries, along with the number of entries in the table, and the
allocated size of the array.
Both `nr` and `alloc` are stored as 32-bit unsigned integers. In
practice, this should never cause any problems, since most packs have
far fewer than 2^32-1 objects.
But storing these as `size_t`'s is more appropriate, and prevents us
from accidentally overflowing some result when multiplying or adding to
either of these values. Update these struct members to be `size_t`'s as
appropriate.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Wed, 12 Jul 2023 23:37:32 +0000 (19:37 -0400)]
packfile.c: use checked arithmetic in `nth_packed_object_offset()`
In a similar spirit as the previous commits, ensure that we use
`st_add()` or `st_mult()` when computing values that may overflow the
32-bit unsigned limit.
Note that in each of these instances, we prevent 32-bit overflow
already since we have explicit casts to `size_t`.
So this code is OK as-is, but let's clarify it by using the `st_xyz()`
helpers to make it obvious that we are performing the relevant
computations using 64 bits.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Fri, 14 Jul 2023 00:54:54 +0000 (20:54 -0400)]
packfile.c: prevent overflow in `load_idx()`
Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than 2^32-1/hashsz by guarding the
computation with an `st_mult()`.
Note that to avoid truncating the result, the `crc_offset` member must
itself become a `size_t`. The only usage of this variable (besides the
assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
index-pack code. There we use the `crc_offset` as a pointer offset, so
we are already equipped to handle the type change.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:36 +0000 (02:44 -0400)]
t/helper: mark unused callback void data parameters
Many callback interfaces have an extra void data parameter, but we don't
always need it (especially for dumping functions like the ones in test
helpers). Mark them as unused to avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:33 +0000 (02:44 -0400)]
tag: mark unused parameters in each_tag_name_fn callbacks
We iterate over the set of input tag names using callbacks. But not all
operations need the same inputs, so some parameters go unused (but of
course not the same ones for each operation). Mark the unused ones to
avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:28 +0000 (02:44 -0400)]
replace: mark unused parameter in each_mergetag_fn callback
We don't look at the "commit" parameter to our callback, as our
"mergetag_data" pointer contains the original name "ref", which we use
instead. But we can't get rid of it, since other for_each_mergetag
callbacks do use it. Let's mark the parameter to avoid
-Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:25 +0000 (02:44 -0400)]
replace: mark unused parameter in ref callback
We don't look at the "flags" parameter, which is natural for something
that is just printing the contents of the replace refs. But let's mark
it to appease -Wunused-parameter.
This probably should have been part of 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but I missed it as this one is a
repo_each_ref_fn, which takes an extra repository argument.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:21 +0000 (02:44 -0400)]
merge-tree: mark unused parameter in traverse callback
Our threeway_callback() does not bother to look at its "n" parameter. It
is static in this file and used only by trivial_merge_trees(), which
always passes 3 trees (hence the name "threeway"). It also does not look
at "dirmask". This is OK, as it handles directories specifically by
looking at the mode bits.
Other traverse_info callbacks need these, so we can't get drop them from
the interface. But let's annotate these ones to avoid complaints from
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:18 +0000 (02:44 -0400)]
fsck: mark unused parameters in various fsck callbacks
There are a few callback functions which are used with the fsck code,
but it's natural that not all callbacks need all parameters. For
reporting, even something as obvious as "the oid of the object which had
a problem" is not always used, as some callers are only checking a
single object in the first place. And for both reporting and walking,
things like void data pointers and the fsck_options aren't always
necessary.
But since each such parameter is used by _some_ callback, we have to
keep them in the interface. Mark the unused ones in specific callbacks
to avoid triggering -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 3 Jul 2023 06:44:16 +0000 (02:44 -0400)]
revisions: drop unused "opt" parameter in "tweak" callbacks
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.
But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.
We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>