SZEDER Gábor [Mon, 5 Sep 2022 18:50:07 +0000 (20:50 +0200)]
notes, remote: show unknown subcommands between `'
Update the "unknown subcommand" error message in 'git notes' and 'git
remote' to wrap the offending argument between `', to make it
consistent with the "unknown switch/option/subcommand" error messages
in parse-options.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git notes' has a default operation mode, but when invoked without a
subcommand it doesn't accept any arguments (although the 'list'
subcommand implementing the default operation mode does accept
arguments). The condition checking this ended up a bit awkward, so
let's make it clearer.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 25 Aug 2022 10:51:40 +0000 (06:51 -0400)]
remote: run "remote rm" argv through parse_options()
The "git remote rm" command's option parsing is fairly primitive: it
insists on a single argument, which it treats as the remote name, and
displays a usage message otherwise.
This is OK, and maybe even convenient, as you could run:
git remote rm --foo
to drop a remote named "--foo". But it's also weirdly unlike most of the
rest of Git, which would complain that there is no option "--foo". The
right way to spell it by our conventions is:
git remote rm -- --foo
but this doesn't currently work.
So let's bring the command in line with the rest of Git (including its
sibling subcommands!) by feeding argv to parse_options(). We already
have an empty options array for the usage helper.
Note that we have to adjust the argc index down by one, as
parse_options() eats the program name from the start of the array.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 25 Aug 2022 10:51:06 +0000 (06:51 -0400)]
maintenance: add parse-options boilerplate for subcommands
Several of the git-maintenance subcommands don't take any options, so
they don't bother looking at argv at all. This means they'll silently
accept garbage, like:
$ git maintenance register --foo
[no output]
$ git maintenance stop bar
[no output]
Let's give them the basic boilerplate to detect and handle these cases:
$ git maintenance stop bar
usage: git maintenance stop
We could reduce the number of lines of code here a bit with a shared
helper function. But it's worth building out the boilerplate, as it may
serve as the base for adding options later.
Note one complication: maintenance_start() calls directly into
maintenance_register(), so it now needs to pass a plausible argv (we
don't care, but parse_options() is expecting there to at least be an
argv[0] program name). This is an extra line of code, but it eliminates
the need for an explanatory comment.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 25 Aug 2022 10:47:00 +0000 (06:47 -0400)]
pass subcommand "prefix" arguments to parse_options()
Recent commits such as bf0a6b65fc (builtin/multi-pack-index.c: let
parse-options parse subcommands, 2022-08-19) converted a few functions
to match our usual argc/argv/prefix conventions, but the prefix argument
remains unused.
However, there is a good use for it: they should pass it to their own
parse_options() functions, where it may be used to adjust the value of
any filename options. In all but one of these functions, there's no
behavior change, since they don't use OPT_FILENAME. But this is an
actual fix for one option, which you can see by modifying the test suite
like so:
diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index 4fe57414c1..d0974d4371 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -186,7 +186,11 @@ test_expect_success 'writing a bitmap with --refs-snapshot' '
# Then again, but with a refs snapshot which only sees
# refs/tags/one.
- git multi-pack-index write --bitmap --refs-snapshot=snapshot &&
+ (
+ mkdir subdir &&
+ cd subdir &&
+ git multi-pack-index write --bitmap --refs-snapshot=../snapshot
+ ) &&
I'd emphasize that this wasn't broken by bf0a6b65fc; it has been broken
all along, because the sub-function never got to see the prefix. It is
that commit which is actually enabling us to fix it (and which also
brought attention to the problem because it triggers -Wunused-parameter!)
The other functions changed here don't use OPT_FILENAME at all. In their
cases this isn't fixing anything visible, but it's following the usual
pattern and future-proofing them against somebody adding new options and
being surprised.
I didn't include a test for the one visible case above. We don't
generally test routine parse-options behavior for individual options.
The challenge here was finding the problem, and now that this has been
done, it's not likely to regress. Likewise, we could apply the patch
above to cover it "for free" but it makes reading the rest of the test
unnecessarily complicated.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:11 +0000 (18:04 +0200)]
builtin/worktree.c: let parse-options parse subcommands
'git worktree' parses its subcommands with a long list of if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:10 +0000 (18:04 +0200)]
builtin/stash.c: let parse-options parse subcommands
'git stash' parses its subcommands with a long list of if-else if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
and listing subcommands for Bash completion.
Note that the push_stash() function implementing the 'push' subcommand
accepts an extra flag parameter to indicate whether push was assumed,
so add a wrapper function with the standard subcommand function
signature.
Note also that this change "hides" the '-h' option in 'git stash push
-h' from the parse_option() call in cmd_stash(), as it comes after the
subcommand. Consequently, from now on it will emit the usage of the
'push' subcommand instead of the usage of 'git stash'. We had a
failing test for this case, which can now be flipped to expect
success.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:09 +0000 (18:04 +0200)]
builtin/sparse-checkout.c: let parse-options parse subcommands
'git sparse-checkout' parses its subcommands with a couple of if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
Note that some of the functions implementing each subcommand only
accept the 'argc' and '**argv' parameters, so add a (unused) '*prefix'
parameter to make them match the type expected by parse-options, and
thus avoid casting function pointers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:08 +0000 (18:04 +0200)]
builtin/remote.c: let parse-options parse subcommands
'git remote' parses its subcommands with a long list of if-else if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling unknown subcommands, and listing subcommands for Bash
completion. Make sure that the default operation mode doesn't accept
any arguments; and while at it remove the capitalization of the error
message and adjust the test checking it accordingly.
Note that 'git remote' has both 'remove' and 'rm' subcommands, and the
former is preferred [1], so hide the latter for completion.
Note also that the functions implementing each subcommand only accept
the 'argc' and '**argv' parameters, so add a (unused) '*prefix'
parameter to make them match the type expected by parse-options, and
thus avoid casting a bunch of function pointers.
[1] e17dba8fe1 (remote: prefer subcommand name 'remove' to 'rm',
2012-09-06)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:07 +0000 (18:04 +0200)]
builtin/reflog.c: let parse-options parse subcommands
'git reflog' parses its subcommands with a couple of if-else if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
and listing subcommands for Bash completion.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:06 +0000 (18:04 +0200)]
builtin/notes.c: let parse-options parse subcommands
'git notes' parses its subcommands with a long list of if-else if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling unknown subcommands, and listing subcommands for Bash
completion. Make sure that the default operation mode doesn't accept
any arguments.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:05 +0000 (18:04 +0200)]
builtin/multi-pack-index.c: let parse-options parse subcommands
'git multi-pack-index' parses its subcommands with a couple of if-else
if statements. parse-options has just learned to parse subcommands,
so let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
Note that the functions implementing each subcommand only accept the
'argc' and '**argv' parameters, so add a (unused) '*prefix' parameter
to make them match the type expected by parse-options, and thus avoid
casting function pointers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:04 +0000 (18:04 +0200)]
builtin/hook.c: let parse-options parse subcommands
'git hook' parses its currently only subcommand with an if statement.
parse-options has just learned to parse subcommands, so let's use that
facility instead, with the benefits of shorter code, handling missing
or unknown subcommands, and listing subcommands for Bash completion.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:03 +0000 (18:04 +0200)]
builtin/gc.c: let parse-options parse 'git maintenance's subcommands
'git maintenanze' parses its subcommands with a couple of if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
This change makes 'git maintenance' consistent with other commands in
that the help text shown for '-h' goes to standard output, not error,
in the exit code and error message on unknown subcommand, and the
error message on missing subcommand. There is a test checking these,
which is now updated accordingly.
Note that some of the functions implementing each subcommand don't
accept any parameters, so add the (unused) 'argc', '**argv' and
'*prefix' parameters to make them match the type expected by
parse-options, and thus avoid casting function pointers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:02 +0000 (18:04 +0200)]
builtin/commit-graph.c: let parse-options parse subcommands
'git commit-graph' parses its subcommands with an if-else if
statement. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
Note that the functions implementing each subcommand only accept the
'argc' and '**argv' parameters, so add a (unused) '*prefix' parameter
to make them match the type expected by parse-options, and thus avoid
casting function pointers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:01 +0000 (18:04 +0200)]
builtin/bundle.c: let parse-options parse subcommands
'git bundle' parses its subcommands with a couple of if-else if
statements. parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:04:00 +0000 (18:04 +0200)]
parse-options: add support for parsing subcommands
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.
Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.
The approach is guided by the following observations:
- Most subcommands [1] are implemented in dedicated functions, and
most of those functions [2] either have a signature matching the
'int cmd_foo(int argc, const char **argc, const char *prefix)'
signature of builtin commands or can be trivially converted to
that signature, because they miss only that last prefix parameter
or have no parameters at all.
- Subcommand arguments only have long form, and they have no double
dash prefix, no negated form, and no description, and they don't
take any arguments, and can't be abbreviated.
- There must be exactly one subcommand among the arguments, or zero
if the command has a default operation mode.
- All arguments following the subcommand are considered to be
arguments of the subcommand, and, conversely, arguments meant for
the subcommand may not preceed the subcommand.
So in the end subcommand declaration and parsing would look something
like this:
Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer. parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed. If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.
If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag. In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged. Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).
Some thoughts about the implementation:
- The same pointer to 'fn' must be specified as 'value' for each
OPT_SUBCOMMAND, because there can be only one set of mutually
exclusive subcommands; parse_options() will BUG() otherwise.
There are other ways to tell parse_options() where to put the
function associated with the subcommand given on the command line,
but I didn't like them:
- Change parse_options()'s signature by adding a pointer to
subcommand function to be set to the function associated with
the given subcommand, affecting all callsites, even those that
don't have subcommands.
- Introduce a specific parse_options_and_subcommand() variant
with that extra funcion parameter.
- I decided against automatically calling the subcommand function
from within parse_options(), because:
- There are commands that have to perform additional actions
after option parsing but before calling the function
implementing the specified subcommand.
- The return code of the subcommand is usually the return code
of the git command, but preserving the return code of the
automatically called subcommand function would have made the
API awkward.
- Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
option flag: we have two subcommands that are purposefully
excluded from completion ('git remote rm' and 'git stash save'),
so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
flag.
- Some of the 'parse_opt_flags' don't make sense with subcommands,
and using them is probably just an oversight or misunderstanding.
Therefore parse_options() will BUG() when invoked with any of the
following flags while the options array contains at least one
OPT_SUBCOMMAND:
- PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
arguments when encountering a "--" argument, so it doesn't
make sense to expect and keep one before a subcommand, because
it would prevent the parsing of the subcommand.
However, this flag is allowed in combination with the
PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
might be meaningful for the command's default operation mode,
e.g. to disambiguate refs and pathspecs.
- PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
tells parse_options() to stop as soon as it encouners a
non-option argument, but subcommands are by definition not
options... so how could they be parsed, then?!
- PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
unknown --options and then pass them to a different command or
subsystem. Surely if a command has subcommands, then this
functionality should rather be delegated to one of those
subcommands, and not performed by the command itself.
However, this flag is allowed in combination with the
PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
--options to the default operation mode.
- If the command with subcommands has a default operation mode, then
all arguments to the command must preceed the arguments of the
subcommand.
AFAICT we don't have any commands where this makes a difference,
because in those commands either only the command accepts any
arguments ('notes' and 'remote'), or only the default subcommand
('reflog' and 'stash'), but never both.
- The 'argv' array passed to subcommand functions currently starts
with the name of the subcommand. Keep this behavior. AFAICT no
subcommand functions depend on the actual content of 'argv[0]',
but the parse_options() call handling their options expects that
the options start at argv[1].
- To support handling subcommands programmatically in our Bash
completion script, 'git cmd --git-completion-helper' will now list
both subcommands and regular --options, if any. This means that
the completion script will have to separate subcommands (i.e.
words without a double dash prefix) from --options on its own, but
that's rather easy to do, and it's not much work either, because
the number of subcommands a command might have is rather low, and
those commands accept only a single --option or none at all. An
alternative would be to introduce a separate option that lists
only subcommands, but then the completion script would need not
one but two git invocations and command substitutions for commands
with subcommands.
Note that this change doesn't affect the behavior of our Bash
completion script, because when completing the --option of a
command with subcommands, e.g. for 'git notes --<TAB>', then all
subcommands will be filtered out anyway, as none of them will
match the word to be completed starting with that double dash
prefix.
[1] Except 'git rerere', because many of its subcommands are
implemented in the bodies of the if-else if statements parsing the
command's subcommand argument.
[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
because some of the functions implementing their subcommands take
special parameters.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This doesn't matter for the completion script, because field splitting
discards that space anyway.
However, later patches in this series will teach parse-options to
handle subcommands, and subcommands will be included in the completion
helper output as well. This will make the loop printing options (and
subcommands) a tad more complex, so I wanted to test the result. The
test would have to account for the presence of that leading space,
which bugged my OCD, so let's get rid of it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:58 +0000 (18:03 +0200)]
parse-options: clarify the limitations of PARSE_OPT_NODASH
Update the comment documenting 'struct option' to clarify that
PARSE_OPT_NODASH can only be an argumentless short option; see 51a9949eda (parseopt: add PARSE_OPT_NODASH, 2009-05-07).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:57 +0000 (18:03 +0200)]
parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown
arguments instead of erroring out". This is a bit misleading, as this
flag only applies to unknown --options, while non-option arguments are
kept even without this flag.
Update the description to clarify this, and rename the flag to
PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at
the flag name.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:56 +0000 (18:03 +0200)]
api-parse-options.txt: fix description of OPT_CMDMODE
The description of the 'OPT_CMDMODE' macro states that "enum_val is
set to int_var when ...", but it's the other way around, 'int_var' is
set to 'enum_val'. Fix this.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:55 +0000 (18:03 +0200)]
t0040-parse-options: test parse_options() with various 'parse_opt_flags'
In 't0040-parse-options.sh' we thoroughly test the parsing of all
types and forms of options, but in all those tests parse_options() is
always invoked with a 0 flags parameter.
Add a few tests to demonstrate how various 'enum parse_opt_flags'
values are supposed to influence option parsing.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:54 +0000 (18:03 +0200)]
t5505-remote.sh: check the behavior without a subcommand
'git remote' without a subcommand defaults to listing all remotes and
doesn't accept any arguments except the '-v|--verbose' option.
We are about to teach parse-options to handle subcommands, and update
'git remote' to make use of that new feature. So let's add some tests
to make sure that the upcoming changes don't inadvertently change the
behavior in these cases.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:53 +0000 (18:03 +0200)]
t3301-notes.sh: check that default operation mode doesn't take arguments
'git notes' without a subcommand defaults to listing all notes and
doesn't accept any arguments.
We are about to teach parse-options to handle subcommands, and update
'git notes' to make use of that new feature. So let's add a test to
make sure that the upcoming changes don't inadvertenly change the
behavior in this corner case.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Fri, 19 Aug 2022 16:03:52 +0000 (18:03 +0200)]
git.c: update NO_PARSEOPT markings
Our Bash completion script can complete --options for commands using
parse-options even when that command doesn't have a dedicated
completion function, but to do so the completion script must know
which commands use parse-options and which don't. Therefore, commands
not using parse-options are marked in 'git.c's command list with the
NO_PARSEOPT flag.
Update this list, and remove this flag from the commands that by now
use parse-options.
After this change we can TAB complete --options of the plumbing
commands 'commit-tree', 'mailinfo' and 'mktag'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 19 Jul 2022 23:40:19 +0000 (16:40 -0700)]
Merge branch 'll/curl-accept-language'
Earlier, HTTP transport clients learned to tell the server side
what locale they are in by sending Accept-Language HTTP header, but
this was done only for some requests but not others.
* ll/curl-accept-language:
remote-curl: send Accept-Language header to server
Junio C Hamano [Tue, 19 Jul 2022 23:40:17 +0000 (16:40 -0700)]
Merge branch 'jk/clone-unborn-confusion'
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.
* jk/clone-unborn-confusion:
clone: move unborn head creation to update_head()
clone: use remote branch if it matches default HEAD
clone: propagate empty remote HEAD even with other branches
clone: drop extra newline from warning message
Junio C Hamano [Mon, 18 Jul 2022 20:31:56 +0000 (13:31 -0700)]
Merge branch 'ab/cocci-unused'
Add Coccinelle rules to detect the pattern of initializing and then
finalizing a structure without using it in between at all, which
happens after code restructuring and the compilers fail to
recognize as an unused variable.
* ab/cocci-unused:
cocci: generalize "unused" rule to cover more than "strbuf"
cocci: add and apply a rule to find "unused" strbufs
cocci: have "coccicheck{,-pending}" depend on "coccicheck-test"
cocci: add a "coccicheck-test" target and test *.cocci rules
Makefile & .gitignore: ignore & clean "git.res", not "*.res"
Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
Junio C Hamano [Mon, 18 Jul 2022 20:31:56 +0000 (13:31 -0700)]
Merge branch 'en/merge-dual-dir-renames-fix'
Fixes a long-standing corner case bug around directory renames in
the merge-ort strategy.
* en/merge-dual-dir-renames-fix:
merge-ort: fix issue with dual rename and add/add conflict
merge-ort: shuffle the computation and cleanup of potential collisions
merge-ort: make a separate function for freeing struct collisions
merge-ort: small cleanups of check_for_directory_rename
t6423: add tests of dual directory rename plus add/add conflict
Junio C Hamano [Mon, 18 Jul 2022 20:31:55 +0000 (13:31 -0700)]
Merge branch 'ab/test-without-templates'
Tweak tests so that they still work when the "git init" template
did not create .git/info directory.
* ab/test-without-templates:
tests: don't assume a .git/info for .git/info/sparse-checkout
tests: don't assume a .git/info for .git/info/exclude
tests: don't assume a .git/info for .git/info/refs
tests: don't assume a .git/info for .git/info/attributes
tests: don't assume a .git/info for .git/info/grafts
tests: don't depend on template-created .git/branches
t0008: don't rely on default ".git/info/exclude"
Junio C Hamano [Mon, 18 Jul 2022 20:31:55 +0000 (13:31 -0700)]
Merge branch 'ab/build-gitweb'
Teach "make all" to build gitweb as well.
* ab/build-gitweb:
gitweb/Makefile: add a "NO_GITWEB" parameter
Makefile: build 'gitweb' in the default target
gitweb/Makefile: include in top-level Makefile
gitweb: remove "test" and "test-installed" targets
gitweb/Makefile: prepare to merge into top-level Makefile
gitweb/Makefile: clear up and de-duplicate the gitweb.{css,js} vars
gitweb/Makefile: add a $(GITWEB_ALL) variable
gitweb/Makefile: define all .PHONY prerequisites inline
René Scharfe [Fri, 15 Jul 2022 03:58:50 +0000 (05:58 +0200)]
mingw: avoid mktemp() in mkstemp() implementation
The implementation of mkstemp() for MinGW uses mktemp() and open()
without the flag O_EXCL, which is racy. It's not a security problem
for now because all of its callers only create files within the
repository (incl. worktrees). Replace it with a call to our more
secure internal function, git_mkstemp_mode(), to prevent possible
future issues.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.
A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.
Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:
- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
or GIT_DIR.
If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.
Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.
As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].
`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`safe.bareRepository` should also be 'protected configuration only'. So,
for consistency, we'd like to have a single implementation for protected
configuration.
The primary constraints are:
1. Reading from protected configuration should be fast. Nearly all "git"
commands inside a bare repository will read both `safe.directory` and
`safe.bareRepository`, so we cannot afford to be slow.
2. Protected configuration must be readable when the gitdir is not
known. `safe.directory` and `safe.bareRepository` both affect
repository discovery and the gitdir is not known at that point [1].
The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.
Introduce git_protected_config(), which reads protected configuration
and caches them in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().
The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().
In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).
An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.
[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.
Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.
In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected configuration as
'protected configuration only', but this term is not used in the
documentation.
This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:
- System, global and command line config are considered protected
because an attacker who has control over any of those can do plenty of
harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
considered protected because it is relatively easy for an attacker to
control local config, e.g.:
- On some shared user environments, a non-admin attacker can create a
repository high up the directory hierarchy (e.g. C:\.git on
Windows), and a user may accidentally use it when their PS1
automatically invokes "git" commands.
`safe.directory` prevents attacks of this form by making sure that
the user intended to use the shared repository. It obviously
shouldn't be read from the repository, because that would end up
trusting the repository that Git was supposed to reject.
- "git upload-pack" is expected to run in repositories that may not be
controlled by the user. We cannot ignore all config in that
repository (because "git upload-pack" would fail), but we can limit
the risks by ignoring `uploadpack.packObjectsHook`.
Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:
- `safe.directory` should be 'protected configuration only', but it does
not technically fit the definition because it is not respected in the
"command" scope. A future commit will fix this.
- `trace2.*` happens to read the same scopes as `safe.directory` because
they share an implementation. However, this is not for security
reasons; it is because we want to start tracing so early that
repository-level config and "-c" are not available [2].
This requirement is unique to `trace2.*`, so it does not makes sense
for protected configuration to be subject to the same constraints.
[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/
Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.
145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.
Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.
Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 14 Jul 2022 22:03:59 +0000 (15:03 -0700)]
Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path between directories that are "in cone" (i.e. expected
to be materialized in the working tree) and "out of cone"
(i.e. expected to be hidden). The handling of such cases has been
improved.
* sy/mv-out-of-cone:
mv: add check_dir_in_index() and solve general dir check issue
mv: use flags mode for update_mode
mv: check if <destination> exists in index to handle overwriting
mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
mv: decouple if/else-if checks using goto
mv: update sparsity after moving from out-of-cone to in-cone
t1092: mv directory from out-of-cone to in-cone
t7002: add tests for moving out-of-cone file/directory
Junio C Hamano [Thu, 14 Jul 2022 22:03:59 +0000 (15:03 -0700)]
Merge branch 'hx/unpack-streaming'
Allow large objects read from a packstream to be streamed into a
loose object file straight, without having to keep it in-core as a
whole.
* hx/unpack-streaming:
unpack-objects: use stream_loose_object() to unpack large objects
core doc: modernize core.bigFileThreshold documentation
object-file.c: add "stream_loose_object()" to handle large object
object-file.c: factor out deflate part of write_loose_object()
object-file.c: refactor write_loose_object() to several steps
unpack-objects: low memory footprint for get_data() in dry_run mode
Junio C Hamano [Thu, 14 Jul 2022 22:03:58 +0000 (15:03 -0700)]
Merge branch 'en/merge-tree'
"git merge-tree" learned a new mode where it takes two commits and
computes a tree that would result in the merge commit, if the
histories leading to these two commits were to be merged.
* en/merge-tree:
git-merge-tree.txt: add a section on potentional usage mistakes
merge-tree: add a --allow-unrelated-histories flag
merge-tree: allow `ls-files -u` style info to be NUL terminated
merge-ort: optionally produce machine-readable output
merge-ort: store more specific conflict information
merge-ort: make `path_messages` a strmap to a string_list
merge-ort: store messages in a list, not in a single strbuf
merge-tree: provide easy access to `ls-files -u` style info
merge-tree: provide a list of which files have conflicts
merge-ort: remove command-line-centric submodule message from merge-ort
merge-ort: provide a merge_get_conflicted_files() helper function
merge-tree: support including merge messages in output
merge-ort: split out a separate display_update_messages() function
merge-tree: implement real merges
merge-tree: add option parsing and initial shell for real merge function
merge-tree: move logic for existing merge into new function
merge-tree: rename merge_trees() to trivial_merge_trees()
Junio C Hamano [Thu, 14 Jul 2022 22:03:58 +0000 (15:03 -0700)]
Merge branch 'gg/worktree-from-the-above'
In a non-bare repository, the behavior of Git when the
core.worktree configuration variable points at a directory that has
a repository as its subdirectory, regressed in Git 2.27 days.
* gg/worktree-from-the-above:
dir: minor refactoring / clean-up
dir: traverse into repository
When sorting the output of `git shortlog` by count, a list of authors in
alphabetical order is then sorted by contribution count. Obviously, the
idea is to maintain the alphabetical order for items with identical
contribution count.
At the moment, this job is performed by `qsort()`. As that function is
not guaranteed to implement a stable sort algorithm, this can lead to
inconsistent and/or surprising behavior: items with identical
contribution count could lose their alphabetical sub-order.
The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and under certain circumstances this even causes a test
failure in t4201.21 "shortlog can match multiple groups", where two
authors both are listed with 2 contributions, and are listed in inverse
alphabetical order.
Let's instead use the stable sort provided by `git_stable_qsort()` to
avoid this inconsistency.
This is a companion to 2049b8dc65 (diffcore_rename(): use a stable sort,
2019-09-30).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mergetool(vimdiff): allow paths to contain spaces again
In 0041797449d (vimdiff: new implementation with layout support,
2022-03-30), we introduced a completely new implementation of the
`vimdiff` backend for `git mergetool`.
In this implementation, we no longer call `vim` directly but we
accumulate in the variable `FINAL_CMD` an arbitrary number of commands
for `vim` to execute, which necessitates the use of `eval` to split the
commands properly into multiple command-line arguments.
That same `eval` command also needs to pass the paths to `vim`, and
while it looks as if they are quoted correctly, that quoting only
reaches the `eval` instruction and is lost after that, therefore paths
that contain whitespace characters (or other characters that are
interpreted by the POSIX shell) are handled incorrectly.
This is a simple reproducer:
git init -b main bam-merge-fail
cd bam-merge-fail
echo a>"a file.txt"
git add "a file.txt"
git commit -m "added 'a file.txt'"
echo b>"a file.txt"
git add "a file.txt"
git commit -m "diverged b 'a file.txt'"
git checkout -b c HEAD~
echo c>"a file.txt"
git add "a file.txt"
git commit -m "diverged c 'a file.txt'"
git checkout main
git merge c
git mergetool --tool=vimdiff
With Git v2.37.0/v2.37.1, this will open 7 buffers, not four, and not
display the correct contents at all.
To fix this, let's not expand the variables containing the path
parameters before passing them to the `eval` command, but let that
command expand the variables instead.
This fixes https://github.com/git-for-windows/git/issues/3945
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 78d5e4cfb4b (tests: refactor --write-junit-xml code, 2022-05-21),
this developer refactored the `--write-junit-xml` code a bit, including
the part where the current test case's title was used in a `set`
invocation, but failed to account for the fact that some test cases'
titles start with a long option, which the `set` misinterprets as being
intended for parsing.
Let's fix this by using the `set -- <...>` form.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 13 Jul 2022 21:54:54 +0000 (14:54 -0700)]
Merge branch 'zk/push-use-bitmaps'
"git push" sometimes perform poorly when reachability bitmaps are
used, even in a repository where other operations are helped by
bitmaps. The push.useBitmaps configuration variable is introduced
to allow disabling use of reachability bitmaps only for "git push".
Junio C Hamano [Wed, 13 Jul 2022 21:54:53 +0000 (14:54 -0700)]
Merge branch 'ro/mktree-allow-missing-fix'
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.
* ro/mktree-allow-missing-fix:
mktree: do not check type of remote objects
Han Xin [Tue, 12 Jul 2022 08:01:43 +0000 (16:01 +0800)]
t5330: remove run_with_limited_processses()
run_with_limited_processses() is used to end the loop faster when an
infinite loop happen. But "ulimit" is tied to the entire development
station, and the test will fail due to too many other processes or using
"--stress".
Without run_with_limited_processses() the infinite loop can also be
stopped due to global configrations or quotas, and the verification
still works fine. So let's remove run_with_limited_processses().
Signed-off-by: Han Xin <hanxin.hx@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 12 Jul 2022 07:03:45 +0000 (03:03 -0400)]
diff-files: move misplaced cleanup label
Commit 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13) converted an early return in
cmd_diff_files() into a goto. But it put the cleanup label too early: if
read_cache_preload() returns an error, we'll set result to "-1", but
then jump to calling run_diff_files(), overwriting our result.
We should jump past the call to run_diff_files(). Likewise, we should go
past diff_result_code(), which is expecting to see a code from an actual
diff, not a negative error code.
In practice, I suspect this bug cannot actually be triggered, because
read_cache_preload() does not seem to ever return an error. Its return
value (eventually) comes from do_read_index(), which gives the number of
cache entries found, and calls die() on error. Still, it makes sense to
fix the inadvertent change from 0139c58ab9 first, and we can look into
the overall error handling of read_cache() separately (which is present
in many other callsites).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 11 Jul 2022 23:25:14 +0000 (16:25 -0700)]
fsck: do not dereference NULL while checking resolve-undo data
When we found an invalid object recorded in the resolve-undo data,
we would have ended up dereferencing NULL while fsck. Reporting the
problem and going on to the next object is the right thing to do
here.
Junio C Hamano [Mon, 11 Jul 2022 22:38:51 +0000 (15:38 -0700)]
Merge branch 'rs/archive-with-internal-gzip'
Teach "git archive" to (optionally and then by default) avoid
spawning an external "gzip" process when creating ".tar.gz" (and
".tgz") archives.
* rs/archive-with-internal-gzip:
archive-tar: use internal gzip by default
archive-tar: use OS_CODE 3 (Unix) for internal gzip
archive-tar: add internal gzip implementation
archive-tar: factor out write_block()
archive: rename archiver data field to filter_command
archive: update format documentation
Junio C Hamano [Mon, 11 Jul 2022 22:38:51 +0000 (15:38 -0700)]
Merge branch 'ds/branch-checked-out'
Introduce a helper to see if a branch is already being worked on
(hence should not be newly checked out in a working tree), which
performs much better than the existing find_shared_symref() to
replace many uses of the latter.
* ds/branch-checked-out:
branch: drop unused worktrees variable
fetch: stop passing around unused worktrees variable
branch: fix branch_checked_out() leaks
branch: use branch_checked_out() when deleting refs
fetch: use new branch_checked_out() and add tests
branch: check for bisects and rebases
branch: add branch_checked_out() helper
Junio C Hamano [Mon, 11 Jul 2022 22:38:50 +0000 (15:38 -0700)]
Merge branch 'ac/bitmap-format-doc'
Adjust technical/bitmap-format to be formatted by AsciiDoc, and
add some missing information to the documentation.
* ac/bitmap-format-doc:
bitmap-format.txt: add information for trailing checksum
bitmap-format.txt: fix some formatting issues
bitmap-format.txt: feed the file to asciidoc to generate html
Junio C Hamano [Mon, 11 Jul 2022 22:38:49 +0000 (15:38 -0700)]
Merge branch 'pb/diff-doc-raw-format'
Update "git diff/log --raw" format documentation.
* pb/diff-doc-raw-format:
diff-index.txt: update raw output format in examples
diff-format.txt: correct misleading wording
diff-format.txt: dst can be 0* SHA-1 when path is deleted, too
Jeff King [Mon, 11 Jul 2022 14:48:06 +0000 (10:48 -0400)]
ref-filter: disable save_commit_buffer while traversing
Various ref-filter options like "--contains" or "--merged" may cause us
to traverse large segments of the history graph. It's counter-productive
to have save_commit_buffer turned on, as that will instruct the commit
code to cache in-memory the object contents for each commit we traverse.
This increases the amount of heap memory used while providing little or
no benefit, since we're not actually planning to display those commits
(which is the usual reason that tools like git-log want to keep them
around). We can easily disable this feature while ref-filter is running.
This lowers peak heap (as measured by massif) for running:
in linux.git from ~100MB to ~20MB. It also seems to improve runtime by
4-5% (600ms vs 630ms).
A few points to note:
- it should be safe to temporarily disable save_commit_buffer like
this. The saved buffers are accessed through get_commit_buffer(),
which treats the saved ones like a cache, and loads on-demand from
the object database on a cache miss. So any code that was using this
would not be wrong, it might just incur an extra object lookup for
some objects. But...
- I don't think any ref-filter related code is using the cache. While
it's true that an option like "--format=%(*contents:subject)" or
"--sort=*authordate" will need to look at the commit contents,
ref-filter doesn't use get_commit_buffer() to do so! It always reads
the objects directly via read_object_file(), though it does avoid
re-reading objects if the format can be satisfied without them.
Timing "git tag --format=%(*authordate)" shows that we're the same
before and after, as expected.
- Note that all of this assumes you don't have a commit-graph file. if
you do, then the heap usage is even lower, and the runtime is 10x
faster. So in that sense this is not urgent, as there's a much
better solution. But since it's such an obvious and easy win for
fallback cases (including commits which aren't yet in the graph
file), there's no reason not to.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 11 Jul 2022 09:21:52 +0000 (05:21 -0400)]
clone: move unborn head creation to update_head()
Prior to 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05),
creation of the local HEAD was always done in update_head(). That commit
added code to handle an unborn head in an empty repository, and just did
all symref creation and config setup there.
This makes the code flow a little bit confusing, especially as new
corner cases have been covered (like the previous commit to match our
default branch name to a non-HEAD remote branch).
Let's move the creation of the unborn symref into update_head(). This
matches the other HEAD-creation cases, and now the logic is consistently
separated: the main cmd_clone() function only examines the situation and
sets variables based on what it finds, and update_head() actually
performs the update.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>