Derrick Stolee [Thu, 16 Dec 2021 16:13:41 +0000 (16:13 +0000)]
sparse-checkout: fix OOM error with mixed patterns
Add a test to t1091-sparse-checkout-builtin.sh that would result in an
infinite loop and out-of-memory error before this change. The issue
relies on having non-cone-mode patterns while trying to modify the
patterns in cone-mode.
The fix is simple, allowing us to break from the loop when the input
path does not contain a slash, as the "dir" pattern we added does not.
This is only a fix to the critical out-of-memory error. A better
response to such a strange state will follow in a later change.
Reported-by: Calbabreaker <calbabreaker@gmail.com> Helped-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 16 Dec 2021 16:13:40 +0000 (16:13 +0000)]
sparse-checkout: fix segfault on malformed patterns
Then core.sparseCheckoutCone is enabled, the sparse-checkout patterns are
used to populate two hashsets that accelerate pattern matching. If the user
modifies the sparse-checkout file outside of the 'sparse-checkout' builtin,
then strange patterns can happen, triggering some error checks.
One of these error checks is possible to hit when some special characters
exist in a line. A warning message is correctly written to stderr, but then
there is additional logic that attempts to remove the line from the hashset
and free the data. This leads to a segfault in the 'git sparse-checkout
list' command because it iterates over the contents of the hashset, which is
now invalid.
The fix here is to stop trying to remove from the hashset. In addition,
we disable cone mode sparse-checkout because of the malformed data. This
results in the pattern-matching working with a possibly-slower
algorithm, but using the patterns as they are in the sparse-checkout
file.
This also changes the behavior of commands such as 'git sparse-checkout
list' because the output patterns will be the contents of the
sparse-checkout file instead of the list of directories. This is an
existing behavior for other types of bad patterns.
Add a test that triggers the segfault without the code change.
Reported-by: John Burnett <johnburnett@johnburnett.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 30 Dec 2021 20:18:35 +0000 (12:18 -0800)]
SubmittingPatchs: clarify choice of base and testing
We encourage identifying what, among many topics on `next`, exact
topics a new work depends on, instead of building directly on
`next`. Let's clarify this in the documentation.
Developers should know what they are building on top of, and be
aware of which part of the system is currently being worked on.
Encouraging them to make trial merges to `next` and `seen`
themselves will incentivize them to read others' changes and
understand them, eventually helping the developers to coordinate
among themselves and reviewing each others' changes.
cat-file: use GET_OID_ONLY_TO_DIE in --(textconv|filters)
Change the cat_one_file() logic that calls get_oid_with_context()
under --textconv and --filters to use the GET_OID_ONLY_TO_DIE flag,
thus improving the error messaging emitted when e.g. <path> is missing
but <rev> is not.
To service the "cat-file" use-case we need to introduce a new
"GET_OID_REQUIRE_PATH" flag, otherwise it would exit early as soon as
a valid "HEAD" was resolved, but in the "cat-file" case being changed
we always need a valid revision and path.
This arguably makes the "<bad rev>:<bad path>" and "<bad
rev>:<good (in HEAD) path>" use cases worse, as we won't quote the
<path> component at the user anymore, but let's just use the existing
logic "git log" et al use for now. We can improve the messaging for
those cases as a follow-up for all callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-name.c: don't have GET_OID_ONLY_TO_DIE imply *_QUIETLY
Stop having GET_OID_ONLY_TO_DIE imply GET_OID_QUIETLY in
get_oid_with_context_1().
The *_DIE flag was added in 33bd598c390 (sha1_name.c: teach lookup
context to get_sha1_with_context(), 2012-07-02), and then later
tweaked in 7243ffdd78d (get_sha1: avoid repeating ourselves via
ONLY_TO_DIE, 2016-09-26).
Everything in that commit makes sense, but only for callers that
expect to fail in an initial call to get_oid_with_context_1(), e.g. as
"git show 0017" does via handle_revision_arg(), and then would like to
call get_oid_with_context_1() again via this
maybe_die_on_misspelt_object_name() function.
In the subsequent commit we'll add a new caller that expects to call
this only once, but who would still like to have all the error
messaging that GET_OID_ONLY_TO_DIE gives it, in addition to any
regular errors.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the usage output emitted on "git cat-file -h" to group related
options, making it clear to users which options go with which other
ones.
The new output is:
Check object existence or emit object contents
-e check if <object> exists
-p pretty-print <object> content
Emit [broken] object attributes
-t show object type (one of 'blob', 'tree', 'commit', 'tag', ...)
-s show object size
--allow-unknown-type allow -s and -t to work with broken/corrupt objects
Batch objects requested on stdin (or --batch-all-objects)
--batch[=<format>] show full <object> or <rev> contents
--batch-check[=<format>]
like --batch, but don't emit <contents>
--batch-all-objects with --batch[-check]: ignores stdin, batches all known objects
Change or optimize batch output
--buffer buffer --batch output
--follow-symlinks follow in-tree symlinks
--unordered do not order objects before emitting them
Emit object (blob or tree) with conversion or filter (stand-alone, or with batch)
--textconv run textconv on object's content
--filters run filters on object's content
--path blob|tree use a <path> for (--textconv | --filters ); Not with 'batch'
The old usage was:
<type> can be one of: blob, tree, commit, tag
-t show object type
-s show object size
-e exit with zero when there's no error
-p pretty-print object's content
--textconv for blob objects, run textconv on object's content
--filters for blob objects, run filters on object's content
--batch-all-objects show all objects with --batch or --batch-check
--path <blob> use a specific path for --textconv/--filters
--allow-unknown-type allow -s and -t to work with broken/corrupt objects
--buffer buffer --batch output
--batch[=<format>] show info and content of objects fed from the standard input
--batch-check[=<format>]
show info about objects fed from the standard input
--follow-symlinks follow in-tree symlinks (used with --batch or --batch-check)
--unordered do not order --batch-all-objects output
While shorter, I think the new one is easier to understand, as
e.g. "--allow-unknown-type" is grouped with "-t" and "-s", as it can
only be combined with those options. The same goes for "--buffer",
"--unordered" etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the migration of --batch-all-objects to OPT_CMDMODE() in the
preceding commit one bug with combining it and other OPT_CMDMODE()
options was solved, but we were still left with e.g. --buffer silently
being discarded when not in batch mode.
Fix all those bugs, and in addition emit errors telling the user
specifically what options can't be combined with what other options,
before this we'd usually just emit the cryptic usage text and leave
the users to work it out by themselves.
This change is rather large, because to do so we need to untangle the
options processing so that we can not only error out, but emit
sensible errors, and e.g. emit errors about options before errors
about stray argc elements (as they might become valid if the option
were removed).
Some of the output changes ("error:" to "fatal:" with
usage_msg_opt[f]()), but none of the exit codes change, except in
those cases where we silently accepted bad option combinations before,
now we'll error out.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
the development of[3] the --batch-all-objects option[4], so we've
since grown[5] checks that it can't be combined with other command
modes, when it should just be made a top-level command-mode
instead. It doesn't combine with --filters, --textconv etc.
By giving parse_options() information about what options are mutually
exclusive with one another we can get the die() message being removed
here for free, we didn't even use that removed message in some cases,
e.g. for both of:
There's no benefit to defining this at a distance, and it makes the
code harder to read as you've got to scroll up to see the usage that
corresponds to the options.
In subsequent commits I'll make use of usage_msg_opt(), which will be
quite noisy if I have to use the long "cat_file_usage" variable,
there's no other command being defined in this file, so let's rename
it to just "usage".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were various inaccuracies in the previous SYNOPSIS output,
e.g. "--path" is not something that can optionally go with any options
except --textconv or --filters, as the output implied.
The opening line of the DESCRIPTION section is also "In its first
form[...]", which refers to "git cat-file <type> <object>", but the
SYNOPSIS section wasn't showing that as the first form!
That part of the documentation made sense in d83a42f34a6 (Documentation: minor grammatical fixes in
git-cat-file.txt, 2009-03-22) when it was introduced, but since then
various options that were added have made that intro make no sense in
the context it was in. Now the two will match again.
The usage output here is not properly aligned on "master" currently,
but will be with my in-flight 4631cfc20bd (parse-options: properly
align continued usage output, 2021-09-21), so let's indent things
correctly in the C code in anticipation of that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a usage_msg_optf() as a shorthand for the sort of
usage_msg_opt(xstrfmt(...)) used in builtin/stash.c. I'll make more
use of this function in builtin/cat-file.c shortly.
The disconnect between the "..." and "fmt" is a bit unusual, but it
works just fine and this keeps it consistent with usage_msg_opt(),
i.e. a caller of it can be moved to usage_msg_optf() and not have to
have its arguments re-arranged.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stress test the usage emitted when options are combined in ways that
isn't supported. Let's test various option combinations, some of these
we buggily allow right now.
E.g. this reveals a bug in 321459439e1 (cat-file: support
--textconv/--filters in batch mode, 2016-09-09) that we'll fix in a
subsequent commit. We're supposed to be emitting a relevant message
when --batch-all-objects is combined with --textconv or --filters, but
we don't.
The cases of needing to assign to opt=2 in the "opt" loop are because
on those we do the right thing already, in subsequent commits the
"test_expect_failure" cases will be fixed, and the for-loops unified.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 28 Dec 2021 00:20:46 +0000 (00:20 +0000)]
merge-ort: fix bug with renormalization and rename/delete conflicts
Ever since commit a492d5331c ("merge-ort: ensure we consult df_conflict
and path_conflicts", 2021-06-30), when renormalization is active AND a
file is involved in a rename/delete conflict BUT the file is unmodified
(either before or after renormalization), merge-ort was running into an
assertion failure. Prior to that commit (or if assertions were compiled
out), merge-ort would mis-merge instead, ignoring the rename/delete
conflict and just deleting the file.
Remove the assertions, fix the code appropriately, leave some good
comments in the code, and add a testcase for this situation.
Reported-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the --statistics flag that I added in 5e9637c6297 (i18n: add
infrastructure for translating Git with gettext, 2011-11-18). Our
Makefile output is good about reducing verbosity by default, except in
this case:
$ rm -rf po/build/locale/e*; time make -j $(nproc) all
SUBDIR templates
MKDIR -p po/build/locale/el/LC_MESSAGES
MSGFMT po/build/locale/el/LC_MESSAGES/git.mo
MKDIR -p po/build/locale/es/LC_MESSAGES
MSGFMT po/build/locale/es/LC_MESSAGES/git.mo
1038 translated messages, 3325 untranslated messages.
5230 translated messages.
I didn't have any good reason for using --statistics at the time other
than ad-hoc eyeballing of the output. We don't need to spew out
exactly how many messages we've got translated every time. Now we'll
instead emit:
$ rm -rf po/build/locale/e*; time make -j $(nproc) all
SUBDIR templates
MKDIR -p po/build/locale/el/LC_MESSAGES
MSGFMT po/build/locale/el/LC_MESSAGES/git.mo
MKDIR -p po/build/locale/es/LC_MESSAGES
MSGFMT po/build/locale/es/LC_MESSAGES/git.mo
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: move -DPAGER_ENV from BASIC_CFLAGS to EXTRA_CPPFLAGS
Remove -DPAGER_ENV from the BASIC_CFLAGS and instead have it passed
via the EXTRA_CPPFLAGS passed when compiling pager.c.
This doesn't change anything except to make it clear that only pager.c
needs this, as it's the only user of this define. See 995bc22d7f8 (pager: move pager-specific setup into the build,
2016-08-04) for the commit that originally added this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: correct the dependency graph of hook-list.h
Fix an issue in my cfe853e66be (hook-list.h: add a generated list of
hooks, like config-list.h, 2021-09-26), the builtin/help.c was
inadvertently made to depend on hook-list.h, but it's used by
builtin/bugreport.c.
The hook.c also does not depend on hook-list.h. It did in an earlier
version of the greater series cfe853e66be was extracted from, but not
anymore. We might end up needing that line again, but let's remove it
for now.
Reported-by: Mike Hommey <mh@glandium.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable $SHELL is usually set to the user's
interactive shell. Our build and test scripts never use $SHELL because
there are no guarantees about its input language. Instead, we use
/bin/sh which should be a POSIX shell.
For systems with a broken /bin/sh, we allow to override that path via
SHELL_PATH. To run tests in yet another shell we allow to override
SHELL_PATH with TEST_SHELL_PATH.
Perf tests run in $SHELL via a wrapper defined in t/perf/perf-lib.sh,
so they break with e.g. SHELL=python. Use TEST_SHELL_PATH like
in other tests. TEST_SHELL_PATH is always defined because
t/perf/perf-lib.sh includes t/test-lib.sh, which includes
GIT-BUILD-OPTIONS.
Acked-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Han-Wen Nienhuys [Thu, 23 Dec 2021 19:29:49 +0000 (19:29 +0000)]
reftable: signal overflow
reflog entries have unbounded size. In theory, each log ('g') block in reftable
can have an arbitrary size, so the format allows for arbitrarily sized reflog
messages. However, in the implementation, we are not scaling the log blocks up
with the message, and writing a large message fails.
This triggers a failure for reftable in t7006-pager.sh.
Until this is fixed more structurally, report an error from within the reftable
library for easier debugging.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 23 Dec 2021 06:48:11 +0000 (22:48 -0800)]
Merge branch 'es/chainlint'
The chainlint test script linter in the test suite has been updated.
* es/chainlint:
chainlint.sed: stop splitting "(..." into separate lines "(" and "..."
chainlint.sed: swallow comments consistently
chainlint.sed: stop throwing away here-doc tags
chainlint.sed: don't mistake `<< word` in string as here-doc operator
chainlint.sed: make here-doc "<<-" operator recognition more POSIX-like
chainlint.sed: drop subshell-closing ">" annotation
chainlint.sed: drop unnecessary distinction between ?!AMP?! and ?!SEMI?!
chainlint.sed: tolerate harmless ";" at end of last line in block
chainlint.sed: improve ?!SEMI?! placement accuracy
chainlint.sed: improve ?!AMP?! placement accuracy
t/Makefile: optimize chainlint self-test
t/chainlint/one-liner: avoid overly intimate chainlint.sed knowledge
t/chainlint/*.test: generalize self-test commentary
t/chainlint/*.test: fix invalid test cases due to mixing quote types
t/chainlint/*.test: don't use invalid shell syntax
Junio C Hamano [Thu, 23 Dec 2021 06:48:11 +0000 (22:48 -0800)]
Merge branch 'jz/apply-quiet-and-allow-empty'
"git apply" has been taught to ignore a message without a patch
with the "--allow-empty" option. It also learned to honor the
"--quiet" option given from the command line.
* jz/apply-quiet-and-allow-empty:
git-apply: add --allow-empty flag
git-apply: add --quiet flag
"git fetch --set-upstream" did not check if there is a current
branch, leading to a segfault when it is run on a detached HEAD,
which has been corrected.
* ab/fetch-set-upstream-while-detached:
pull, fetch: fix segfault in --set-upstream option
reflog + refs-backend: move "verbose" out of the backend
Move the handling of the "verbose" flag entirely out of
"refs/files-backend.c" and into "builtin/reflog.c". This allows the
backend to stop knowing about the EXPIRE_REFLOGS_VERBOSE flag.
The expire_reflog_ent() function shouldn't need to deal with the
implementation detail of whether or not we're emitting verbose output,
by doing this the --verbose output becomes backend-agnostic, so
reftable will get the same output.
I think the output is rather bad currently, and should e.g. be
implemented with some better future mode of progress.[ch], but that's
a topic for another improvement.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs files-backend: assume cb->newlog if !EXPIRE_REFLOGS_DRY_RUN
It's not possible for "cb->newlog" to be NULL if
!EXPIRE_REFLOGS_DRY_RUN, since files_reflog_expire() would have
error()'d and taken the "goto failure" branch if it couldn't open the
file. By not using the "newlog" field private to "file-backend.c"'s
"struct expire_reflog_cb", we can move this verbosity logging to
"builtin/reflog.c" in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "cmd.stalefix" handling added in 1389d9ddaa6 (reflog expire
--fix-stale, 2007-01-06) to use a locally scoped "struct
rev_info". This code relies on mark_reachable_objects() twiddling
flags in the walked objects.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog expire: don't use lookup_commit_reference_gently()
In the initial implementation of "git reflog" in 4264dc15e19 (git
reflog expire, 2006-12-19) we had this
lookup_commit_reference_gently().
I don't think we've ever found tags that we need to recursively
dereference in reflogs, so this should at least be changed to a
"lookup commit" as I'm doing here, although I can't think of a way
where it mattered in practice.
I also think we'd probably like to just die here if we have a NULL
object, but as this code needs to handle potentially broken
repositories let's just show an "error" but continue, the non-quiet
lookup_commit() will do for us. None of our tests cover the case where
"commit" is NULL after this lookup.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog expire: refactor & use "tip_commit" only for UE_NORMAL
Add an intermediate variable for "tip_commit" in
reflog_expiry_prepare(), and only add it to the struct if we're
handling the UE_NORMAL case.
The code behaves the same way as before, but this makes the control
flow clearer, and the shorter name allows us to fold a 4-line i/else
into a one-line ternary instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code added in 03cb91b18cc (reflog --expire-unreachable: special
case entries in "HEAD" reflog, 2010-04-09) to use a "switch" statement
with an exhaustive list of "case" statements instead of doing numeric
comparisons against the enum labels.
Now we won't assume that "x != UE_ALWAYS" means "(x == UE_HEAD || x ||
UE_NORMAL)". That assumption is true now, but we'd introduce subtle
bugs here if that were to change, now the compiler will notice and
error out on such errors.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog: change one->many worktree->refnames to use a string_list
Change the FLEX_ARRAY pattern added in bda3a31cc79 (reflog-expire:
Avoid creating new files in a directory inside readdir(3) loop,
2008-01-25) the string-list API instead.
This does not change any behavior, allows us to delete much of this
code as it's replaced by things we get from the string-list API for
free, as a result we need just one struct to keep track of this data,
instead of two.
The "DUP" -> "string_list_append_nodup(..., strbuf_detach(...))"
pattern here is the same as that used in a recent memory leak fix in b202e51b154 (grep: fix a "path_list" memory leak, 2021-10-22).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog expire: narrow scope of "cb" in cmd_reflog_expire()
As with the preceding change for "reflog delete", change the "cb_data"
we pass to callbacks to be &cb.cmd itself, instead of passing &cb and
having the callback lookup cb->cmd.
This makes it clear that the "cb" itself is the same memzero'd
structure on each iteration of the for-loops that use &cb, except for
the "cmd" member.
The "struct expire_reflog_policy_cb" we pass to reflog_expire() will
have the members that aren't "cmd" modified by the callbacks, but
before we invoke them everything except "cmd" is zero'd out.
This included the "tip_commit", "mark_list" and "tips". It might have
looked as though we were re-using those between iterations, but the
first thing we did in reflog_expiry_prepare() was to either NULL them,
or clobber them with another value.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Han-Wen Nienhuys [Wed, 22 Dec 2021 18:11:52 +0000 (18:11 +0000)]
refs: pass gitdir to packed_ref_store_create
This is consistent with the calling convention for ref backend creation, and
avoids storing ".git/packed-refs" (the name of a regular file) in a variable called
ref_store::gitdir.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joel Holdsworth [Wed, 22 Dec 2021 14:55:52 +0000 (14:55 +0000)]
git-p4: remove "rollback" verb
The "rollback" verb implements a simple algorithm which takes the set of
remote perforce tracker branches, or optionally, the complete collection
of local branches in a git repository, and deletes commits from these
branches until there are no commits left with a perforce change number
greater than than a user-specified change number. If the base of a git
branch has a newer change number than the user-specified maximum, then
the branch is deleted.
In future, there might be an argument for the addition of some kind of
"reset this branch back to a given perforce change number" verb for
git-p4. However, in its current form it is unlikely to be useful to
users for the following reasons:
* The verb is completely undocumented. The only description provided
contains the following text: "A tool to debug the multi-branch
import. Don't use :)".
* The verb has a very narrow purpose in that it applies the rollback
operation to fixed sets of branches - either all remote p4 branches,
or all local branches. There is no way for users to specify branches
with more granularity, for example, allowing users to specify a
single branch or a set of branches. The utility of the current
implementation is therefore a niche within a niche.
Given these shortcomings, this patch removes the verb from git-p4.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joel Holdsworth [Wed, 22 Dec 2021 14:55:51 +0000 (14:55 +0000)]
git-p4: remove "debug" verb
The git-p4 "debug" verb is described as "A tool to debug the output of
p4 -G".
The verb is not documented in any detail, but implements a function
which executes an arbitrary p4 command with the -G flag, which causes
perforce to format all output as marshalled Python dictionary objects.
The verb was implemented early in the history of git-p4, and may once
have served a useful purpose to the authors in the early stages of
development. However, the "debug" verb is no longer being used by the
current developers (and users) of git-p4, and whatever purpose the verb
previously offered is easily replaced by invoking p4 directly.
This patch therefore removes the verb from git-p4.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Han-Wen Nienhuys [Wed, 22 Dec 2021 10:59:43 +0000 (10:59 +0000)]
t7004: create separate tags for different tests
Reftable intentionally keeps reflog data for deleted refs.
This breaks tests that delete and recreate "refs/tags/tag_with_reflog" as traces
of the deletion are left in reflog. To resolve this, use a differently named ref
for each test case.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Wed, 22 Dec 2021 14:20:56 +0000 (14:20 +0000)]
test-read-cache: remove --table, --expand options
This commit effectively reverts 2782db3 (test-tool: don't force full
index, 2021-03-30) and e2df6c3 (test-read-cache: print cache entries
with --table, 2021-03-30) to remove the --table and --expand options
from 'test-tool read-cache'. The previous changes already removed these
options from the test suite in favor of 'git ls-files --sparse'.
The initial thought of creating these options was to allow for tests to
see additional information with every cache entry. In particular, the
object type is still not mirrored in 'git ls-files'. Since sparse
directory entries always end with a slash, the object type is not
critical to verify the sparse index is enabled. It was thought that it
would be helpful to have additional information, such as flags, but that
was not needed for the FS Monitor integration and hasn't been needed
since.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that 'git ls-files --sparse' exists, we can use it to verify the
state of a sparse index instead of 'test-tool read-cache table'. Replace
these usages within t1091-sparse-checkout-builtin.sh and
t3705-add-sparse-checkout.sh.
The important changes are due to the different output format. In t3705,
we need to use the '--stage' output to get a file mode and OID, but
it also includes a stage value and drops the object type. This leads
to some differences in how we handle looking for specific entries.
In t1091, the test focuses on enabling the sparse index, so we do not
need the --stage flag to demonstrate how the index changes, and instead
can use a diff.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Wed, 22 Dec 2021 14:20:54 +0000 (14:20 +0000)]
t1092: replace 'read-cache --table' with 'ls-files --sparse'
Now that 'git ls-files --sparse' exists, we can use it to verify the
state of a sparse index instead of 'test-tool read-cache table'. Replace
these usages within t1092-sparse-checkout-compatibility.sh.
The important changes are due to the different output format. We need to
use the '--stage' output to get a file mode and OID, but it also
includes a stage value and drops the object type. This leads to some
differences in how we handle looking for specific entries.
Some places where we previously looked for no 'tree' entries, we can
instead directly compare the output across the repository with a sparse
index and the one without.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Wed, 22 Dec 2021 14:20:53 +0000 (14:20 +0000)]
ls-files: add --sparse option
Existing callers to 'git ls-files' are expecting file names, not
directories. It is best to expand a sparse index to show all of the
contained files in this case.
However, expert users may want to inspect the contents of the index
itself including which directories are sparse. Add a --sparse option to
allow users to request this information.
During testing, I noticed that options such as --modified did not affect
the output when the files in question were outside the sparse-checkout
definition. Tests are added to document this preexisting behavior and
how it remains unchanged with the sparse index and the --sparse option.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Wed, 22 Dec 2021 14:20:52 +0000 (14:20 +0000)]
fetch/pull: use the sparse index
The 'git fetch' and 'git pull' commands parse the index in order to
determine if submodules exist. Without command_requires_full_index=0,
this will expand a sparse index, causing slow performance even when
there is no new data to fetch.
The .gitmodules file will never be inside a sparse directory entry, and
even if it was, the index_name_pos() method would expand the sparse
index if needed as we search for the path by name. These commands do not
iterate over the index, which is the typical thing we are careful about
when integrating with the sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 21 Dec 2021 23:03:17 +0000 (15:03 -0800)]
Merge branch 'js/scalar'
Add pieces from "scalar" to contrib/.
* js/scalar:
scalar: implement the `version` command
scalar: implement the `delete` command
scalar: teach 'reconfigure' to optionally handle all registered enlistments
scalar: allow reconfiguring an existing enlistment
scalar: implement the `run` command
scalar: teach 'clone' to support the --single-branch option
scalar: implement the `clone` subcommand
scalar: implement 'scalar list'
scalar: let 'unregister' handle a deleted enlistment directory gracefully
scalar: 'unregister' stops background maintenance
scalar: 'register' sets recommended config and starts maintenance
scalar: create test infrastructure
scalar: start documenting the command
scalar: create a rudimentary executable
scalar: add a README with a roadmap
Junio C Hamano [Tue, 21 Dec 2021 23:03:17 +0000 (15:03 -0800)]
Merge branch 'ld/sparse-diff-blame'
Teach diff and blame to work well with sparse index.
* ld/sparse-diff-blame:
blame: enable and test the sparse index
diff: enable and test the sparse index
diff: replace --staged with --cached in t1092 tests
repo-settings: prepare_repo_settings only in git repos
test-read-cache: set up repo after git directory
commit-graph: return if there is no git directory
git: ensure correct git directory setup with -h
Junio C Hamano [Tue, 21 Dec 2021 23:03:16 +0000 (15:03 -0800)]
Merge branch 'ak/protect-any-current-branch'
"git fetch" without the "--update-head-ok" option ought to protect
a checked out branch from getting updated, to prevent the working
tree that checks it out to go out of sync. The code was written
before the use of "git worktree" got widespread, and only checked
the branch that was checked out in the current worktree, which has
been updated.
(originally called ak/fetch-not-overwrite-any-current-branch)
* ak/protect-any-current-branch:
branch: protect branches checked out in all worktrees
receive-pack: protect current branch for bare repository worktree
receive-pack: clean dead code from update_worktree()
fetch: protect branches checked out in all worktrees
worktree: simplify find_shared_symref() memory ownership model
branch: lowercase error messages
receive-pack: lowercase error messages
fetch: lowercase error messages
Junio C Hamano [Tue, 21 Dec 2021 23:03:16 +0000 (15:03 -0800)]
Merge branch 'fs/ssh-signing-other-keytypes'
The cryptographic signing using ssh keys can specify literal keys
for keytypes whose name do not begin with the "ssh-" prefix by
using the "key::" prefix mechanism (e.g. "key::ecdsa-sha2-nistp256").
* fs/ssh-signing-other-keytypes:
ssh signing: make sign/amend test more resilient
ssh signing: support non ssh-* keytypes
Junio C Hamano [Tue, 21 Dec 2021 23:03:15 +0000 (15:03 -0800)]
Merge branch 'fs/ssh-signing-key-lifetime'
Extend the signing of objects with SSH keys and learn to pay
attention to the key validity time range when verifying.
* fs/ssh-signing-key-lifetime:
ssh signing: verify ssh-keygen in test prereq
ssh signing: make fmt-merge-msg consider key lifetime
ssh signing: make verify-tag consider key lifetime
ssh signing: make git log verify key lifetime
ssh signing: make verify-commit consider key lifetime
ssh signing: add key lifetime test prereqs
ssh signing: use sigc struct to pass payload
t/fmt-merge-msg: make gpgssh tests more specific
t/fmt-merge-msg: do not redirect stderr
When "git log" implicitly enabled the "decoration" processing
without being explicitly asked with "--decorate" option, it failed
to read and honor the settings given by the "--decorate-refs"
option.
* jk/log-decorate-opts-with-implicit-decorate:
log: load decorations with --simplify-by-decoration
log: handle --decorate-refs with userformat "%d"
Junio C Hamano [Tue, 21 Dec 2021 23:03:15 +0000 (15:03 -0800)]
Merge branch 'en/rebase-x-wo-git-dir-env'
"git rebase -x" by mistake started exporting the GIT_DIR and
GIT_WORK_TREE environment variables when the command was rewritten
in C, which has been corrected.
* en/rebase-x-wo-git-dir-env:
sequencer: do not export GIT_DIR and GIT_WORK_TREE for 'exec'
Josh Steadmon [Tue, 21 Dec 2021 19:00:42 +0000 (11:00 -0800)]
l10n: README: call more attention to plural strings
In po/README.md, we point core developers to gettext's "Preparing
Strings" documentation for advice on marking strings for translation.
However, this doc doesn't really discuss the issues around plural form
translation, which can make it seem that nothing special needs to be
done in this case.
Add a specific callout here about marking plural-form strings so that
the advice later on in the README is not overlooked.
Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
William Sprent [Thu, 16 Dec 2021 16:23:09 +0000 (16:23 +0000)]
fast-export: fix surprising behavior with --first-parent
The revision traversal machinery typically processes and returns all
children before any parent. fast-export needs to operate in the
reverse fashion, handling parents before any of their children in
order to build up the history starting from the root commit(s). This
would be a clear case where we could just use the revision traversal
machinery's "reverse" option to achieve this desired affect.
However, this wasn't what the code did. It added its own array for
queuing. The obvious hand-rolled solution would be to just push all
the commits into the array and then traverse afterwards, but it didn't
quite do that either. It instead attempted to process anything it
could as soon as it could, and once it could, check whether it could
process anything that had been queued. As far as I can tell, this was
an effort to save a little memory in the case of multiple root commits
since it could process some commits before queueing all of them. This
involved some helper functions named has_unshown_parent() and
handle_tail(). For typical invocations of fast-export, this
alternative essentially amounted to a hand-rolled method of reversing
the commits -- it was a bunch of work to duplicate the revision
traversal machinery's "reverse" option.
This hand-rolled reversing mechanism is actually somewhat difficult to
reason about. It takes some time to figure out how it ensures in
normal cases that it will actually process all traversed commits
(rather than just dropping some and not printing anything for them).
And it turns out there are some cases where the code does drop commits
without handling them, and not even printing an error or warning for
the user. Due to the has_unshown_parent() checks, some commits could
be left in the array at the end of the "while...get_revision()" loop
which would be unprocessed. This could be triggered for example with
git fast-export main -- --first-parent
or non-sensical traversal rules such as
git fast-export main -- --grep=Merge --invert-grep
While most traversals that don't include all parents should likely
trigger errors in fast-export (or at least require being used in
combination with --reference-excluded-parents), the --first-parent
traversal is at least reasonable and it'd be nice if it didn't just drop
commits. It'd also be nice for future readers of the code to have a
simpler "reverse traversal" mechanism. Use the "reverse" option of the
revision traversal machinery to achieve both.
Even for the non-sensical traversal flags like the --grep one above,
this would be an improvement. For example, in that case, the code
previously would have silently truncated history to only those commits
that do not have an ancestor containing "Merge" in their commit message.
After this code change, that case would include all commits without
"Merge" in their commit message -- but any commit that previously had a
"Merge"-mentioning parent would lose that parent
(likely resulting in many new root commits). While the new behavior is
still odd, it is at least understandable given that
--reference-excluded-parents is not the default.
Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: William Sprent <williams@unity3d.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Steadmon [Tue, 21 Dec 2021 03:30:24 +0000 (19:30 -0800)]
config: require lowercase for branch.*.autosetupmerge
Although we only documented that branch.*.autosetupmerge would accept
"always" as a value, the actual implementation would accept any
combination of upper- or lower-case. Fix this to be consistent with
documentation and with other values of this config variable.
Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Steadmon [Tue, 21 Dec 2021 03:30:23 +0000 (19:30 -0800)]
branch: add flags and config to inherit tracking
It can be helpful when creating a new branch to use the existing
tracking configuration from the branch point. However, there is
currently not a method to automatically do so.
Teach git-{branch,checkout,switch} an "inherit" argument to the
"--track" option. When this is set, creating a new branch will cause the
tracking configuration to default to the configuration of the branch
point, if set.
For example, if branch "main" tracks "origin/main", and we run
`git checkout --track=inherit -b feature main`, then branch "feature"
will track "origin/main". Thus, `git status` will show us how far
ahead/behind we are from origin, and `git pull` will pull from origin.
This is particularly useful when creating branches across many
submodules, such as with `git submodule foreach ...` (or if running with
a patch such as [1], which we use at $job), as it avoids having to
manually set tracking info for each submodule.
Since we've added an argument to "--track", also add "--track=direct" as
another way to explicitly get the original "--track" behavior ("--track"
without an argument still works as well).
Finally, teach branch.autoSetupMerge a new "inherit" option. When this
is set, "--track=inherit" becomes the default behavior.
Josh Steadmon [Tue, 21 Dec 2021 03:30:22 +0000 (19:30 -0800)]
branch: accept multiple upstream branches for tracking
Add a new static variant of install_branch_config() that accepts
multiple remote branch names for tracking. This will be used in an
upcoming commit that enables inheriting the tracking configuration from
a parent branch.
Currently, all callers of install_branch_config() pass only a single
remote. Make install_branch_config() a small wrapper around
install_branch_config_multiple_remotes() so that existing callers do not
need to be changed.
Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 20 Dec 2021 22:53:43 +0000 (14:53 -0800)]
merge: allow to pretend a merge is made into a different branch
When a series of patches for a topic-B depends on having topic-A,
the workflow to prepare the topic-B branch would look like this:
$ git checkout -b topic-B main
$ git merge --no-ff --no-edit topic-A
$ git am <mbox-for-topic-B
When topic-A gets updated, recreating the first merge and rebasing
the rest of the topic-B, all on detached HEAD, is a useful
technique. After updating topic-A with its new round of patches:
(0) check out the current topic-B.
(1) find the previous merge of topic-A into topic-B.
(2) detach the HEAD to the parent of the previous merge.
(3) merge the updated topic-A to it.
(4) reapply the patches to rebuild the rest of topic-B.
(5) update topic-B with the result.
without contaminating the reflog of topic-B too much. topic-B@{1}
is the "logically previous" state before topic-A got updated, for
example. At (4), comparison (e.g. range-diff) between HEAD and
@{-1} is a meaningful way to sanity check the result, and the same
can be done at (5) by comparing topic-B and topic-B@{1}.
But there is one glitch. The merge into the detached HEAD done in
the step (3) above gives us "Merge branch 'topic-A' into HEAD", and
does not say "into topic-B".
Teach the "--into-name=<branch>" option to "git merge" and its
underlying "git fmt-merge-message", to pretend as if we were merging
into <branch>, no matter what branch we are actually merging into,
when they prepare the merge message. The pretend name honors the
usual "into <target>" suppression mechanism, which can be seen in
the tests added here.
Joel Holdsworth [Sun, 19 Dec 2021 15:40:28 +0000 (15:40 +0000)]
git-p4: show progress as an integer
When importing files from Perforce, git-p4 periodically logs the
progress of file transfers as a percentage. However, the value is
printed as a float with an excessive number of decimal places.
For example a typical update might contain the following message:
Joel Holdsworth [Sun, 19 Dec 2021 15:40:27 +0000 (15:40 +0000)]
git-p4: print size values in appropriate units
The git-p4 script reports file sizes in various log messages.
Previously, in each case the script would print them as the number of
bytes divided by 1048576 i.e. the size in mebibytes, rounded down to an
integer. This resulted in small files being described as having a size
of "0 MB".
This patch replaces the existing behaviour with a new helper function:
format_size_human_readable, which takes a number of bytes (or any other
quantity), and computes the appropriate prefix to use: none, Ki, Mi, Gi,
Ti, Pi, Ei, Zi, Yi.
For example, a size of 123456 will now be printed as "120.6 KiB" greatly
improving the readability of the log output.
Large valued prefixes such as pebi, exbi, zebi and yobi are included for
completeness, though they not expected to appear in any real-world
Perforce repository!
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 18 Dec 2021 19:53:15 +0000 (20:53 +0100)]
grep/pcre2: factor out literal variable
Patterns that contain no wildcards and don't have to be case-folded are
literal. Give this condition a name to increase the readability of the
boolean expression for enabling the option PCRE2_UTF.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 18 Dec 2021 19:50:02 +0000 (20:50 +0100)]
grep/pcre2: use PCRE2_UTF even with ASCII patterns
compile_pcre2_pattern() currently uses the option PCRE2_UTF only for
patterns with non-ASCII characters. Patterns with ASCII wildcards can
match non-ASCII strings, though. Without that option PCRE2 mishandles
UTF-8 input, though -- it matches parts of multi-byte characters. Fix
that by using PCRE2_UTF even for ASCII-only patterns.
This is a remake of the reverted ae39ba431a (grep/pcre2: fix an edge
case concerning ascii patterns and UTF-8 data, 2021-10-15). The change
to the condition and the test are simplified and more targeted.
Original-patch-by: Hamza Mahfooz <someguy@effective-light.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jerry Zhang [Fri, 17 Dec 2021 23:29:02 +0000 (15:29 -0800)]
git-apply: skip threeway in add / rename cases
Certain invocations of "git apply --3way" will attempt threeway and
fail due to missing objects, even though git is able to fall back on
apply_fragments and apply the patch successfully with a return value
of 0. To fix, return early from try_threeway() in the following
cases:
- When the patch is a rename and no lines have changed. In this
case, "git diff" doesn't record the blob info, so 3way is neither
possible nor necessary.
- When the patch is an addition and there is no add/add conflict,
i.e. direct_to_threeway is false. In this case, threeway will
fail since the preimage is not in cache, but isn't necessary
anyway since there is no conflict.
This fixes a few unecessary error messages when applying these kinds
of patches with --3way.
It also fixes a reported issue where applying a concatenation of
several git produced patches will fail when those patches involve a
deletion followed by creation of the same file. Add a test for this
case too. (test provided by <i@zenithal.me>)
Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 20 Dec 2021 14:48:11 +0000 (14:48 +0000)]
repack: make '--quiet' disable progress
While testing some ideas in 'git repack', I ran it with '--quiet' and
discovered that some progress output was still shown. Specifically, the
output for writing the multi-pack-index showed the progress.
The 'show_progress' variable in cmd_repack() is initialized with
isatty(2) and is not modified at all by the '--quiet' flag. The
'--quiet' flag modifies the po_args.quiet option which is translated
into a '--quiet' flag for the 'git pack-objects' child process. However,
'show_progress' is used to directly send progress information to the
multi-pack-index writing logic which does not use a child process.
The fix here is to modify 'show_progress' to be false if po_opts.quiet
is true, and isatty(2) otherwise. This new expectation simplifies a
later condition that checks both.
Update the documentation to make it clear that '-q' will disable all
progress in addition to ensuring the 'git pack-objects' child process
will receive the flag.
Use 'test_terminal' to check that this works to get around the isatty(2)
check.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 20 Dec 2021 14:48:10 +0000 (14:48 +0000)]
repack: respect kept objects with '--write-midx -b'
Historically, we needed a single packfile in order to have reachability
bitmaps. This introduced logic that when 'git repack' had a '-b' option
that we should stop sending the '--honor-pack-keep' option to the 'git
pack-objects' child process, ensuring that we create a packfile
containing all reachable objects.
In the world of multi-pack-index bitmaps, we no longer need to repack
all objects into a single pack to have valid bitmaps. Thus, we should
continue sending the '--honor-pack-keep' flag to 'git pack-objects'.
The fix is very simple: only disable the flag when writing bitmaps but
also _not_ writing the multi-pack-index.
This opens the door to new repacking strategies that might want to keep
some historical set of objects in a stable pack-file while only
repacking more recent objects.
To test, create a new 'test_subcommand_inexact' helper that is more
flexible than 'test_subcommand'. This allows us to look for the
--honor-pack-keep flag without over-indexing on the exact set of
arguments.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Greg Hurrell [Fri, 17 Dec 2021 16:17:18 +0000 (17:17 +0100)]
docs: add missing colon to Documentation/config/gpg.txt
Add missing colon to ensure correct rendering of definition list
item. Without the proper number of colons, it renders as just another
top-level paragraph rather than a list item.
Signed-off-by: Greg Hurrell <greg@hurrell.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Fri, 17 Dec 2021 16:48:49 +0000 (17:48 +0100)]
log: let --invert-grep only invert --grep
The option --invert-grep is documented to filter out commits whose
messages match the --grep filters. However, it also affects the
header matches (--author, --committer), which is not intended.
Move the handling of that option to grep.c, as only the code there can
distinguish between matches in the header from those in the message
body. If --invert-grep is given then enable extended expressions (not
the regex type, we just need git grep's --not to work), negate the body
patterns and check if any of them match by piggy-backing on the
collect_hits mechanism of grep_source_1().
Collecting the matches in struct grep_opt is a bit iffy, but with
"last_shown" we have a precedent for writing state information to that
struct.
Reported-by: Dotan Cohen <dotancohen@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 16 Dec 2021 23:37:10 +0000 (15:37 -0800)]
format-patch: mark rev_info with UNLEAK
The comand uses a single instance of rev_info on stack, makes a
single revision traversal and exit. Mark the resources held by the
rev_info structure with UNLEAK().
We do not do this at lower level in revision.c or cmd_log_walk(), as
a new caller of the revision traversal API can make unbounded number
of rev_info during a single run, and UNLEAK() would not a be
suitable mechanism to deal with such a caller.
Junio C Hamano [Thu, 16 Dec 2021 23:11:12 +0000 (15:11 -0800)]
t4204 is not sanitizer clean at all
Earlier we marked that this patch-id test is leak-sanitizer clean,
but if we read the test script carefully, it is merely because we
have too many invocations of commands in the "git log" family on the
upstream side of the pipe, hiding breakages from them.
Split the pipeline so that breakages from these commands can be
caught (not limited to aborts due to leak-sanitizer) and unmark
the script as not passing the test with leak-sanitizer in effect.
Joel Holdsworth [Thu, 16 Dec 2021 13:46:19 +0000 (13:46 +0000)]
git-p4: resolve RCS keywords in bytes not utf-8
RCS keywords are strings that are replaced with information from
Perforce. Examples include $Date$, $Author$, $File$, $Change$ etc.
Perforce resolves these by expanding them with their expanded values
when files are synced, but Git's data model requires these expanded
values to be converted back into their unexpanded form.
Previously, git-p4.py would implement this behaviour through the use of
regular expressions. However, the regular expression substitution was
applied using decoded strings i.e. the content of incoming commit diffs
was first decoded from bytes into UTF-8, processed with regular
expressions, then converted back to bytes.
Not only is this behaviour inefficient, but it is also a cause of a
common issue caused by text files containing invalid UTF-8 data. For
files created in Windows, CP1252 Smart Quote Characters (0x93 and 0x94)
are seen fairly frequently. These codes are invalid in UTF-8, so if the
script encountered any file containing them, on Python 2 the symbols
will be corrupted, and on Python 3 the script will fail with an
exception.
This patch replaces this decoding/encoding with bytes object regular
expressions, so that the substitution is performed directly upon the
source data with no conversions.
A test for smart quote handling has been added to the
t9810-git-p4-rcs.sh test suite.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joel Holdsworth [Thu, 16 Dec 2021 13:46:18 +0000 (13:46 +0000)]
git-p4: open temporary patch file for write only
The patchRCSKeywords method creates a temporary file in which to store
the patched output data. Previously this file was opened in "w+" mode
(write and read), but the code never reads the contents of the file
while open, so it only needs to be opened in "w" mode (write-only).
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joel Holdsworth [Thu, 16 Dec 2021 13:46:17 +0000 (13:46 +0000)]
git-p4: add raw option to read_pipelines
Previously the read_lines function always decoded the result lines. In
order to improve support for non-decoded binary processing of data in
git-p4.py, this patch adds a raw option to the function that allows
decoding to be disabled.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Joel Holdsworth [Thu, 16 Dec 2021 13:46:16 +0000 (13:46 +0000)]
git-p4: pre-compile RCS keyword regexes
Previously git-p4.py would compile one of two regular expressions for
ever RCS keyword-enabled file. This patch improves simplifies the code
by pre-compiling the two regular expressions when the script first
loads.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
stash: don't show "git stash push" usage on bad "git stash" usage
Change the usage message emitted by "git stash --invalid-option" to
emit usage information for "git stash" in general, and not just for
the "push" command. I.e. before:
That we emitted the usage for just "push" in the case of the
subcommand not being explicitly specified was an unintentional
side-effect of how it was implemented. When it was converted to C in d553f538b8a (stash: convert push to builtin, 2019-02-25) the pattern
of having per-subcommand usage information was rightly continued. The
"git-stash.sh" shellscript did not have that, and always printed the
equivalent of "git_stash_usage".
But in doing so the case of push being implicit and explicit was
conflated. A variable was added to track this in 8c3713cede7 (stash:
eliminate crude option parsing, 2020-02-17), but it did not update the
usage output accordingly.
This still leaves e.g. "git stash push -h" emitting the
"git_stash_usage" output, instead of "git_stash_push_usage". That
should be fixed, but is a much deeper misbehavior in parse_options()
not being aware of subcommands at all. I.e. in how
PARSE_OPT_KEEP_UNKNOWN and PARSE_OPT_NO_INTERNAL_HELP combine in
commands such as "git stash".
Perhaps PARSE_OPT_KEEP_UNKNOWN should imply
PARSE_OPT_NO_INTERNAL_HELP, or better yet parse_options() should be
extended to fully handle these subcommand cases that we handle
manually in "git stash", "git commit-graph", "git multi-pack-index"
etc. All of those musings would be a much bigger change than this
isolated fix though, so let's leave that for some other time.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Kashav Madan [Wed, 15 Dec 2021 22:58:58 +0000 (22:58 +0000)]
help: make auto-correction prompt more consistent
There are three callsites of git_prompt() helper function that ask a
"yes/no" question to the end user, but one of them in help.c that
asks if the suggested auto-correction is OK, which is given when the
user makes a possible typo in a Git subcommand name, is formatted
differently from the others.
Update the format string to make the prompt string look more
consistent.
Signed-off-by: Kashav Madan <kshvmdn@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
徐沛文 (Aleen) [Thu, 9 Dec 2021 07:25:54 +0000 (07:25 +0000)]
am: support --empty=<option> to handle empty patches
Since that the command 'git-format-patch' can include patches of
commits that emit no changes, the 'git-am' command should also
support an option, named as '--empty', to specify how to handle
those empty patches. In this commit, we have implemented three
valid options ('stop', 'drop' and 'keep').
Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 15 Dec 2021 16:23:48 +0000 (11:23 -0500)]
doc/config: mark ssh allowedSigners example as literal
The discussion for gpg.ssh.allowedSignersFile shows an example string
that contains "user1@example.com,user2@example.com". Asciidoc thinks
these are real email addresses and generates "mailto" footnotes for
them. This makes the rendered content more confusing, as it has extra
"[1]" markers:
The file consists of one or more lines of principals followed by an
ssh public key. e.g.: user1@example.com[1],user2@example.com[2]
ssh-rsa AAAAX1... See ssh-keygen(1) "ALLOWED SIGNERS" for details.
and also generates pointless notes at the end of the page:
We can fix this by putting the example into a backtick literal block.
That inhibits the mailto generation, and as a bonus typesets the example
text in a way that sets it off from the regular prose (a tt font for
html, or bold in the roff manpage).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Vosmaer [Tue, 14 Dec 2021 19:46:26 +0000 (20:46 +0100)]
upload-pack.c: increase output buffer size
When serving a fetch, git upload-pack copies data from a git
pack-objects stdout pipe to its stdout. This commit increases the size
of the buffer used for that copying from 8192 to 65515, the maximum
sideband-64k packet size.
Previously, this buffer was allocated on the stack. Because the new
buffer size is nearly 64KB, we switch this to a heap allocation.
On GitLab.com we use GitLab's pack-objects cache which does writes of
65515 bytes. Because of the default 8KB buffer size, propagating these
cache writes requires 8 pipe reads and 8 pipe writes from
git-upload-pack, and 8 pipe reads from Gitaly (our Git RPC service).
If we increase the size of the buffer to the maximum Git packet size,
we need only 1 pipe read and 1 pipe write in git-upload-pack, and 1
pipe read in Gitaly to transfer the same amount of data. In benchmarks
with a pure fetch and 100% cache hit rate workload we are seeing CPU
utilization reductions of over 30%.
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 14 Dec 2021 04:09:10 +0000 (04:09 +0000)]
git-sparse-checkout.txt: update to document init/set/reapply changes
As noted in the previous commit, using separate `init` and `set` steps
with sparse-checkout result in a number of issues. The previous commits
made `set` able to handle the work of both commands, and enabled reapply
to tweak the {cone,sparse-index} settings. Update the documentation to
reflect this, and mark `init` as deprecated.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 14 Dec 2021 04:09:09 +0000 (04:09 +0000)]
sparse-checkout: enable reapply to take --[no-]{cone,sparse-index}
Folks may want to switch to or from cone mode, or to or from a
sparse-index without changing their sparsity paths. Allow them to do so
using the reapply command.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>