refs: add ability for backends to special-case reading of symbolic refs
Reading of symbolic and non-symbolic references is currently treated the
same in reference backends: we always call `refs_read_raw_ref()` and
then decide based on the returned flags what type it is. This has one
downside though: symbolic references may be treated differently from
normal references in a backend. The packed-refs
backend for example doesn't even know about symbolic references, and as
a result it is pointless to even ask it for one.
There are cases where we really only care about whether a reference is
symbolic or not, but don't care about whether it exists at all or may be
a non-symbolic reference. But it is not possible to optimize for this
case right now, and as a consequence we will always first check for a
loose reference to exist, and if it doesn't, we'll query the packed-refs
backend for a known-to-not-be-symbolic reference. This is inefficient
and requires us to search all packed references even though we know
that we do not care about the result at all.
Introduce a new function `refs_read_symbolic_ref()` which allows us to
fix this case. This function will only ever return symbolic references
and can thus optimize for the scenario laid out above. By default, if
the backend doesn't provide an implementation for it, we just use the
old code path and fall back to `read_raw_ref()`. But in case the backend
provides its own, more efficient implementation, we will use that one
instead.
Note that this function is explicitly designed to not distinguish
between missing references and non-symbolic references. If it did, we'd
be forced to always search the packed-refs backend to see whether the
symbolic reference the user asked for really doesn't exist, or if it
exists as a non-symbolic reference.
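The dispatch described above can be sketched as a toy model in Python. This is only an illustration and not git's C implementation: the classes `LooseBackend` and `PackedBackend`, their method names and the tuple-based return values are assumptions made up for this example.

```python
from typing import Dict, Optional, Tuple


class LooseBackend:
    """Toy backend that stores symbolic refs and has a specialised reader."""

    def __init__(self, symrefs: Dict[str, str], plain: Dict[str, str]):
        self.symrefs = symrefs  # refname -> target refname
        self.plain = plain      # refname -> object ID

    def read_raw_ref(self, name: str) -> Optional[Tuple[str, str]]:
        if name in self.symrefs:
            return ("symref", self.symrefs[name])
        if name in self.plain:
            return ("oid", self.plain[name])
        return None

    def read_symbolic_ref(self, name: str) -> Optional[str]:
        # Only symbolic refs are consulted; nothing else is searched.
        return self.symrefs.get(name)


class PackedBackend:
    """Toy backend without symbolic refs and without a specialised reader."""

    def __init__(self, plain: Dict[str, str]):
        self.plain = plain
        self.raw_reads = 0

    def read_raw_ref(self, name: str) -> Optional[Tuple[str, str]]:
        self.raw_reads += 1
        if name in self.plain:
            return ("oid", self.plain[name])
        return None


def read_symbolic_ref(backend, name: str) -> Optional[str]:
    """Return the symref target, or None for missing and non-symbolic refs alike."""
    specialised = getattr(backend, "read_symbolic_ref", None)
    if specialised is not None:
        return specialised(name)
    # Fallback: generic raw read, then inspect the returned type.
    result = backend.read_raw_ref(name)
    if result is not None and result[0] == "symref":
        return result[1]
    return None
```

Note how the wrapper returns `None` both for a missing reference and for a non-symbolic one, which mirrors the design decision explained above.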
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: avoid lookup of commits when not appending to FETCH_HEAD
When fetching from a remote repository we will by default write what has
been fetched into the special FETCH_HEAD reference. The order in which
references are written depends on whether the reference is for merge or
not, which, besides some other conditions, is also determined based on
whether the old object ID the reference is being updated from actually
exists in the repository.
To write FETCH_HEAD we thus loop through all references thrice: once for
the references that are about to be merged, once for the references that
are not for merge, and finally for all references that are ignored. For
every iteration, we then look up the old object ID to determine whether
the referenced object exists so that we can label it as "not-for-merge"
if it doesn't exist. It goes without saying that this can be expensive
in the case where we are fetching a lot of references.
While this is hard to avoid in the case where we're writing FETCH_HEAD,
users can in fact ask us to skip this work via `--no-write-fetch-head`.
In that case, we do not care for the result of those lookups at all
because we don't have to order writes to FETCH_HEAD in the first place.
Skip this busywork in case we're not writing to FETCH_HEAD. The
following benchmark performs a mirror-fetch in a repository with about
two million references via `git fetch --prune --no-write-fetch-head
+refs/*:refs/*`:
Benchmark 1: HEAD~
Time (mean ± σ): 75.388 s ± 1.942 s [User: 71.103 s, System: 8.953 s]
Range (min … max): 73.184 s … 76.845 s 3 runs
Benchmark 2: HEAD
Time (mean ± σ): 69.486 s ± 1.016 s [User: 65.941 s, System: 8.806 s]
Range (min … max): 68.864 s … 70.659 s 3 runs
Summary
'HEAD' ran
1.08 ± 0.03 times faster than 'HEAD~'
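The idea behind this change can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the code in builtin/fetch.c: the function `label_refs`, the dictionary keys and the label strings are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple


def label_refs(refs: List[Dict], object_exists: Callable[[str], bool],
               write_fetch_head: bool) -> List[Tuple[str, Optional[str]]]:
    """Pair each ref name with a label, or with None if nothing is written."""
    if not write_fetch_head:
        # No FETCH_HEAD, so no ordering to compute and no object lookups.
        return [(ref["name"], None) for ref in refs]
    labelled = []
    for ref in refs:
        # A ref whose old object is missing is demoted to "not-for-merge".
        for_merge = ref["for_merge"] and object_exists(ref["old_oid"])
        labelled.append((ref["name"], "merge" if for_merge else "not-for-merge"))
    return labelled
```

With `write_fetch_head=False` the `object_exists` callback is never invoked, which is where the saved time in the benchmark above would come from in this model.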
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack: look up "want" lines via commit-graph
During packfile negotiation the client will send "want" and "want-ref"
lines to the server to tell it which objects it is interested in. The
server side parses each of those and looks them up to see whether it
actually has the requested objects. This lookup is performed by calling
`parse_object()` directly, which thus hits the object database. In the
general case though most of the objects the client requests will be
commits. We can thus try to look up the object via the commit-graph
opportunistically, which is much faster than doing the same via the
object database.
Refactor parsing of both "want" and "want-ref" lines to do so.
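As a rough illustration of the opportunistic lookup, consider the following Python sketch. It only models the control flow; the `ObjectDatabase` class, the set standing in for the commit-graph and the function `lookup_want` are assumptions for this example and do not correspond to git's C API.

```python
from typing import Dict, Optional, Set, Tuple


class ObjectDatabase:
    """Toy object database that counts how often it is consulted."""

    def __init__(self, objects: Dict[str, Tuple[str, str]]):
        self.objects = objects
        self.lookups = 0

    def parse_object(self, oid: str) -> Optional[Tuple[str, str]]:
        self.lookups += 1
        return self.objects.get(oid)


def lookup_want(oid: str, commit_graph: Set[str],
                odb: ObjectDatabase) -> Optional[Tuple[str, str]]:
    """Try the cheap commit-graph first and fall back to the object database."""
    if oid in commit_graph:
        return ("commit", oid)
    return odb.parse_object(oid)
```

Commits known to the commit-graph never reach the object database in this model, while tags, unknown commits and missing objects take the slower path as before.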
The following benchmark is executed in a repository with a huge number
of references. It uses a cached request from git-fetch(1) as input to
git-upload-pack(1) that contains about 876,000 "want" lines:
Benchmark 1: HEAD~
Time (mean ± σ): 7.113 s ± 0.028 s [User: 6.900 s, System: 0.662 s]
Range (min … max): 7.072 s … 7.168 s 10 runs
Benchmark 2: HEAD
Time (mean ± σ): 6.622 s ± 0.061 s [User: 6.452 s, System: 0.650 s]
Range (min … max): 6.535 s … 6.727 s 10 runs
Summary
'HEAD' ran
1.07 ± 0.01 times faster than 'HEAD~'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 1 Mar 2022 18:11:00 +0000 (10:11 -0800)]
Merge branch 'ps/fetch-atomic' into ps/fetch-mirror-optim
* ps/fetch-atomic:
fetch: make `--atomic` flag cover pruning of refs
fetch: make `--atomic` flag cover backfilling of tags
refs: add interface to iterate over queued transactional updates
fetch: report errors when backfilling tags fails
fetch: control lifecycle of FETCH_HEAD in a single place
fetch: backfill tags before setting upstream
fetch: increase test coverage of fetches
Junio C Hamano [Fri, 18 Feb 2022 21:53:30 +0000 (13:53 -0800)]
Merge branch 'js/short-help-outside-repo-fix'
"git cmd -h" outside a repository should error out cleanly for many
commands, but instead it hit a BUG(), which has been corrected.
* js/short-help-outside-repo-fix:
t0012: verify that built-ins handle `-h` even without gitdir
checkout/fetch/pull/pack-objects: allow `-h` outside a repository
Junio C Hamano [Fri, 18 Feb 2022 21:53:29 +0000 (13:53 -0800)]
Merge branch 'gc/branch-recurse-submodules'
"git branch" learned the "--recurse-submodules" option.
* gc/branch-recurse-submodules:
branch.c: use 'goto cleanup' in setup_tracking() to fix memory leaks
branch: add --recurse-submodules option for branch creation
builtin/branch: consolidate action-picking logic in cmd_branch()
branch: add a dry_run parameter to create_branch()
branch: make create_branch() always create a branch
branch: move --set-upstream-to behavior to dwim_and_setup_tracking()
Because a deletion of a ref would need to remove it from both the
loose ref store and the packed ref store, a delete-ref operation
that logically removes one ref may end up invoking the
reference-transaction hook twice, which has been corrected.
* ps/avoid-unnecessary-hook-invocation-with-packed-refs:
refs: skip hooks when deleting uncovered packed refs
refs: do not execute reference-transaction hook on packing refs
refs: demonstrate excessive execution of the reference-transaction hook
refs: allow skipping the reference-transaction hook
refs: allow passing flags when beginning transactions
refs: extract packed_refs_delete_refs() to allow control of transaction
Use an internal call to reset_head() helper function instead of
spawning "git checkout" in "rebase", and update code paths that are
involved in the change.
* pw/use-in-process-checkout-in-rebase:
rebase -m: don't fork git checkout
rebase --apply: set ORIG_HEAD correctly
rebase --apply: fix reflog
reset_head(): take struct rebase_head_opts
rebase: cleanup reset_head() calls
create_autostash(): remove unneeded parameter
reset_head(): make default_reflog_action optional
reset_head(): factor out ref updates
reset_head(): remove action parameter
rebase --apply: don't run post-checkout hook if there is an error
rebase: do not remove untracked files on checkout
rebase: pass correct arguments to post-checkout hook
t5403: refactor rebase post-checkout hook tests
rebase: factor out checkout for up to date branch
"receive-pack" checks if it will do any ref updates (various
conditions could reject a push) before received objects are taken
out of the temporary directory used for quarantine purposes, so
that a push that is known-to-fail will not leave cruft that a
future "gc" needs to clean up.
* cb/clear-quarantine-early-on-all-ref-update-errors:
receive-pack: purge temporary data if no command is ready to run
Junio C Hamano [Fri, 18 Feb 2022 00:25:05 +0000 (16:25 -0800)]
Merge branch 'ab/complete-show-all-commands'
The command line completion script (in contrib/) learned to
complete all Git subcommands, including the ones that are normally
hidden, when GIT_COMPLETION_SHOW_ALL_COMMANDS is used.
* ab/complete-show-all-commands:
completion: add a GIT_COMPLETION_SHOW_ALL_COMMANDS
completion tests: re-source git-completion.bash in a subshell
Junio C Hamano [Fri, 18 Feb 2022 00:25:05 +0000 (16:25 -0800)]
Merge branch 'vd/sparse-clean-etc'
"git update-index", "git checkout-index", and "git clean" are
taught to work better with the sparse checkout feature.
* vd/sparse-clean-etc:
update-index: reduce scope of index expansion in do_reupdate
update-index: integrate with sparse index
update-index: add tests for sparse-checkout compatibility
checkout-index: integrate with sparse index
checkout-index: add --ignore-skip-worktree-bits option
checkout-index: expand sparse checkout compatibility tests
clean: integrate with sparse index
reset: reorder wildcard pathspec conditions
reset: fix validation in sparse index test
"git log" and friends learned an option --exclude-first-parent-only
to propagate UNINTERESTING bit down only along the first-parent
chain, just like --first-parent option shows commits that lack the
UNINTERESTING bit only along the first-parent chain.
* jz/rev-list-exclude-first-parent-only:
git-rev-list: add --exclude-first-parent-only flag
Junio C Hamano [Fri, 18 Feb 2022 00:25:04 +0000 (16:25 -0800)]
Merge branch 'tk/subtree-merge-not-ff-only'
When "git subtree" wants to create a merge, it used "git merge" and
let it be affected by end-user's "merge.ff" configuration, which
has been corrected.
* tk/subtree-merge-not-ff-only:
subtree: force merge commit
When fetching with the `--prune` flag we will delete any local
references matching the fetch refspec which have disappeared on the
remote. This step is not currently covered by the `--atomic` flag: we
delete branches even though updating of local references has failed,
which means that the fetch is not an all-or-nothing operation.
Fix this bug by passing in the global transaction into `prune_refs()`:
if one is given, then we'll only queue up deletions and not commit them
right away.
This change also improves performance when pruning many branches in a
repository with a big packed-refs file: every reference is currently
pruned in its own transaction, which means that we potentially have to
rewrite the packed-refs file for every single reference we're about to
prune.
The following benchmark demonstrates this: it performs a pruning fetch
from a repository with a single reference into a repository with 100k
references, which causes us to prune all but one reference. This is of
course a very artificial setup, but serves to demonstrate the impact of
only having to write the packed-refs file once:
Benchmark 1: git fetch --prune --atomic +refs/*:refs/* (HEAD~)
Time (mean ± σ): 2.366 s ± 0.021 s [User: 0.858 s, System: 1.508 s]
Range (min … max): 2.328 s … 2.407 s 10 runs
Benchmark 2: git fetch --prune --atomic +refs/*:refs/* (HEAD)
Time (mean ± σ): 1.369 s ± 0.017 s [User: 0.715 s, System: 0.641 s]
Range (min … max): 1.346 s … 1.400 s 10 runs
Summary
'git fetch --prune --atomic +refs/*:refs/* (HEAD)' ran
1.73 ± 0.03 times faster than 'git fetch --prune --atomic +refs/*:refs/* (HEAD~)'
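The difference between the two code paths can be modelled with a short Python sketch. It is a simplification under stated assumptions: `RefStore`, `prune_refs` and the plain list used as a transaction are hypothetical stand-ins, and one commit is assumed to cost exactly one rewrite of the packed-refs file.

```python
from typing import List, Optional, Set


class RefStore:
    """Toy ref store that counts how often packed-refs would be rewritten."""

    def __init__(self, refs: Set[str]):
        self.refs = set(refs)
        self.packed_rewrites = 0

    def commit(self, deletions: List[str]) -> None:
        # One committed transaction rewrites the packed-refs file once.
        self.refs -= set(deletions)
        self.packed_rewrites += 1


def prune_refs(store: RefStore, stale: List[str],
               transaction: Optional[List[str]] = None) -> None:
    if transaction is not None:
        # Shared transaction: only queue deletions, the caller commits.
        transaction.extend(stale)
        return
    for ref in stale:
        # No shared transaction: each deletion is committed on its own.
        store.commit([ref])
```

In this model, pruning N references without a shared transaction costs N rewrites, while queueing them into one transaction costs a single rewrite once the caller commits.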
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: make `--atomic` flag cover backfilling of tags
When fetching references from a remote we by default also fetch all tags
which point into the history we have fetched. This is a separate step
performed after updating local references because it requires us to walk
over the history on the client-side to determine whether the remote has
announced any tags which point to one of the fetched commits.
This backfilling of tags isn't covered by the `--atomic` flag: right
now, it only applies to the step where we update our local references.
This was an oversight at the time the flag was introduced: its purpose is
to either update all references or none, but right now we happily update
local references even in the case where backfilling failed.
Fix this by pulling up creation of the reference transaction such that
we can pass the same transaction to both the code which updates local
references and to the code which backfills tags. This allows us to only
commit the transaction in case both actions succeed.
Note that we also have to start passing the transaction into
`find_non_local_tags()`: this function is responsible for finding all
tags which we need to backfill. Right now, it will happily return tags
which have already been updated with our local references. But when we
use a single transaction for both local references and backfilling then
it may happen that we try to queue the same reference update twice to
the transaction, which consequently triggers a bug. We thus have to skip
over any tags which have already been queued.
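A minimal Python sketch of the shared transaction may help to picture this. It is not the actual fetch code: `Transaction`, `atomic_fetch` and the `backfill_ok` switch (which simulates a failing backfill) are hypothetical and exist only for illustration.

```python
from typing import Dict


class Transaction:
    """Toy transaction that rejects a second update for the same ref."""

    def __init__(self) -> None:
        self.updates: Dict[str, str] = {}

    def add_update(self, refname: str, oid: str) -> None:
        if refname in self.updates:
            raise RuntimeError(f"duplicate update queued for {refname}")
        self.updates[refname] = oid


def atomic_fetch(local: Dict[str, str], fetched: Dict[str, str],
                 tags: Dict[str, str], backfill_ok: bool = True) -> Dict[str, str]:
    """Return the new ref state; the old state is returned on failure."""
    txn = Transaction()
    for refname, oid in fetched.items():
        txn.add_update(refname, oid)
    for refname, oid in tags.items():
        if refname in txn.updates:
            continue  # already queued by the first step, skip it
        txn.add_update(refname, oid)
    if not backfill_ok:
        return dict(local)  # abort: nothing is committed
    result = dict(local)
    result.update(txn.updates)  # commit both steps at once
    return result
```

Without the `continue`, a tag that was already updated together with the local references would be queued twice and trip the duplicate check.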
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: add interface to iterate over queued transactional updates
There is no way for a caller to see whether a reference update has
already been queued up for a given reference transaction. There are
multiple alternatives to provide this functionality:
- We may add a function that simply tells us whether a specific
reference has already been queued. If implemented naively then
this would potentially be quadratic in runtime behaviour if this
question is asked repeatedly because we have to iterate over all
references every time. The alternative would be to add a hashmap
of all queued reference updates to speed up the lookup, but this
adds overhead to all callers.
- We may add a flag to `ref_transaction_add_update()` that causes it
to skip duplicates, but this has the same runtime concerns as the
first alternative.
- We may add an interface which lets callers collect all updates
which have already been queued such that they can avoid re-adding
them. This is the most flexible approach and puts the burden on
the caller, but also allows us to not impact any of the existing
callsites which don't need this information.
This commit implements the last approach: it allows us to compute the
map of already-queued updates once up front such that we can then skip
all subsequent references which are already part of this map.
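How a caller might use such an interface is sketched below in Python. The names `for_each_queued_update` and `queue_missing`, as well as the dictionary layout of an update, are assumptions for this example and are not taken from git's refs API.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def for_each_queued_update(updates: List[Dict[str, str]]) -> Iterator[str]:
    """Hypothetical iteration interface: yield each queued refname."""
    for update in updates:
        yield update["refname"]


def queue_missing(updates: List[Dict[str, str]],
                  candidates: Iterable[Tuple[str, str]]) -> int:
    """Queue candidates that are not queued yet; return how many were added."""
    queued = set(for_each_queued_update(updates))  # built once, up front
    added = 0
    for refname, oid in candidates:
        if refname in queued:
            continue
        updates.append({"refname": refname, "new_oid": oid})
        queued.add(refname)
        added += 1
    return added
```

Because the set is built once, each membership test is a constant-time lookup on average, so the caller avoids the quadratic behaviour of the first alternative while other callsites pay nothing.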
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the backfilling of tags fails we do not report this error to the
caller, but only report it implicitly at a later point when reporting
updated references. This leaves callers unable to act upon the
information of whether the backfilling succeeded or not.
Refactor the function to return an error code and pass it up the
callstack. This causes us to correctly propagate the error back to the
user of git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: control lifecycle of FETCH_HEAD in a single place
There are two different locations where we're appending to FETCH_HEAD:
first when storing updated references, and second when backfilling tags.
Both times we open the file, append to it and then commit it into place,
which is essentially duplicate work.
Improve the lifecycle of updating FETCH_HEAD by opening and committing
it once in `do_fetch()`, where we pass the structure down to the code
which wants to append to it.
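The resulting lifecycle can be pictured with the following Python sketch. It is a toy model only: the `FetchHead` class and the line format are made up for this example, and the three functions merely borrow their roles from the description above.

```python
from typing import List


class FetchHead:
    """Toy stand-in for the FETCH_HEAD file handle."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.committed: List[str] = []
        self.commits = 0

    def append(self, line: str) -> None:
        self.pending.append(line)

    def commit(self) -> None:
        # Move everything into place in a single step.
        self.committed = list(self.pending)
        self.commits += 1


def store_updated_refs(fetch_head: FetchHead, refs: List[str]) -> None:
    for ref in refs:
        fetch_head.append(f"ref {ref}")


def backfill_tags(fetch_head: FetchHead, tags: List[str]) -> None:
    for tag in tags:
        fetch_head.append(f"tag {tag}")


def do_fetch(refs: List[str], tags: List[str]) -> FetchHead:
    fetch_head = FetchHead()              # opened once
    store_updated_refs(fetch_head, refs)  # both steps append to the same handle
    backfill_tags(fetch_head, tags)
    fetch_head.commit()                   # committed once
    return fetch_head
```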
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch code flow is a bit hard to understand right now:
1. We optionally prune all references which have vanished on the
remote side.
2. We fetch and update all other references locally.
3. We update the upstream branch in the gitconfig.
4. We backfill tags pointing into the history we have just fetched.
It is quite confusing that we fetch objects and update references in
both (2) and (4), which is further stressed by the point that we use a
`skip` goto label to jump from (3) to (4) in case we fail to update the
gitconfig as expected.
Reorder the code to first update all local references, and only after we
have done so update the upstream branch information. This improves the
code flow and furthermore makes it easier to refactor the way we update
references together.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git-fetch(1) with the `--atomic` flag the expectation is that
either all of the references are updated, or alternatively none are in
case the fetch fails. While we already have tests for this, we do not
have any tests which exercise atomicity either when pruning deleted refs
or when backfilling tags. This gap in test coverage hides that we indeed
don't handle atomicity correctly for both of these cases.
Add test cases which cover these testing gaps to demonstrate the broken
behaviour. Note that tests are not marked as `test_expect_failure`: this
is done to explicitly demonstrate the current known-wrong behaviour, and
they will be fixed up as soon as we fix the underlying bugs.
While at it this commit also adds another test case which demonstrates
that backfilling of tags does not return an error code in case the
backfill fails. This bug will also be fixed by a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 16 Feb 2022 23:14:30 +0000 (15:14 -0800)]
Merge branch 'js/no-more-legacy-stash'
Removal of unused code and doc.
* js/no-more-legacy-stash:
stash: stop warning about the obsolete `stash.useBuiltin` config setting
stash: remove documentation for `stash.useBuiltin`
add: remove support for `git-legacy-stash`
git-sh-setup: remove remnant bits referring to `git-legacy-stash`
Junio C Hamano [Wed, 16 Feb 2022 23:14:30 +0000 (15:14 -0800)]
Merge branch 'js/diff-filter-negation-fix'
"git diff --diff-filter=aR" is now parsed correctly.
* js/diff-filter-negation-fix:
diff-filter: be more careful when looking for negative bits
diff.c: move the diff filter bits definitions up a bit
docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
Junio C Hamano [Wed, 16 Feb 2022 23:14:30 +0000 (15:14 -0800)]
Merge branch 'en/fetch-negotiation-default-fix'
Interaction between fetch.negotiationAlgorithm and
feature.experimental configuration variables has been corrected.
* en/fetch-negotiation-default-fix:
repo-settings: rename the traditional default fetch.negotiationAlgorithm
repo-settings: fix error handling for unknown values
repo-settings: fix checking for fetch.negotiationAlgorithm=default
Junio C Hamano [Wed, 16 Feb 2022 23:14:29 +0000 (15:14 -0800)]
Merge branch 'en/remerge-diff'
"git log --remerge-diff" shows the difference from mechanical merge
result and the result that is actually recorded in a merge commit.
* en/remerge-diff:
diff-merges: avoid history simplifications when diffing merges
merge-ort: mark conflict/warning messages from inner merges as omittable
show, log: include conflict/warning messages in --remerge-diff headers
diff: add ability to insert additional headers for paths
merge-ort: format messages slightly different for use in headers
merge-ort: mark a few more conflict messages as omittable
merge-ort: capture and print ll-merge warnings in our preferred fashion
ll-merge: make callers responsible for showing warnings
log: clean unneeded objects during `log --remerge-diff`
show, log: provide a --remerge-diff capability
Junio C Hamano [Wed, 16 Feb 2022 23:14:27 +0000 (15:14 -0800)]
Merge branch 'hn/reftable-coverity-fixes'
Problems identified by Coverity in the reftable code have been
corrected.
* hn/reftable-coverity-fixes:
reftable: add print functions to the record types
reftable: make reftable_record a tagged union
reftable: remove outdated file reftable.c
reftable: implement record equality generically
reftable: make reftable-record.h function signatures const correct
reftable: handle null refnames in reftable_ref_record_equal
reftable: drop stray printf in readwrite_test
reftable: order unittests by complexity
reftable: all xxx_free() functions accept NULL arguments
reftable: fix resource warning
reftable: ignore remove() return value in stack_test.c
reftable: check reftable_stack_auto_compact() return value
reftable: fix resource leak blocksource.c
reftable: fix resource leak in block.c error path
reftable: fix OOB stack write in print functions
Junio C Hamano [Sat, 12 Feb 2022 00:56:01 +0000 (16:56 -0800)]
Merge branch 'tg/fetch-prune-exit-code-fix'
When "git fetch --prune" failed to prune the refs it wanted to
prune, the command issued error messages but exited with exit
status 0, which has been corrected.
* tg/fetch-prune-exit-code-fix:
fetch --prune: exit with error if pruning fails
Junio C Hamano [Sat, 12 Feb 2022 00:55:58 +0000 (16:55 -0800)]
Merge branch 'jc/doc-log-messages'
Update the contributor-facing documents on proposed log messages.
* jc/doc-log-messages:
SubmittingPatches: explain why we care about log messages
CodingGuidelines: hint why we value clearly written log messages
SubmittingPatches: write problem statement in the log in the present tense
* ab/no-errno-from-resolve-ref-unsafe:
refs API: remove "failure_errno" from refs_resolve_ref_unsafe()
sequencer: don't use die_errno() on refs_resolve_ref_unsafe() failure
Junio C Hamano [Thu, 10 Feb 2022 02:19:07 +0000 (18:19 -0800)]
glossary: describe "worktree"
We have a description of "per worktree ref", but "worktree" is not
described in the glossary. We do have "working tree", though.
Casually put, a "working tree" is what your editor and compiler
interact with. A "worktree" is a mechanism to allow one or more
"working tree"s to be attached to a repository and used to check out
different commits and branches independently, which includes not
just a "working tree" but also repository metadata like HEAD and the
index to support simultaneous use of them. Historically, we used
these terms interchangeably but we have been trying to use "working
tree" when we mean it, instead of "worktree".
Most of the existing references to "working tree" in the glossary do
refer primarily to the working tree portion, except for one that
said refs like HEAD and refs/bisect/* are per "working tree", but it
is more precise to say they are per "worktree".
Junio C Hamano [Wed, 9 Feb 2022 22:21:01 +0000 (14:21 -0800)]
Merge branch 'js/sparse-vs-split-index'
Mark in various places in the code that the sparse index and the
split index features are mutually incompatible.
* js/sparse-vs-split-index:
split-index: it really is incompatible with the sparse index
t1091: disable split index
sparse-index: sparse index is disallowed when split index is active
Junio C Hamano [Wed, 9 Feb 2022 22:21:01 +0000 (14:21 -0800)]
Merge branch 'jt/clone-not-quite-empty'
Cloning from a repository that does not yet have any branches or
tags but has other refs resulted in a "remote transport reported
error", which has been corrected.
* jt/clone-not-quite-empty:
clone: support unusual remote ref configurations
Junio C Hamano [Wed, 9 Feb 2022 22:21:00 +0000 (14:21 -0800)]
Merge branch 'jt/sparse-checkout-leading-dir-fix'
"git sparse-checkout init" failed to write into $GIT_DIR/info
directory when the repository was created without one, which has
been corrected to auto-create it.
* jt/sparse-checkout-leading-dir-fix:
sparse-checkout: create leading directory
Junio C Hamano [Wed, 9 Feb 2022 22:21:00 +0000 (14:21 -0800)]
Merge branch 'ab/config-based-hooks-2'
More "config-based hooks".
* ab/config-based-hooks-2:
run-command: remove old run_hook_{le,ve}() hook API
receive-pack: convert push-to-checkout hook to hook.h
read-cache: convert post-index-change to use hook.h
commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
git-p4: use 'git hook' to run hooks
send-email: use 'git hook run' for 'sendemail-validate'
git hook run: add an --ignore-missing flag
hooks: convert worktree 'post-checkout' hook to hook library
hooks: convert non-worktree 'post-checkout' hook to hook library
merge: convert post-merge to use hook.h
am: convert applypatch-msg to use hook.h
rebase: convert pre-rebase to use hook.h
hook API: add a run_hooks_l() wrapper
am: convert {pre,post}-applypatch to use hook.h
gc: use hook library for pre-auto-gc hook
hook API: add a run_hooks() wrapper
hook: add 'run' subcommand
"git fetch --negotiate-only" is an internal command used by "git
push" to figure out which part of our history is missing from the
other side. It should never recurse into submodules even when
fetch.recursesubmodules configuration variable is set, nor should
it trigger "gc". The code has been tightened up to ensure it
only does common ancestry discovery and nothing else.
* gc/fetch-negotiate-only-early-return:
fetch: help translators by reusing the same message template
fetch --negotiate-only: do not update submodules
fetch: skip tasks related to fetching objects
fetch: use goto cleanup in cmd_fetch()
Junio C Hamano [Wed, 9 Feb 2022 22:20:59 +0000 (14:20 -0800)]
Merge branch 'tl/doc-cli-options-first'
We explain that revs come first before the pathspec among command
line arguments, but did not spell out that dashed options come
before other args, which has been corrected.
* tl/doc-cli-options-first:
git-cli.txt: clarify "options first and then args"
The conditional inclusion mechanism of configuration files using
"[includeIf <condition>]" learns to base its decision on the
URL of the remote repository the repository interacts with.
* jt/conditional-config-on-remote-url:
config: include file if remote URL matches a glob
config: make git_config_include() static
Taylor Blau [Wed, 9 Feb 2022 19:26:47 +0000 (14:26 -0500)]
midx: prevent writing a .bitmap without any objects
When trying to write a MIDX, we already prevent the case where there
weren't any packs present, and thus we would have written an empty MIDX.
But there is another "empty" case, which is more interesting, and we
don't yet handle. If we try to write a MIDX which has at least one pack,
but those packs together don't contain any objects, we will encounter a
BUG() when trying to use the bitmap corresponding to that MIDX, like so:
$ git rev-parse HEAD | git pack-objects --revs --use-bitmap-index --stdout >/dev/null
BUG: pack-revindex.c:394: pack_pos_to_midx: out-of-bounds object at 0
(note that in the above reproduction, both `--use-bitmap-index` and
`--stdout` are important, since without the former we won't even bother to
load the .bitmap, and without the latter we won't attempt pack reuse).
The problem occurs when we try to discover the identity of the
preferred pack to determine which range, if any, of existing packs we can
reuse verbatim. This path is: `reuse_packfile_objects()` ->
`reuse_partial_packfile_from_bitmap()` -> `midx_preferred_pack()`.
#4 0x000055555575401f in pack_pos_to_midx (m=0x555555997160, pos=0) at pack-revindex.c:394
#5 0x00005555557502c8 in midx_preferred_pack (bitmap_git=0x55555599c280) at pack-bitmap.c:1431
#6 0x000055555575036c in reuse_partial_packfile_from_bitmap (bitmap_git=0x55555599c280, packfile_out=0x5555559666b0 <reuse_packfile>,
entries=0x5555559666b8 <reuse_packfile_objects>, reuse_out=0x5555559666c0 <reuse_packfile_bitmap>) at pack-bitmap.c:1452
#7 0x00005555556041f6 in get_object_list_from_bitmap (revs=0x7fffffffcbf0) at builtin/pack-objects.c:3658
#8 0x000055555560465c in get_object_list (ac=2, av=0x555555997050) at builtin/pack-objects.c:3765
#9 0x0000555555605e4e in cmd_pack_objects (argc=0, argv=0x7fffffffe920, prefix=0x0) at builtin/pack-objects.c:4154
Since neither the .bitmap nor the MIDX stores the identity of the
preferred pack, we infer it by trying to load the first object in
pseudo-pack order, and then asking the MIDX which pack was chosen to
represent that object.
But this fails our bounds check, since there are zero objects in the
MIDX to begin with, which results in the BUG().
We could catch this more carefully in `midx_preferred_pack()`, but
signaling the absence of a preferred pack out to all of its callers is
somewhat awkward.
Instead, let's avoid writing a MIDX .bitmap without any objects
altogether. We catch this case in `write_midx_internal()`, and emit a
warning if the caller indicated they wanted to write a bitmap before
clearing out the relevant flags. If we somehow got to
write_midx_bitmap(), then we will call BUG(), but this should now be an
unreachable path.
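The guard described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in midx.c: the flag value, both function names and the message texts are hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical flag value, used by this sketch only.
WRITE_BITMAP = 1 << 0


def midx_flags_for_write(object_count: int,
                         flags: int) -> Tuple[int, Optional[str]]:
    """Drop the bitmap request for an object-less MIDX and report a warning."""
    if object_count == 0 and flags & WRITE_BITMAP:
        warning = "refusing to write a .bitmap without any objects"
        return flags & ~WRITE_BITMAP, warning
    return flags, None


def write_midx_bitmap(object_count: int) -> str:
    """Second line of defence: should be unreachable without objects."""
    if object_count == 0:
        raise AssertionError("BUG: bitmap requested for a MIDX without objects")
    return f"bitmap covering {object_count} objects"
```

Clearing the flag early keeps the check in one place, so readers of the bitmap never have to deal with a missing preferred pack.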
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: handle unusual characters for sparse-checkout
Update the __gitcomp_directories method to de-quote and handle unusual
characters in directory names. Although this initially involved an attempt
to re-use the logic in __git_index_files, this method removed
subdirectories (e.g. folder1/0/ became folder1/), so instead new custom
logic was placed directly in the __gitcomp_directories method.
Note there are two tests for this new functionality - one for spaces and
accents and one for backslashes and tabs. The backslashes and tabs test
uses FUNNYNAMES to avoid running on Windows. This is because:
1. Backslashes are explicitly not allowed in Windows file paths.
2. Although tabs appear to be allowed when creating a file in a Windows
bash shell, they actually are not renderable (and appear as empty boxes
in the shell).
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use new __gitcomp_directories method to complete directory names in cone
mode sparse-checkouts. This method addresses the caveat of poor
performance in monorepos from the previous commit (by completing only one
level of directories).
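What completing only one level of directories means can be shown with a small Python sketch. The real completion is a bash function; the function `complete_one_level` below and its list-of-paths input are assumptions made for illustration only.

```python
from typing import List


def complete_one_level(tracked_paths: List[str], current: str) -> List[str]:
    """Return directory candidates exactly one level below the typed prefix."""
    # Everything up to and including the last slash is the directory being completed in.
    base = current.rsplit("/", 1)[0] + "/" if "/" in current else ""
    found = set()
    for path in tracked_paths:
        if not path.startswith(current):
            continue
        rest = path[len(base):]
        if "/" not in rest:
            continue  # a file directly inside base, not a directory
        found.add(base + rest.split("/", 1)[0] + "/")
    return sorted(found)
```

Completing `folder1/` in this model offers `folder1/0/` and `folder1/1/` but never descends further, so the work per completion stays bounded by one directory level.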
The unusual character caveat from the previous commit will be fixed by the
final commit in this series.
Correct multiple issues with tab completion of the git sparse-checkout
command. These issues were:
1. git sparse-checkout <TAB> previously resulted in an incomplete list of
subcommands (it was missing reapply and add).
2. Subcommand options were not tab-completable.
3. git sparse-checkout set <TAB> and git sparse-checkout add <TAB> showed
both file names and directory names. While this may be a less surprising
behavior for non-cone mode, cone mode sparse checkouts should complete
only directory names.
Note that while the new strategy of just using git ls-tree to complete on
directory names is simple and a step in the right direction, it does have
some caveats. These are:
1. Likelihood of poor performance in large monorepos (as a result of
recursively completing directory names).
2. Inability to handle paths containing unusual characters.
These caveats will be fixed by subsequent commits in this series.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0012: verify that built-ins handle `-h` even without gitdir
We just fixed a class of recently introduced bugs where calling, say,
`git fetch -h` outside a repository would not show the usage but instead
show an ugly `BUG` message.
Let's verify that this does not regress anymore.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout/fetch/pull/pack-objects: allow `-h` outside a repository
When we taught these commands about the sparse index, we did not account
for the fact that the `cmd_*()` functions _can_ be called without a
gitdir, namely when `-h` is passed to show the usage.
A plausible approach to address this is to move the
`prepare_repo_settings()` calls right after the `parse_options()` calls:
The latter will never return when it handles `-h`, and therefore it is
safe to assume that we have a `gitdir` at that point, as long as the
built-in is marked with the `RUN_SETUP` flag.
However, it is unfortunately not that simple. In `cmd_pack_objects()`,
for example, the repo settings need to be fully populated so that the
command-line options `--sparse`/`--no-sparse` can override them, not the
other way round.
Therefore, we choose to imitate the strategy taken in `cmd_diff()`,
where we simply do not bother to prepare and initialize the repo
settings unless we have a `gitdir`.
This fixes https://github.com/git-for-windows/git/issues/3688
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote & transport API: release "struct transport_ls_refs_options"
Fix a memory leak in codepaths that use the "struct
transport_ls_refs_options" API. Since the introduction of the struct
in 39835409d10 (connect, transport: encapsulate arg in struct,
2021-02-05) the caller has been responsible for freeing it.
That commit in turn migrated code originally added in 402c47d9391 (clone: send ref-prefixes when using protocol v2,
2018-07-20) and b4be74105fe (ls-remote: pass ref prefixes when
requesting a remote's refs, 2018-03-15). Only some of those codepaths
were releasing the allocated resources of the struct, now all of them
will.
Mark the "t/t5511-refspec.sh" test as passing when git is compiled
with SANITIZE=leak. It will now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target). Previously 24/47 tests would fail.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that happened when the --path option was
provided. This leak has been with us ever since the option was added
in 39702431500 (add --path option to git hash-object, 2008-08-03).
We can now mark "t1007-hash-object.sh" as passing when git is compiled
with SANITIZE=leak. It'll now run in the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 5 Feb 2022 17:42:32 +0000 (09:42 -0800)]
Merge branch 'ms/update-index-racy'
"git update-index --refresh" has been taught to deal better with
racy timestamps (just like "git status" already does).
* ms/update-index-racy:
update-index: refresh should rewrite index in case of racy timestamps
t7508: add tests capturing racy timestamp handling
t7508: fix bogus mtime verification
test-lib: introduce API for verifying file mtime
Junio C Hamano [Sat, 5 Feb 2022 17:42:31 +0000 (09:42 -0800)]
Merge branch 'ab/cat-file'
Assorted updates to "git cat-file", especially "-h".
* ab/cat-file:
cat-file: s/_/-/ in typo'd usage_msg_optf() message
cat-file: don't whitespace-pad "(...)" in SYNOPSIS and usage output
cat-file: use GET_OID_ONLY_TO_DIE in --(textconv|filters)
object-name.c: don't have GET_OID_ONLY_TO_DIE imply *_QUIETLY
cat-file: correct and improve usage information
cat-file: fix remaining usage bugs
cat-file: make --batch-all-objects a CMDMODE
cat-file: move "usage" variable to cmd_cat_file()
cat-file docs: fix SYNOPSIS and "-h" output
parse-options API: add a usage_msg_optf()
cat-file tests: test messaging on bad objects/paths
cat-file tests: test bad usage
Junio C Hamano [Sat, 5 Feb 2022 17:42:31 +0000 (09:42 -0800)]
Merge branch 'jc/qsort-s-alignment-fix'
Fix a hand-rolled alloca() imitation that may have violated
alignment requirement of data being sorted in compatibility
implementation of qsort_s() and stable qsort().
* jc/qsort-s-alignment-fix:
stable-qsort: avoid using potentially unaligned access
compat/qsort_s.c: avoid using potentially unaligned access
Junio C Hamano [Sat, 5 Feb 2022 17:42:30 +0000 (09:42 -0800)]
Merge branch 'rs/apply-symlinks-use-strset'
"git apply" (ab)used the util pointer of the string-list to keep
track of how each symbolic link needs to be handled, which has been
simplified by using strset.
* rs/apply-symlinks-use-strset:
apply: use strsets to track symlinks
Junio C Hamano [Sat, 5 Feb 2022 17:42:30 +0000 (09:42 -0800)]
Merge branch 'rs/grep-expr-cleanup'
Code clean-up.
* rs/grep-expr-cleanup:
grep: use grep_and_expr() in compile_pattern_and()
grep: extract grep_binexp() from grep_or_expr()
grep: use grep_not_expr() in compile_pattern_not()
grep: use grep_or_expr() in compile_pattern_or()
* jh/p4-spawning-external-commands-cleanup:
git-p4: don't print shell commands as python lists
git-p4: pass command arguments as lists instead of using shell
git-p4: don't select shell mode using the type of the command argument
Junio C Hamano [Sat, 5 Feb 2022 17:42:28 +0000 (09:42 -0800)]
Merge branch 'pb/pull-rebase-autostash-fix'
"git pull --rebase" ignored the rebase.autostash configuration
variable when the remote history is a descendant of our history,
which has been corrected.
* pb/pull-rebase-autostash-fix:
pull --rebase: honor rebase.autostash when fast-forwarding
t0051: use "skip_all" under !MINGW in single-test file
Have this file, added in 06ba9d03e34 (t0051: test GIT_TRACE to a
windows named pipe, 2018-09-11), use the same "skip_all" pattern as an
existing Windows-only test added in 0e218f91c29 (mingw: unset PERL5LIB
by default, 2018-10-30).
This way TAP consumers like "prove" will show a nice summary when the
test is skipped. Instead of:
$ prove t0051-windows-named-pipe.sh
[...]
t0051-windows-named-pipe.sh .. ok
[...]
This is because we are now making use of the right TAP-y way to
communicate this to the consumer. I.e. skipping the whole test file,
vs. skipping individual tests (in this case there's only one test).
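In TAP terms, the difference is between marking an individual test as skipped and announcing an empty plan that carries a skip reason. The standalone sketch below prints both forms. The function names and the example reason are invented for illustration, and the sketch does not use git's test-lib.sh.

```shell
# Skipping one test: the file still reports a plan of one test, which is
# marked as skipped. A harness then summarizes the file as plain "ok".
skip_single_test () {
	printf 'ok 1 # skip %s\n' "$1"
	printf '1..1\n'
}

# Skipping the whole file: an empty plan with a SKIP directive lets the
# harness report the file itself as skipped, along with the reason.
skip_whole_file () {
	printf '1..0 # SKIP %s\n' "$1"
}

# Example reason, made up for this sketch.
skip_whole_file 'not running on Windows'
```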
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Glen Choo [Sat, 29 Jan 2022 00:04:45 +0000 (16:04 -0800)]
branch: add --recurse-submodules option for branch creation
To improve the submodules UX, we would like to teach Git to handle
branches in submodules. Start this process by teaching "git branch" the
--recurse-submodules option so that "git branch --recurse-submodules
topic" will create the `topic` branch in the superproject and its
submodules.
Although this commit does not introduce breaking changes, it does not
work well with existing --recurse-submodules commands because "git
branch --recurse-submodules" writes to the submodule ref store, but most
commands only consider the superproject gitlink and ignore the submodule
ref store. For example, "git checkout --recurse-submodules" will check
out the commits in the superproject gitlinks (and put the submodules in
detached HEAD) instead of checking out the submodule branches.
Because of this, this commit introduces a new configuration value,
`submodule.propagateBranches`. The plan is for Git commands to
prioritize submodule ref store information over superproject gitlinks if
this value is true. Because "git branch --recurse-submodules" writes to
submodule ref stores, for the sake of clarity, it will not function
unless this configuration value is set.
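In practice this means the configuration value has to be enabled before the new option can be used. A minimal sketch, wrapped in a hypothetical helper create_branch_everywhere so that nothing runs until it is called from inside a superproject:

```shell
# Hypothetical helper: create branch "$1" in the superproject and in its
# submodules. Intended to be run from a superproject working tree.
create_branch_everywhere () {
	# "git branch --recurse-submodules" does not function unless
	# submodule.propagateBranches is set.
	git config submodule.propagateBranches true &&
	git branch --recurse-submodules "$1"
}
```

Calling "create_branch_everywhere topic" would then correspond to the "git branch --recurse-submodules topic" invocation described above.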
This commit also includes changes that support working with submodules
from a superproject commit because "branch --recurse-submodules" (and
future commands) need to read .gitmodules and gitlinks from the
superproject commit, but submodules are typically read from the
filesystem's .gitmodules and the index's gitlinks. These changes are:
* add a submodules_of_tree() helper that gives the relevant
information of an in-tree submodule (e.g. path and oid) and
initializes the repository
* add is_tree_submodule_active() by adding a treeish_name parameter to
is_submodule_active()
* add the "submoduleNotUpdated" advice to advise users to update the
submodules in their trees
Incidentally, fix an incorrect usage string that combined the 'list'
usage of git branch (-l) with the 'create' usage; this string has been
incorrect since its inception, a8dfd5eac4 (Make builtin-branch.c use
parse_options., 2007-10-07).
Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Glen Choo <chooglen@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: add a GIT_COMPLETION_SHOW_ALL_COMMANDS
Add a GIT_COMPLETION_SHOW_ALL_COMMANDS=1 configuration setting to go
with the existing GIT_COMPLETION_SHOW_ALL=1 added in c099f579b98 (completion: add GIT_COMPLETION_SHOW_ALL env var,
2020-08-19).
This will include plumbing commands such as "cat-file" in "git <TAB>"
and "git c<TAB>" completion. Without/with this I have 134 and 243
completions with git <TAB>, respectively.
It was already possible to do this by tweaking
GIT_TESTING_PORCELAIN_COMMAND_LIST= from the outside; that testing
variable was added in 84a97131065 (completion: let git provide the
completable command list, 2018-05-20). Doing this before loading
git-completion.bash worked:
But such testing variables are not meant to be used from the outside,
and we make no guarantees that those internals won't change. So let's
expose this as a dedicated configuration knob.
It would be better to teach --list-cmds=* a new category which would
include all of these groups, but that's a larger change that we can
leave for some other time.
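For illustration, opting in from a shell startup file might look like the fragment below. Where the fragment lives (for example ~/.bashrc) is an assumption and depends on the local setup.

```shell
# Opt in to completing all commands, including plumbing commands such as
# "cat-file", on "git <TAB>". Intended to sit alongside whatever loads
# git-completion.bash in the interactive shell's startup file.
GIT_COMPLETION_SHOW_ALL_COMMANDS=1
```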
completion tests: re-source git-completion.bash in a subshell
Change tests of git-completion.bash that re-source it to do so inside
a subshell. Re-sourcing it will clobber variables it sets, and in the
case of the "GIT_COMPLETION_SHOW_ALL=1" test added in ca2d62b7879 (parse-options: don't complete option aliases by default,
2021-07-16) change the behavior of the completion persistently.
Aside from the addition of "(" and ")" on new lines this is an
indentation-only change, only the "(" and ")" lines are changed under
"git diff -w".
So let's change that test, and for good measure do the same for the
three tests that precede it, which were added in 8b0eaa41f23 (completion: clear cached --options when sourcing the
completion script, 2018-03-22). They may not be wrong, but doing this
establishes a more reliable pattern for future tests, which might use
these as a template to copy.
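The effect of the subshell can be seen in a small standalone sketch. Here a temporary file stands in for git-completion.bash and the variable name demo_cache is made up for the example. Sourcing inside "(" and ")" keeps the clobbered value from leaking into the rest of the script.

```shell
# A temporary file stands in for a script that overwrites a variable
# whenever it is sourced (a hypothetical stand-in for git-completion.bash).
script=$(mktemp)
echo 'demo_cache=clobbered' >"$script"

demo_cache=original

(
	# Inside the subshell the sourced script takes effect as usual...
	. "$script"
	echo "inside subshell: $demo_cache"
)

# ...but the parent shell's state is untouched once the subshell exits.
echo "after subshell: $demo_cache"

rm -f "$script"
```

Without the parentheses, the second echo would report the clobbered value, which is the kind of persistent change the tests are meant to avoid.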
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shaoxuan Yuan [Wed, 2 Feb 2022 06:43:00 +0000 (14:43 +0800)]
t/lib-read-tree-m-3way: indent with tabs
As Documentation/CodingGuidelines says, our shell scripts
(including tests) are to use HT for indentation, but this script
uses 4-column indent with SP. Fix this.
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>