bundle API: change "flags" to be "extra_index_pack_args"
Since the "flags" parameter was added in be042aff24c (Teach progress
eye-candy to fetch_refs_from_bundle(), 2011-09-18) there's never been
more than the one flag: BUNDLE_VERBOSE.
Let's have the only caller who cares about that pass "-v" itself
instead through new "extra_index_pack_args" parameter. The flexibility
of being able to pass arbitrary arguments to "unbundle" will be used
in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no other API docs in bundle.h, but this is at least a
start. We'll add a parameter to this function in a subsequent commit,
but let's start by documenting it.
The "/**" comment (as opposed to "/*") signifies the start of API
documentation. See [1] and bdfdaa4978d (strbuf.h: integrate
api-strbuf.txt documentation, 2015-01-16) and 6afbbdda333 (strbuf.h:
unify documentation comments beginnings, 2015-01-16) for a discussion
of that convention.
Junio C Hamano [Tue, 24 Aug 2021 22:32:41 +0000 (15:32 -0700)]
Merge branch 'ps/fetch-pack-load-refs-optim'
Loading of ref tips to prepare for common ancestry negotiation in
"git fetch-pack" has been optimized by taking advantage of the
commit graph when available.
* ps/fetch-pack-load-refs-optim:
fetch-pack: speed up loading of refs via commit graph
Junio C Hamano [Tue, 24 Aug 2021 22:32:40 +0000 (15:32 -0700)]
Merge branch 'jt/push-negotiation-fixes'
Bugfix for common ancestor negotiation recently introduced in "git
push" code path.
* jt/push-negotiation-fixes:
fetch: die on invalid --negotiation-tip hash
send-pack: fix push nego. when remote has refs
send-pack: fix push.negotiate with remote helper
Junio C Hamano [Tue, 24 Aug 2021 22:32:39 +0000 (15:32 -0700)]
Merge branch 'hn/refs-test-cleanup'
A handful of tests that assumed implementation details of files
backend for refs have been cleaned up.
* hn/refs-test-cleanup:
t6001: avoid direct file system access
t6500: use "ls -1" to snapshot ref database state
t7064: use update-ref -d to remove upstream branch
t1410: mark test as REFFILES
t1405: mark test for 'git pack-refs' as REFFILES
t1405: use 'git reflog exists' to check reflog existence
t2402: use ref-store test helper to create broken symlink
t3320: use git-symbolic-ref rather than filesystem access
t6120: use git-update-ref rather than filesystem access
t1503: mark symlink test as REFFILES
t6050: use git-update-ref rather than filesystem access
Junio C Hamano [Tue, 24 Aug 2021 22:32:39 +0000 (15:32 -0700)]
Merge branch 'en/ort-perf-batch-15'
Final batch for "merge -sort" optimization.
* en/ort-perf-batch-15:
merge-ort: remove compile-time ability to turn off usage of memory pools
merge-ort: reuse path strings in pool_alloc_filespec
merge-ort: store filepairs and filespecs in our mem_pool
diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
merge-ort: switch our strmaps over to using memory pools
merge-ort: set up a memory pool
merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
diffcore-rename: use a mem_pool for exact rename detection's hashmap
merge-ort: rename str{map,intmap,set}_func()
Junio C Hamano [Tue, 24 Aug 2021 22:32:38 +0000 (15:32 -0700)]
Merge branch 'js/expand-runtime-prefix'
Pathname expansion (like "~username/") learned a way to specify a
location relative to Git installation (e.g. its $sharedir which is
$(prefix)/share), with "%(prefix)".
* js/expand-runtime-prefix:
expand_user_path: allow in-flight topics to keep using the old name
interpolate_path(): allow specifying paths relative to the runtime prefix
Use a better name for the function interpolating paths
expand_user_path(): clarify the role of the `real_home` parameter
expand_user_path(): remove stale part of the comment
tests: exercise the RUNTIME_PREFIX feature
Junio C Hamano [Tue, 24 Aug 2021 22:32:37 +0000 (15:32 -0700)]
Merge branch 'zh/ref-filter-raw-data'
Prepare the "ref-filter" machinery that drives the "--format"
option of "git for-each-ref" and its friends to be used in "git
cat-file --batch".
* zh/ref-filter-raw-data:
ref-filter: add %(rest) atom
ref-filter: use non-const ref_format in *_atom_parser()
ref-filter: --format=%(raw) support --perl
ref-filter: add %(raw) atom
ref-filter: add obj-type check in grab contents
Junio C Hamano [Tue, 24 Aug 2021 22:32:36 +0000 (15:32 -0700)]
Merge branch 'ab/http-drop-old-curl'
Support for ancient versions of cURL library (pre 7.19.4) has been
dropped.
* ab/http-drop-old-curl:
http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
http: drop support for curl < 7.19.3 and < 7.17.0 (again)
http: drop support for curl < 7.19.4
http: drop support for curl < 7.16.0
http: drop support for curl < 7.11.1
Junio C Hamano [Tue, 24 Aug 2021 22:32:35 +0000 (15:32 -0700)]
Merge branch 'ds/add-with-sparse-index'
"git add" can work better with the sparse index.
* ds/add-with-sparse-index:
add: remove ensure_full_index() with --renormalize
add: ignore outside the sparse-checkout in refresh()
pathspec: stop calling ensure_full_index
add: allow operating on a sparse-only index
t1092: test merge conflicts outside cone
Junio C Hamano [Tue, 24 Aug 2021 22:32:35 +0000 (15:32 -0700)]
Merge branch 'jc/bisect-sans-show-branch'
"git bisect" spawned "git show-branch" only to pretty-print the
title of the commit after checking out the next version to be
tested; this has been rewritten in C.
* jc/bisect-sans-show-branch:
bisect: simplify return code from bisect_checkout()
bisect: do not run show-branch just to show the current commit
René Scharfe [Sat, 14 Aug 2021 20:00:38 +0000 (22:00 +0200)]
oidtree: avoid unaligned access to crit-bit tree
The flexible array member "k" of struct cb_node is used to store the key
of the crit-bit tree node. It offers no alignment guarantees -- in fact
the current struct layout puts it one byte after a 4-byte aligned
address, i.e. guaranteed to be misaligned.
oidtree uses a struct object_id as cb_node key. Since cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) it requires 4-byte
alignment. The mismatch is reported by UndefinedBehaviorSanitizer at
runtime like this:
hash.h:277:2: runtime error: member access within misaligned address 0x00015000802d for type 'struct object_id', which requires 4 byte alignment
0x00015000802d: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hash.h:277:2 in
We can fix that by:
1. eliminating the alignment requirement of struct object_id,
2. providing the alignment in struct cb_node, or
3. avoiding the issue by only using memcpy to access "k".
Currently we only store one of two values in "algo" in struct object_id.
We could use a uint8_t for that instead and widen it only once we add
support for our twohundredth algorithm or so. That would not only avoid
alignment issues, but also reduce the memory requirements for each
instance of struct object_id by ca. 9%.
Supporting keys with alignment requirements might be useful to spread
the use of crit-bit trees. It can be achieved by using a wider type for
"k" (e.g. uintmax_t), using different types for the members "byte" and
"otherbits" (e.g. uint16_t or uint32_t for each), or by avoiding the use
of flexible arrays like khash.h does.
This patch implements the third option, though, because it has the least
potential for causing side-effects and we're close to the next release.
If one of the other options is implemented later as well to get their
additional benefits we can get rid of the extra copies introduced here.
Reported-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jiang Xin [Fri, 13 Aug 2021 23:56:22 +0000 (07:56 +0800)]
Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (51 commits)
Git 2.33-rc2
object-file: use unsigned arithmetic with bit mask
Revert 'diff-merges: let "-m" imply "-p"'
object-store: avoid extra ';' from KHASH_INIT
oidtree: avoid nested struct oidtree_node
Git 2.33-rc1
test: fix for COLUMNS and bash 5
The eighth batch
diff: --pickaxe-all typofix
mingw: align symlinks-related rmdir() behavior with Linux
t7508: avoid non POSIX BRE
use fspathhash() everywhere
t0001: fix broken not-quite getcwd(3) test in bed67874e2
Documentation: render special characters correctly
reset: clear_unpack_trees_porcelain to plug leak
builtin/rebase: fix options.strategy memory lifecycle
builtin/merge: free found_ref when done
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
convert: release strbuf to avoid leak
read-cache: call diff_setup_done to avoid leak
...
Junio C Hamano [Wed, 11 Aug 2021 19:36:17 +0000 (12:36 -0700)]
Merge branch 'cb/many-alternate-optim-fixup'
Build fix.
* cb/many-alternate-optim-fixup:
object-file: use unsigned arithmetic with bit mask
object-store: avoid extra ';' from KHASH_INIT
oidtree: avoid nested struct oidtree_node
René Scharfe [Fri, 6 Aug 2021 17:53:47 +0000 (19:53 +0200)]
object-file: use unsigned arithmetic with bit mask
33f379eee6 (make object_directory.loose_objects_subdir_seen a bitmap,
2021-07-07) replaced a wasteful 256-byte array with a 32-byte array
and bit operations. The mask calculation shifts a literal 1 of type
int left by anything between 0 and 31. UndefinedBehaviorSanitizer
doesn't like that and reports:
object-file.c:2477:18: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
Make sure to use an unsigned 1 instead to avoid the issue.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Nieder [Fri, 6 Aug 2021 01:45:23 +0000 (18:45 -0700)]
Revert 'diff-merges: let "-m" imply "-p"'
This reverts commit f5bfcc823ba242a46e20fb6f71c9fbf7ebb222fe, which
made "git log -m" imply "--patch" by default. The logic was that
"-m", which makes diff generation for merges perform a diff against
each parent, has no use unless I am viewing the diff, so we could save
the user some typing by turning on display of the resulting diff
automatically. That wasn't expected to adversely affect scripts
because scripts would either be using a command like "git diff-tree"
that already emits diffs by default or would be combining -m with a
diff generation option such as --name-status. By saving typing for
interactive use without adversely affecting scripts in the wild, it
would be a pure improvement.
The problem is that although diff generation options are only relevant
for the displayed diff, a script author can imagine them affecting
path limiting. For example, I might run
git log -w --format=%H -- README
hoping to list commits that edited README, excluding whitespace-only
changes. In fact, a whitespace-only change is not TREESAME so the use
of -w here has no effect (since we don't apply these diff generation
flags to the diff_options struct rev_info::pruning used for this
purpose), but the documentation suggests that it should work
Suppose you specified foo as the <paths>. We shall call
commits that modify foo !TREESAME, and the rest TREESAME. (In
a diff filtered for foo, they look different and equal,
respectively.)
and a script author who has not tested whitespace-only changes
wouldn't notice.
Similarly, a script author could include
git log -m --first-parent --format=%H -- README
to filter the first-parent history for commits that modified README.
The -m is a no-op but it reflects the script author's intent. For
example, until 1e20a407fe2 (stash list: stop passing "-m" to "git
log", 2021-05-21), "git stash list" did this.
As a result, we can't safely change "-m" to imply "-p" without fear of
breaking such scripts. Restore the previous behavior.
Noticed because Rust's src/bootstrap/bootstrap.py made use of this
same construct: https://github.com/rust-lang/rust/pull/87513. That
script has been updated to omit the unnecessary "-m" option, but we
can expect other scripts in the wild to have similar expectations.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07)
introduces a KHASH_INIT invocation with a trailing ';', which while
commonly expected will trigger warnings with pedantic on both
clang[-Wextra-semi] and gcc[-Wpedantic], because that macro has already
a semicolon and is meant to be invoked without one.
while fixing the macro would be a worthy solution (specially considering
this is a common recurring problem), remove the extra ';' for now to
minimize churn.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since c49a177bec (test-lib.sh: set COLUMNS=80 for --verbose
repeatability, 2021-06-29) multiple tests have been failing when using
bash 5 because checkwinsize is enabled by default, therefore COLUMNS is
reset using TIOCGWINSZ even for non-interactive shells.
It's debatable whether or not bash should even be doing that, but for
now we can avoid this undesirable behavior by disabling this option.
Reported-by: Fabian Stelzer <fabian.stelzer@campoint.net> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
[jc: with SZEDER Gábor's suggestion to do this before setting COLUMNS] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Windows rmdir() equivalent behaves differently from POSIX ones in
that when used on a symbolic link that points at a directory, the
target directory gets removed, which has been corrected.
* tb/mingw-rmdir-symlink-to-directory:
mingw: align symlinks-related rmdir() behavior with Linux
Junio C Hamano [Wed, 4 Aug 2021 20:28:54 +0000 (13:28 -0700)]
Merge branch 'en/ort-perf-batch-14'
Further optimization on "merge -sort" backend.
* en/ort-perf-batch-14:
merge-ort: restart merge with cached renames to reduce process entry cost
merge-ort: avoid recursing into directories when we don't need to
merge-ort: defer recursing into directories when merge base is matched
merge-ort: add a handle_deferred_entries() helper function
merge-ort: add data structures for allowable trivial directory resolves
merge-ort: add some more explanations in collect_merge_info_callback()
merge-ort: resolve paths early when we have sufficient information
Junio C Hamano [Wed, 4 Aug 2021 20:28:52 +0000 (13:28 -0700)]
Merge branch 'ah/plugleaks'
Leak plugging.
* ah/plugleaks:
reset: clear_unpack_trees_porcelain to plug leak
builtin/rebase: fix options.strategy memory lifecycle
builtin/merge: free found_ref when done
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
convert: release strbuf to avoid leak
read-cache: call diff_setup_done to avoid leak
ref-filter: also free head for ATOM_HEAD to avoid leak
diffcore-rename: move old_dir/new_dir definition to plug leak
builtin/for-each-repo: remove unnecessary argv copy to plug leak
builtin/submodule--helper: release unused strbuf to avoid leak
environment: move strbuf into block to plug leak
fmt-merge-msg: free newly allocated temporary strings when done
Junio C Hamano [Wed, 4 Aug 2021 20:28:52 +0000 (13:28 -0700)]
Merge branch 'ar/submodule-add'
Rewrite of "git submodule" in C continues.
* ar/submodule-add:
submodule: drop unused sm_name parameter from show_fetch_remotes()
submodule--helper: introduce add-clone subcommand
submodule--helper: refactor module_clone()
submodule: prefix die messages with 'fatal'
t7400: test failure to add submodule in tracked path
fetch-pack: speed up loading of refs via commit graph
When doing reference negotiation, git-fetch-pack(1) is loading all refs
from disk in order to determine which commits it has in common with the
remote repository. This can be quite expensive in repositories with many
references though: in a real-world repository with around 2.2 million
refs, fetching a single commit by its ID takes around 44 seconds.
Dominating the loading time is decompression and parsing of the objects
which are referenced by commits. Given the fact that we only care about
commits (or tags which can be peeled to one) in this context, there is
thus an easy performance win by switching the parsing logic to make use
of the commit graph in case we have one available. Like this, we avoid
hitting the object database to parse these commits but instead only load
them from the commit-graph. This results in a significant performance
boost when executing git-fetch in said repository with 2.2 million refs:
Benchmark #1: HEAD~: git fetch $remote $commit
Time (mean ± σ): 44.168 s ± 0.341 s [User: 42.985 s, System: 1.106 s]
Range (min … max): 43.565 s … 44.577 s 10 runs
Benchmark #2: HEAD: git fetch $remote $commit
Time (mean ± σ): 19.498 s ± 0.724 s [User: 18.751 s, System: 0.690 s]
Range (min … max): 18.629 s … 20.454 s 10 runs
Summary
'HEAD: git fetch $remote $commit' ran
2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bagas Sanjaya [Wed, 4 Aug 2021 12:24:20 +0000 (19:24 +0700)]
diff: --pickaxe-all typofix
When I was fixing fuzzies as I updating po/id.po for 2.33.0 l10n round,
I noticed a triple-dash typo (--pickaxe-all) at diff.c, which according
to git-diff(1) manpage, the correct option name should be --pickaxe-all.
Fix the typo.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-ort: remove compile-time ability to turn off usage of memory pools
Simplify code maintenance by removing the ability to toggle between
usage of memory pools and direct allocations. This allows us to also
remove paths_to_free since it was solely about bookkeeping to make sure
we freed the necessary paths, and allows us to remove some auxiliary
functions.
Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>