Junio C Hamano [Wed, 4 Aug 2021 20:28:52 +0000 (13:28 -0700)]
Merge branch 'ah/plugleaks'
Leak plugging.
* ah/plugleaks:
reset: clear_unpack_trees_porcelain to plug leak
builtin/rebase: fix options.strategy memory lifecycle
builtin/merge: free found_ref when done
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
convert: release strbuf to avoid leak
read-cache: call diff_setup_done to avoid leak
ref-filter: also free head for ATOM_HEAD to avoid leak
diffcore-rename: move old_dir/new_dir definition to plug leak
builtin/for-each-repo: remove unnecessary argv copy to plug leak
builtin/submodule--helper: release unused strbuf to avoid leak
environment: move strbuf into block to plug leak
fmt-merge-msg: free newly allocated temporary strings when done
Junio C Hamano [Wed, 4 Aug 2021 20:28:52 +0000 (13:28 -0700)]
Merge branch 'ar/submodule-add'
Rewrite of "git submodule" in C continues.
* ar/submodule-add:
submodule: drop unused sm_name parameter from show_fetch_remotes()
submodule--helper: introduce add-clone subcommand
submodule--helper: refactor module_clone()
submodule: prefix die messages with 'fatal'
t7400: test failure to add submodule in tracked path
Junio C Hamano [Wed, 28 Jul 2021 20:18:04 +0000 (13:18 -0700)]
Merge branch 'tb/reverse-midx'
The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.
* tb/reverse-midx:
multi-pack-index: fix potential segfault without sub-command
Junio C Hamano [Wed, 28 Jul 2021 20:18:02 +0000 (13:18 -0700)]
Merge branch 'en/rename-limits-doc'
Documentation on "git diff -l<n>" and diff.renameLimit have been
updated, and the defaults for these limits have been raised.
* en/rename-limits-doc:
rename: bump limit defaults yet again
diffcore-rename: treat a rename_limit of 0 as unlimited
doc: clarify documentation for rename/copy limits
diff: correct warning message when renameLimit exceeded
Junio C Hamano [Wed, 28 Jul 2021 20:18:01 +0000 (13:18 -0700)]
Merge branch 'ds/status-with-sparse-index'
"git status" codepath learned to work with sparsely populated index
without hydrating it fully.
* ds/status-with-sparse-index:
t1092: document bad sparse-checkout behavior
fsmonitor: integrate with sparse index
wt-status: expand added sparse directory entries
status: use sparse-index throughout
status: skip sparse-checkout percentage with sparse-index
diff-lib: handle index diffs with sparse dirs
dir.c: accept a directory as part of cone-mode patterns
unpack-trees: unpack sparse directory entries
unpack-trees: rename unpack_nondirectories()
unpack-trees: compare sparse directories correctly
unpack-trees: preserve cache_bottom
t1092: add tests for status/add and sparse files
t1092: expand repository data shape
t1092: replace incorrect 'echo' with 'cat'
sparse-index: include EXTENDED flag when expanding
sparse-index: skip indexes with unmerged entries
Junio C Hamano [Wed, 28 Jul 2021 20:18:01 +0000 (13:18 -0700)]
Merge branch 'js/ci-make-sparse'
The CI gained a new job to run "make sparse" check.
* js/ci-make-sparse:
ci/install-dependencies: handle "sparse" job package installs
ci: run "apt-get update" before "apt-get install"
ci: run `make sparse` as part of the GitHub workflow
Junio C Hamano [Wed, 28 Jul 2021 20:17:58 +0000 (13:17 -0700)]
Merge branch 'jk/log-decorate-optim'
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.
* jk/log-decorate-optim:
load_ref_decorations(): fix decoration with tags
add_ref_decoration(): rename s/type/deco_type/
load_ref_decorations(): avoid parsing non-tag objects
object.h: add lookup_object_by_type() function
object.h: expand docstring for lookup_unknown_object()
log: avoid loading decorations for userformats that don't need it
pretty.h: update and expand docstring for userformat_find_requirements()
Junio C Hamano [Wed, 28 Jul 2021 20:17:58 +0000 (13:17 -0700)]
Merge branch 'sm/worktree-add-lock'
"git worktree add --lock" learned to record why the worktree is
locked with a custom message.
* sm/worktree-add-lock:
worktree: teach `add` to accept --reason <string> with --lock
worktree: mark lock strings with `_()` for translation
t2400: clean up '"add" worktree with lock' test
Junio C Hamano [Wed, 28 Jul 2021 20:17:57 +0000 (13:17 -0700)]
Merge branch 'ew/many-alternate-optim'
Optimization for repositories with many alternate object store.
* ew/many-alternate-optim:
oidtree: a crit-bit tree for odb_loose_cache
oidcpy_with_padding: constify `src' arg
make object_directory.loose_objects_subdir_seen a bitmap
avoid strlen via strbuf_addstr in link_alt_odb_entry
speed up alt_odb_usable() with many alternates
Jeff King [Mon, 26 Jul 2021 17:53:39 +0000 (13:53 -0400)]
ci: run "apt-get update" before "apt-get install"
The "sparse" workflow runs "apt-get install" to pick up a few necessary
packages. But it needs to run "apt-get update" first, or it risks trying
to download an old package version that no longer exists. And in fact
this happens now, with output like:
2021-07-26T17:40:51.2551880Z E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/c/curl/libcurl4-openssl-dev_7.68.0-1ubuntu2.5_amd64.deb 404 Not Found [IP: 52.147.219.192 80]
2021-07-26T17:40:51.2554304Z E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Our other ci jobs don't suffer from this; they rely on scripts in ci/,
and ci/install-dependencies does the appropriate "apt-get update".
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:30 +0000 (15:08 +0200)]
reset: clear_unpack_trees_porcelain to plug leak
setup_unpack_trees_porcelain() populates various fields on
unpack_tree_opts, we need to call clear_unpack_trees_porcelain() to
avoid leaking them. Specifically, we used to leak
unpack_tree_opts.msgs_to_free.
We have to do this in leave_reset_head because there are multiple
scenarios where unpack_tree_opts has already been configured, followed
by a 'goto leave_reset_head'. But we can also 'goto leave_reset_head'
prior to having initialised unpack_tree_opts via memset(..., 0, ...).
Therefore we also move unpack_tree_opts initialisation to the start of
reset_head(), and convert it to use brace initialisation - which
guarantees that we can never clear an uninitialised unpack_tree_opts.
clear_unpack_tree_opts() is always safe to call as long as
unpack_tree_opts is at least zero-initialised, i.e. it does not depend
on a previous call to setup_unpack_trees_porcelain().
LSAN output from t0021:
Direct leak of 192 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa721e5 in xrealloc wrapper.c:126:8
#2 0x9f7861 in strvec_push_nodup strvec.c:19:2
#3 0x9f7861 in strvec_pushf strvec.c:39:2
#4 0xa43e14 in setup_unpack_trees_porcelain unpack-trees.c:129:3
#5 0x97e011 in reset_head reset.c:53:2
#6 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
#7 0x4ce83e in run_builtin git.c:475:11
#8 0x4ccafe in handle_builtin git.c:729:3
#9 0x4cb01c in run_argv git.c:818:4
#10 0x4cb01c in cmd_main git.c:949:19
#11 0x6b3f3d in main common-main.c:52:11
#12 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 147 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa721e5 in xrealloc wrapper.c:126:8
#2 0x9e8d54 in strbuf_grow strbuf.c:98:2
#3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
#4 0x9f7774 in strvec_pushf strvec.c:36:2
#5 0xa43e14 in setup_unpack_trees_porcelain unpack-trees.c:129:3
#6 0x97e011 in reset_head reset.c:53:2
#7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
#8 0x4ce83e in run_builtin git.c:475:11
#9 0x4ccafe in handle_builtin git.c:729:3
#10 0x4cb01c in run_argv git.c:818:4
#11 0x4cb01c in cmd_main git.c:949:19
#12 0x6b3f3d in main common-main.c:52:11
#13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 134 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa721e5 in xrealloc wrapper.c:126:8
#2 0x9e8d54 in strbuf_grow strbuf.c:98:2
#3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
#4 0x9f7774 in strvec_pushf strvec.c:36:2
#5 0xa43fe4 in setup_unpack_trees_porcelain unpack-trees.c:168:3
#6 0x97e011 in reset_head reset.c:53:2
#7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
#8 0x4ce83e in run_builtin git.c:475:11
#9 0x4ccafe in handle_builtin git.c:729:3
#10 0x4cb01c in run_argv git.c:818:4
#11 0x4cb01c in cmd_main git.c:949:19
#12 0x6b3f3d in main common-main.c:52:11
#13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 130 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa721e5 in xrealloc wrapper.c:126:8
#2 0x9e8d54 in strbuf_grow strbuf.c:98:2
#3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
#4 0x9f7774 in strvec_pushf strvec.c:36:2
#5 0xa43f20 in setup_unpack_trees_porcelain unpack-trees.c:150:3
#6 0x97e011 in reset_head reset.c:53:2
#7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
#8 0x4ce83e in run_builtin git.c:475:11
#9 0x4ccafe in handle_builtin git.c:729:3
#10 0x4cb01c in run_argv git.c:818:4
#11 0x4cb01c in cmd_main git.c:949:19
#12 0x6b3f3d in main common-main.c:52:11
#13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 603 byte(s) leaked in 4 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- cmd_rebase populates rebase_options.strategy with newly allocated
strings, hence we need to free those strings at the end of cmd_rebase
to avoid a leak.
- In some cases: get_replay_opts() is called, which prepares replay_opts
using data from rebase_options. We used to simply copy the pointer
from rebase_options.strategy, however that would now result in a
double-free because sequencer_remove_state() is eventually used to
free replay_opts.strategy. To avoid this we xstrdup() strategy when
adding it to replay_opts.
The original leak happens because we always populate
rebase_options.strategy, but we don't always enter the path that calls
get_replay_opts() and later sequencer_remove_state() - in other words
we'd always allocate a new string into rebase_options.strategy but
only sometimes did we free it. We now make sure that rebase_options
and replay_opts both own their own copies of strategy, and each copy
is free'd independently.
This was first seen when running t0021 with LSAN, but t2012 helped catch
the fact that we can't just free(options.strategy) at the end of
cmd_rebase (as that can cause a double-free). LSAN output from t0021:
LSAN output from t0021:
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa71eb8 in xstrdup wrapper.c:29:14
#2 0x61b1cc in cmd_rebase builtin/rebase.c:1779:22
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6b3fad in main common-main.c:52:11
#8 0x7f267b512349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:28 +0000 (15:08 +0200)]
builtin/merge: free found_ref when done
merge_name() calls dwim_ref(), which allocates a new string into
found_ref. Therefore add a free() to avoid leaking found_ref.
LSAN output from t0021:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa8beb8 in xstrdup wrapper.c:29:14
#2 0x954054 in expand_ref refs.c:671:12
#3 0x953cb6 in repo_dwim_ref refs.c:644:22
#4 0x5d3759 in dwim_ref refs.h:162:9
#5 0x5d3759 in merge_name builtin/merge.c:517:6
#6 0x5d3759 in collect_parents builtin/merge.c:1214:5
#7 0x5cf60d in cmd_merge builtin/merge.c:1458:16
#8 0x4ce83e in run_builtin git.c:475:11
#9 0x4ccafe in handle_builtin git.c:729:3
#10 0x4cb01c in run_argv git.c:818:4
#11 0x4cb01c in cmd_main git.c:949:19
#12 0x6bdbfd in main common-main.c:52:11
#13 0x7f0430502349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:27 +0000 (15:08 +0200)]
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
These leaks all happen at the end of cmd_mv, hence don't matter in any
way. But we still fix the easy ones and squash the rest to get us closer
to being able to run tests without leaks.
LSAN output from t0050:
Direct leak of 384 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa0a7e1 in add_entry string-list.c:44:2
#3 0xa0a7e1 in string_list_insert string-list.c:58:14
#4 0x5dac03 in cmd_mv builtin/mv.c:248:4
#5 0x4ce83e in run_builtin git.c:475:11
#6 0x4ccafe in handle_builtin git.c:729:3
#7 0x4cb01c in run_argv git.c:818:4
#8 0x4cb01c in cmd_main git.c:949:19
#9 0x6bd9ad in main common-main.c:52:11
#10 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa8bd09 in do_xmalloc wrapper.c:41:8
#2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
#3 0x5da575 in cmd_mv builtin/mv.c:158:14
#4 0x4ce83e in run_builtin git.c:475:11
#5 0x4ccafe in handle_builtin git.c:729:3
#6 0x4cb01c in run_argv git.c:818:4
#7 0x4cb01c in cmd_main git.c:949:19
#8 0x6bd9ad in main common-main.c:52:11
#9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa8bd09 in do_xmalloc wrapper.c:41:8
#2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
#3 0x5da4e4 in cmd_mv builtin/mv.c:148:11
#4 0x4ce83e in run_builtin git.c:475:11
#5 0x4ccafe in handle_builtin git.c:729:3
#6 0x4cb01c in run_argv git.c:818:4
#7 0x4cb01c in cmd_main git.c:949:19
#8 0x6bd9ad in main common-main.c:52:11
#9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0xa8c119 in xcalloc wrapper.c:140:8
#2 0x5da585 in cmd_mv builtin/mv.c:159:22
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6bd9ad in main common-main.c:52:11
#8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0xa8c119 in xcalloc wrapper.c:140:8
#2 0x5da4f8 in cmd_mv builtin/mv.c:149:10
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6bd9ad in main common-main.c:52:11
#8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa00226 in strbuf_grow strbuf.c:98:2
#3 0xa00226 in strbuf_vaddf strbuf.c:394:3
#4 0xa065c7 in xstrvfmt strbuf.c:981:2
#5 0xa065c7 in xstrfmt strbuf.c:991:8
#6 0x9e7ce7 in prefix_path_gently setup.c:115:15
#7 0x9e7fa6 in prefix_path setup.c:128:12
#8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
#9 0x5da575 in cmd_mv builtin/mv.c:158:14
#10 0x4ce83e in run_builtin git.c:475:11
#11 0x4ccafe in handle_builtin git.c:729:3
#12 0x4cb01c in run_argv git.c:818:4
#13 0x4cb01c in cmd_main git.c:949:19
#14 0x6bd9ad in main common-main.c:52:11
#15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa00226 in strbuf_grow strbuf.c:98:2
#3 0xa00226 in strbuf_vaddf strbuf.c:394:3
#4 0xa065c7 in xstrvfmt strbuf.c:981:2
#5 0xa065c7 in xstrfmt strbuf.c:991:8
#6 0x9e7ce7 in prefix_path_gently setup.c:115:15
#7 0x9e7fa6 in prefix_path setup.c:128:12
#8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
#9 0x5da4e4 in cmd_mv builtin/mv.c:148:11
#10 0x4ce83e in run_builtin git.c:475:11
#11 0x4ccafe in handle_builtin git.c:729:3
#12 0x4cb01c in run_argv git.c:818:4
#13 0x4cb01c in cmd_main git.c:949:19
#14 0x6bd9ad in main common-main.c:52:11
#15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 558 byte(s) leaked in 7 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:26 +0000 (15:08 +0200)]
convert: release strbuf to avoid leak
apply_multi_file_filter and async_query_available_blobs both query
subprocess output using subprocess_read_status, which writes data into
the identically named filter_status strbuf. We add a strbuf_release to
avoid leaking their contents.
Leak output seen when running t0021 with LSAN:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c2b5 in xrealloc wrapper.c:126:8
#2 0x9ff99d in strbuf_grow strbuf.c:98:2
#3 0x9ff99d in strbuf_addbuf strbuf.c:304:2
#4 0xa101d6 in subprocess_read_status sub-process.c:45:5
#5 0x77793c in apply_multi_file_filter convert.c:886:8
#6 0x77793c in apply_filter convert.c:1042:10
#7 0x77a0b5 in convert_to_git_filter_fd convert.c:1492:7
#8 0x8b48cd in index_stream_convert_blob object-file.c:2156:2
#9 0x8b48cd in index_fd object-file.c:2248:9
#10 0x597411 in hash_fd builtin/hash-object.c:43:9
#11 0x596be1 in hash_object builtin/hash-object.c:59:2
#12 0x596be1 in cmd_hash_object builtin/hash-object.c:153:3
#13 0x4ce83e in run_builtin git.c:475:11
#14 0x4ccafe in handle_builtin git.c:729:3
#15 0x4cb01c in run_argv git.c:818:4
#16 0x4cb01c in cmd_main git.c:949:19
#17 0x6bdc2d in main common-main.c:52:11
#18 0x7f42acf79349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).
Direct leak of 120 byte(s) in 5 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c295 in xrealloc wrapper.c:126:8
#2 0x9ff97d in strbuf_grow strbuf.c:98:2
#3 0x9ff97d in strbuf_addbuf strbuf.c:304:2
#4 0xa101b6 in subprocess_read_status sub-process.c:45:5
#5 0x775c73 in async_query_available_blobs convert.c:960:8
#6 0x80029d in finish_delayed_checkout entry.c:183:9
#7 0xa65d1e in check_updates unpack-trees.c:493:10
#8 0xa5f469 in unpack_trees unpack-trees.c:1747:8
#9 0x525971 in checkout builtin/clone.c:815:6
#10 0x525971 in cmd_clone builtin/clone.c:1409:8
#11 0x4ce83e in run_builtin git.c:475:11
#12 0x4ccafe in handle_builtin git.c:729:3
#13 0x4cb01c in run_argv git.c:818:4
#14 0x4cb01c in cmd_main git.c:949:19
#15 0x6bdc2d in main common-main.c:52:11
#16 0x7fa253fce349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 120 byte(s) leaked in 5 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:25 +0000 (15:08 +0200)]
read-cache: call diff_setup_done to avoid leak
repo_diff_setup() calls through to diff.c's static prep_parse_options(),
which in turn allocates a new array into diff_opts.parseopts.
diff_setup_done() is responsible for freeing that array, and has the
benefit of verifying diff_opts too - hence we add a call to
diff_setup_done() to avoid leaking parseopts.
Output from the leak as found while running t0090 with LSAN:
Direct leak of 7120 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa8bf89 in do_xmalloc wrapper.c:41:8
#2 0x7a7bae in prep_parse_options diff.c:5636:2
#3 0x7a7bae in repo_diff_setup diff.c:4611:2
#4 0x93716c in repo_index_has_changes read-cache.c:2518:3
#5 0x872233 in unclean merge-ort-wrappers.c:12:14
#6 0x872233 in merge_ort_recursive merge-ort-wrappers.c:53:6
#7 0x5d5b11 in try_merge_strategy builtin/merge.c:752:12
#8 0x5d0b6b in cmd_merge builtin/merge.c:1666:9
#9 0x4ce83e in run_builtin git.c:475:11
#10 0x4ccafe in handle_builtin git.c:729:3
#11 0x4cb01c in run_argv git.c:818:4
#12 0x4cb01c in cmd_main git.c:949:19
#13 0x6bdc2d in main common-main.c:52:11
#14 0x7f551eb51349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 7120 byte(s) leaked in 1 allocation(s)
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:24 +0000 (15:08 +0200)]
ref-filter: also free head for ATOM_HEAD to avoid leak
u.head is populated using resolve_refdup(), which returns a newly
allocated string - hence we also need to free() it.
Found while running t0041 with LSAN:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa8be98 in xstrdup wrapper.c:29:14
#2 0x9481db in head_atom_parser ref-filter.c:549:17
#3 0x9408c7 in parse_ref_filter_atom ref-filter.c:703:30
#4 0x9400e3 in verify_ref_format ref-filter.c:974:8
#5 0x4f9e8b in print_ref_list builtin/branch.c:439:6
#6 0x4f9e8b in cmd_branch builtin/branch.c:757:3
#7 0x4ce83e in run_builtin git.c:475:11
#8 0x4ccafe in handle_builtin git.c:729:3
#9 0x4cb01c in run_argv git.c:818:4
#10 0x4cb01c in cmd_main git.c:949:19
#11 0x6bdc2d in main common-main.c:52:11
#12 0x7f96edf86349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:23 +0000 (15:08 +0200)]
diffcore-rename: move old_dir/new_dir definition to plug leak
old_dir/new_dir are free()'d at the end of update_dir_rename_counts,
however if we return early we'll never free those strings. Therefore
we should move all new allocations after the possible early return,
avoiding a leak.
This seems like a fairly recent leak, that started happening since the
early-return was added in: 1ad69eb0dc (diffcore-rename: compute dir_rename_counts in stages, 2021-02-27)
LSAN output from t0022:
Direct leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa71e48 in xstrdup wrapper.c:29:14
#2 0x7db9c7 in update_dir_rename_counts diffcore-rename.c:464:12
#3 0x7db6ae in find_renames diffcore-rename.c:1062:3
#4 0x7d76c3 in diffcore_rename_extended diffcore-rename.c:1472:18
#5 0x7b4cfc in diffcore_std diff.c:6705:4
#6 0x855e46 in log_tree_diff_flush log-tree.c:846:2
#7 0x856574 in log_tree_diff log-tree.c:955:3
#8 0x856574 in log_tree_commit log-tree.c:986:10
#9 0x9a9c67 in print_commit_summary sequencer.c:1329:7
#10 0x52e623 in cmd_commit builtin/commit.c:1862:3
#11 0x4ce83e in run_builtin git.c:475:11
#12 0x4ccafe in handle_builtin git.c:729:3
#13 0x4cb01c in run_argv git.c:818:4
#14 0x4cb01c in cmd_main git.c:949:19
#15 0x6b3f3d in main common-main.c:52:11
#16 0x7fe397c7a349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa71e48 in xstrdup wrapper.c:29:14
#2 0x7db9bc in update_dir_rename_counts diffcore-rename.c:463:12
#3 0x7db6ae in find_renames diffcore-rename.c:1062:3
#4 0x7d76c3 in diffcore_rename_extended diffcore-rename.c:1472:18
#5 0x7b4cfc in diffcore_std diff.c:6705:4
#6 0x855e46 in log_tree_diff_flush log-tree.c:846:2
#7 0x856574 in log_tree_diff log-tree.c:955:3
#8 0x856574 in log_tree_commit log-tree.c:986:10
#9 0x9a9c67 in print_commit_summary sequencer.c:1329:7
#10 0x52e623 in cmd_commit builtin/commit.c:1862:3
#11 0x4ce83e in run_builtin git.c:475:11
#12 0x4ccafe in handle_builtin git.c:729:3
#13 0x4cb01c in run_argv git.c:818:4
#14 0x4cb01c in cmd_main git.c:949:19
#15 0x6b3f3d in main common-main.c:52:11
#16 0x7fe397c7a349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 14 byte(s) leaked in 2 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:22 +0000 (15:08 +0200)]
builtin/for-each-repo: remove unnecessary argv copy to plug leak
cmd_for_each_repo() copies argv into args (a strvec), which is later
passed into run_command_on_repo(), which in turn copies that strvec onto
the end of child.args. The initial copy is unnecessary (we never modify
args). We therefore choose to just pass argv directly into
run_command_on_repo(), which lets us avoid the copy and fixes the leak.
LSAN output from t0068:
Direct leak of 192 byte(s) in 1 object(s) allocated from:
#0 0x7f63bd4ab8b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
#1 0x98d7e6 in xrealloc wrapper.c:126
#2 0x916914 in strvec_push_nodup strvec.c:19
#3 0x916a6e in strvec_push strvec.c:26
#4 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
#5 0x410dcd in run_builtin git.c:475
#6 0x410dcd in handle_builtin git.c:729
#7 0x414087 in run_argv git.c:818
#8 0x414087 in cmd_main git.c:949
#9 0x40e9ec in main common-main.c:52
#10 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 22 byte(s) in 2 object(s) allocated from:
#0 0x7f63bd445e30 in __interceptor_strdup (/usr/lib64/libasan.so.4+0x76e30)
#1 0x98d698 in xstrdup wrapper.c:29
#2 0x916a63 in strvec_push strvec.c:26
#3 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
#4 0x410dcd in run_builtin git.c:475
#5 0x410dcd in handle_builtin git.c:729
#6 0x414087 in run_argv git.c:818
#7 0x414087 in cmd_main git.c:949
#8 0x40e9ec in main common-main.c:52
#9 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
See also discussion about the original implementation below - this code
appears to have evolved from a callback explaining the double-strvec-copy
pattern, but there's no strong reason to keep that now:
https://lore.kernel.org/git/68bbeca5-314b-08ee-ef36-040e3f3814e9@gmail.com/
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:21 +0000 (15:08 +0200)]
builtin/submodule--helper: release unused strbuf to avoid leak
relative_url() populates sb. In the normal return path, its buffer is
detached using strbuf_detach(). However the early return path does
nothing with sb, which means that sb's memory is leaked - therefore
we add a release to avoid this leak.
The reset is also only necessary for the normal return path, hence we
move it down to after the early-return to avoid unnecessary work.
LSAN output from t0060:
Direct leak of 121 byte(s) in 1 object(s) allocated from:
#0 0x7f31246f28b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
#1 0x98d7d6 in xrealloc wrapper.c:126
#2 0x909a60 in strbuf_grow strbuf.c:98
#3 0x90bf00 in strbuf_vaddf strbuf.c:401
#4 0x90c321 in strbuf_addf strbuf.c:335
#5 0x5cb78d in relative_url builtin/submodule--helper.c:182
#6 0x5cbe46 in resolve_relative_url_test builtin/submodule--helper.c:248
#7 0x410dcd in run_builtin git.c:475
#8 0x410dcd in handle_builtin git.c:729
#9 0x414087 in run_argv git.c:818
#10 0x414087 in cmd_main git.c:949
#11 0x40e9ec in main common-main.c:52
#12 0x7f3123c41349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 121 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:20 +0000 (15:08 +0200)]
environment: move strbuf into block to plug leak
realpath is only populated if we execute the git_work_tree_initialized
block. However that block also causes us to return early, meaning we
never actually release the strbuf in the case where we populated it.
Therefore we move all strbuf related code into the block to guarantee
that we can't leak it.
LSAN output from t0095:
Direct leak of 129 byte(s) in 1 object(s) allocated from:
#0 0x49a9b9 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x78f585 in xrealloc wrapper.c:126:8
#2 0x713ff4 in strbuf_grow strbuf.c:98:2
#3 0x713ff4 in strbuf_getcwd strbuf.c:597:3
#4 0x4f0c18 in strbuf_realpath_1 abspath.c:99:7
#5 0x5ae4a4 in set_git_work_tree environment.c:259:3
#6 0x6fdd8a in setup_discovered_git_dir setup.c:931:2
#7 0x6fdd8a in setup_git_directory_gently setup.c:1235:12
#8 0x4cb50d in get_bloom_filter_for_commit t/helper/test-bloom.c:41:2
#9 0x4cb50d in cmd__bloom t/helper/test-bloom.c:95:3
#10 0x4caa1f in cmd_main t/helper/test-tool.c:124:11
#11 0x4caded in main common-main.c:52:11
#12 0x7f0869f02349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 129 byte(s) leaked in 1 allocation(s).
It looks like this leak has existed since realpath was first added to
set_git_work_tree() in: 3d7747e318 (real_path: remove unsafe API, 2020-03-10)
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 25 Jul 2021 13:08:19 +0000 (15:08 +0200)]
fmt-merge-msg: free newly allocated temporary strings when done
origin starts off pointing to somewhere within line, which is owned by
the caller. Later we might allocate a new string using xmemdupz() or
xstrfmt(). To avoid leaking these new strings, we introduce a to_free
pointer - which allows us to safely free the newly allocated string when
we're done (we cannot just free origin directly as it might still be
pointing to line).
LSAN output from t0090:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa71f49 in do_xmalloc wrapper.c:41:8
#2 0xa720b0 in do_xmallocz wrapper.c:75:8
#3 0xa720b0 in xmallocz wrapper.c:83:9
#4 0xa720b0 in xmemdupz wrapper.c:99:16
#5 0x8092ba in handle_line fmt-merge-msg.c:187:23
#6 0x8092ba in fmt_merge_msg fmt-merge-msg.c:666:7
#7 0x5ce2e6 in prepare_merge_message builtin/merge.c:1119:2
#8 0x5ce2e6 in collect_parents builtin/merge.c:1215:3
#9 0x5c9c1e in cmd_merge builtin/merge.c:1454:16
#10 0x4ce83e in run_builtin git.c:475:11
#11 0x4ccafe in handle_builtin git.c:729:3
#12 0x4cb01c in run_argv git.c:818:4
#13 0x4cb01c in cmd_main git.c:949:19
#14 0x6b3fad in main common-main.c:52:11
#15 0x7fb929620349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Fri, 23 Jul 2021 18:52:23 +0000 (11:52 -0700)]
cache-tree: prefetch in partial clone read-tree
"git read-tree" checks the existence of the blobs referenced by the
given tree, but does not bulk prefetch them. Add a bulk prefetch.
The lack of prefetch here was noticed at $DAYJOB during a merge
involving some specific commits, but I couldn't find a minimal merge
that didn't also trigger the prefetch in check_updates() in
unpack-trees.c (and in all these cases, the lack of prefetch in
cache-tree.c didn't matter because all the relevant blobs would have
already been prefetched by then). This is why I used read-tree here to
exercise this code path.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 23 Jul 2021 09:29:34 +0000 (05:29 -0400)]
pack-bitmap: check pack validity when opening bitmap
When pack-objects adds an entry to its list of objects to pack, it may
mark the packfile and offset that contains the file, which we can later
use to output the object verbatim. If the packfile is deleted while we
are running (e.g., by another process running "git repack"), we may die
in use_pack() if the pack file cannot be opened.
We worked around this in 4c08018204 (pack-objects: protect against
disappearing packs, 2011-10-14) by making sure we can open the pack
before recording it as a source. This detects a pack which has already
disappeared while generating the packing list, and because we keep the
pack's file descriptor (or an mmap window) open, it means we can access
it later (unless you exceed core.packedgitlimit).
The bitmap code that was added later does not do this; it adds entries
to the packlist without checking that the packfile is still valid, and
is vulnerable to this race. It needs the same treatment as 4c08018204.
However, rather than add it in just that one spot, it makes more sense
to simply open and check the packfile when we open the bitmap.
Technically you can use the .bitmap without even looking in the .pack
file (e.g., if you are just printing a list of objects without accessing
them), but it's much simpler to do it early. That covers all later
direct uses of the pack (due to the cached descriptor) without having to
check each one directly. For example, in pack-objects we need to protect
the packlist entries, but we also access the pack directly as part of
the reuse_partial_pack_from_bitmap() feature. This patch covers both
cases.
There's no test here, because the problem is inherently racy. I
reproduced and verified the fix with this script:
git init --bare parent.git &&
git --git-dir=parent.git config transfer.unpacklimit 1 &&
git clone parent.git push.git &&
(for i in `seq 1 1000`; do push $i || break; done) &
pusher=$!
(for i in `seq 1 1000`; do fetch $i || break; done) &
fetcher=$!
wait $fetcher
kill $pusher
That simulates a race between a client cloning and a push triggering a
repack on the server. Without this patch, it generally fails within a
couple hundred iterations with:
remote: fatal: packfile ./objects/pack/.tmp-1377349-pack-498afdec371232bdb99d1757872f5569331da61e.pack cannot be accessed
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
With this patch, it reliably runs through all thousand attempts.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the bundle tests to fully compare the expected "git ls-remote"
or "git bundle list-heads" output, instead of merely grepping it.
This avoids subtle regressions in the tests. In f62e0a39b6 (t5704 (bundle): add tests for bundle --stdin, 2010-04-19)
the "bundle --stdin <rev-list options>" test was added to make sure we
didn't include the tag.
But since the --stdin mode didn't work until 5bb0fd2cab (bundle:
arguments can be read from stdin, 2021-01-11) our grepping of
"master" (later "main") missed the important part of the test.
Namely that we should not include the "refs/tags/tag" tag in that
case. Since the test only grepped for "main" in the output we'd miss a
regression in that code.
So let's use test_cmp instead, and also in the other nearby tests
where it's easy.
This does make things a bit more verbose in the case of the test
that's checking the bundle header, since it's different under SHA1 and
SHA256. I think this makes test easier to follow.
I've got some WIP changes to extend the "git bundle" command to dump
parts of the header out, which are easier to understand if we test the
output explicitly like this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 22 Jul 2021 20:05:55 +0000 (13:05 -0700)]
Merge branch 'js/ci-windows-update'
GitHub Actions / CI update.
* js/ci-windows-update:
ci: accelerate the checkout
ci (vs-build): build with NO_GETTEXT
artifacts-tar: respect NO_GETTEXT
ci (windows): transfer also the Git-tracked files to the test jobs
ci: upgrade to using actions/{up,down}load-artifacts v2
ci (vs-build): use `cmd` to copy the DLLs, not `powershell`
ci: use the new GitHub Action to download git-sdk-64-minimal
Junio C Hamano [Thu, 22 Jul 2021 20:05:54 +0000 (13:05 -0700)]
Merge branch 'ab/send-email-optim'
"git send-email" optimization.
* ab/send-email-optim:
perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd()
send-email: move trivial config handling to Perl
perl: lazily load some common Git.pm setup code
send-email: lazily load modules for a big speedup
send-email: get rid of indirect object syntax
send-email: use function syntax instead of barewords
send-email: lazily shell out to "git var"
send-email: lazily load config for a big speedup
send-email: copy "config_regxp" into git-send-email.perl
send-email: refactor sendemail.smtpencryption config parsing
send-email: remove non-working support for "sendemail.smtpssl"
send-email tests: test for boolean variables without a value
send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true
Felipe Contreras [Wed, 21 Jul 2021 22:15:45 +0000 (17:15 -0500)]
doc: pull: fix rebase=false documentation
"git pull --rebase=false" means we merge their history into ours, but
it has been described the other way around.
Cc: Stephen Haberman <stephen@exigencecorp.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
[jc: updated the log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Tue, 20 Jul 2021 20:16:22 +0000 (16:16 -0400)]
pack-bitmap: clarify comment in filter_bitmap_exclude_type()
The code that eventually became filter_bitmap_exclude_type() was
originally introduced in 4f3bd5606a (pack-bitmap: implement BLOB_NONE
filtering, 2020-02-14) to accelerate BLOB_NONE filters with bitmaps.
In 856e12c18a (pack-bitmap.c: make object filtering functions generic,
2020-05-04), it became filter_bitmap_exclude_type(). But not all of the
comments were updated to be agnostic to the provided type.
Remove the remaining comments which should have been updated in 856e12c18a to reflect the type-agnostic nature of the function.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 16 Jul 2021 03:02:57 +0000 (03:02 +0000)]
doc: clarify description of 'submodule.recurse'
The doc for 'submodule.recurse' starts with "Specifies if commands
recurse into submodles by default". This is not exactly true of all
commands that have a '--recurse-submodules' option. For example, 'git
pull --recurse-submodules' does not run 'git pull' in each submodule,
but rather runs 'git submodule update --recursive' so that the submodule
working trees after the pull matches the commits recorded in the
superproject.
Clarify that by just saying that it enables '--recurse-submodules'.
Note that the way this setting interacts with 'fetch.recurseSubmodules'
and 'push.recurseSubmodules', which can have other values than true or
false, is already documented since 4da9e99e6e (doc: be more precise on
(fetch|push).recurseSubmodules, 2020-04-06).
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 14 Jul 2021 21:38:27 +0000 (17:38 -0400)]
doc/git-config: simplify "override" advice for FILES section
At the end of the FILES section, we indicate that you can override the
regular lookup rules with --global, etc. But:
- we're missing the --local option
- we point to GIT_CONFIG instead of --file, but the latter has much
better documentation
- we're vague about how the overrides work; the actual option
descriptions are much better here
So let's just mention the names and point people back to the OPTIONS
section. We could perhaps even delete this paragraph entirely, but the
presence of the names may give people reading FILES a clue about where
to look for more information.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The scope and utility of the GIT_CONFIG variable was drastically reduced
by dc87183189 (Only use GIT_CONFIG in "git config", not other programs,
2008-06-30). But the documentation in git-config(1) predates that, which
makes it rather misleading.
These days it is really just another way to say "--file". So let's say
that, and explicitly make it clear that it does not impact other Git
commands (like GIT_CONFIG_SYSTEM, etc, would).
I also bumped it to the bottom of the list of variables, and warned
people off of using it. We don't have any plans for deprecation at this
point, but there's little point in encouraging people to use it by
putting it at the top of the list.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 14 Jul 2021 21:37:29 +0000 (17:37 -0400)]
doc/git-config: explain --file instead of referring to GIT_CONFIG
The explanation for the --file option only refers to GIT_CONFIG. This
redirection to an environment variable is confusing, but doubly so
because the description of GIT_CONFIG is out of date.
Let's describe --file from scratch, detailing both the reading and
writing behavior as we do for other similar options like --system, etc.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Testcases in t0000 are quite special given that they many of them run
nested testcases to verify that testing functionality itself works as
expected. These nested testcases are realized by writing a new ad-hoc
test script which again sources test-lib.sh, where the new script is
created in a nested subdirectory located beneath the current trash
directory. We then execute the new test script with the nested
subdirectory as current working directory and explicitly re-export
TEST_OUTPUT_DIRECTORY to point to that directory.
While this works as expected in the general case, it falls apart when
the developer has TEST_OUTPUT_DIRECTORY explicitly defined either via
the environment or via config.mak and runs "make test". In that case,
test-lib.sh will clobber the value that we've just carefully set up to
instead contain what the developer has defined. As a result, the
TEST_OUTPUT_DIRECTORY continues to point at the root output directory,
not at the nested one.
This issue causes breakage in the 'test_atexit is run' test case: the
nested test case writes files into "../../", which is assumed to be the
parent's trash directory. But because TEST_OUTPUT_DIRECTORY already
points to to the root output directory, we instead end up writing those
files outside of the output directory. The parent test case will then
try to check whether those files still exist in its own trash directory,
which thus must fail now.
Fix the issue by adding a new TEST_OUTPUT_DIRECTORY_OVERRIDE variable.
If set, then we'll always override the TEST_OUTPUT_DIRECTORY with its
value after sourcing GIT-BUILD-OPTIONS.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 19 Jul 2021 17:18:49 +0000 (13:18 -0400)]
multi-pack-index: fix potential segfault without sub-command
Since cd57bc41bb (builtin/multi-pack-index.c: display usage on
unrecognized command, 2021-03-30) we have used a "usage" label to avoid
having two separate callers of usage_with_options (one when no arguments
are given, and another for unrecognized sub-commands).
But the first caller has been broken since cd57bc41bb, since it will
happily jump to usage without arguments, and then pass argv[0] to the
"unrecognized subcommand" error.
Many compilers will save us from a segfault here, but the end result is
ugly, since it mentions an unrecognized subcommand when we didn't even
pass one, and (on GCC) includes "(null)" in its output.
Move the "usage" label down past the error about unrecognized
subcommands so that it is only triggered when it should be. While we're
at it, bulk up our test coverage in this area, too.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 16 Jul 2021 18:43:47 +0000 (14:43 -0400)]
t0000: clear GIT_SKIP_TESTS before running sub-tests
In t0000, we run several fake "sub-test" suites to verify the behavior
of the test suite. But because we don't clear the parent environment
completely, the sub-tests can be fooled by variables meant for the
parent. For example:
GIT_SKIP_TESTS=t1234 ./t0000-basic.sh
fails when a sub-test expects its fake t1234 to actually run. This
particular pattern is unlikely in practice; we're running a single
script, and there is no t1234 in the real test suite anyway (not yet, at
least). A more real-world example is:
GIT_SKIP_TESTS=t[^0]* make test
to run only the t0* tests.
The fix is conceptually simple: we should clear the GIT_SKIP_TESTS
variable when running the sub-tests, because its contents (if any) will
be meant for the main test suite. This is easy to do centrally in our
sub-test helper.
But there's a catch: some of our tests do set GIT_SKIP_TESTS
intentionally to test the feature. We need to allow them to continue to
set it, but clear it for all the other tests. And the sub-test helper
can't tell if the GIT_SKIP_TESTS it sees is from a test or not. We can
handle this by adding a new option to the helper to let callers specify
the skip list.
I considered adding a more general "--eval" option to let callers set up
the env for the sub-test however they like. That would cover this case
and possible future ones. But the quoting gets awkward for the callers
(since we're now 2 layers deep in evals!), so I went with the simpler
more specific solution.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-lib-functions: use test-tool for [de]packetize()
The shell+perl "[de]packetize()" helper functions were added in 4414a150025 (t/lib-git-daemon: add network-protocol helpers,
2018-01-24), and around the same time we added the "pkt-line" helper
in 74e70029615 (test-pkt-line: introduce a packet-line test helper,
2018-03-14).
For some reason it seems we've mostly used the shell+perl version
instead of the helper since then. There were discussions around 88124ab2636 (test-lib-functions: make packetize() more efficient,
2020-03-27) and cacae4329fa (test-lib-functions: simplify packetize()
stdin code, 2020-03-29) to improve them and make them more efficient.
There was one good reason to do so, we needed an equivalent of
"test-tool pkt-line pack", but that command wasn't capable of handling
input with "\n" (a feature) or "\0" (just because it happens to be
printf-based under the hood).
Let's add a "pkt-line-raw" helper for that, and expose is at a
packetize_raw() to go with the existing packetize() on the shell
level, this gives us the smallest amount of change to the tests
themselves.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 17 Jul 2021 00:42:53 +0000 (17:42 -0700)]
Merge branch 'jt/partial-clone-submodule-1'
Prepare the internals for lazily fetching objects in submodules
from their promisor remotes.
* jt/partial-clone-submodule-1:
promisor-remote: teach lazy-fetch in any repo
run-command: refactor subprocess env preparation
submodule: refrain from filtering GIT_CONFIG_COUNT
promisor-remote: support per-repository config
repository: move global r_f_p_c to repo struct
Junio C Hamano [Sat, 17 Jul 2021 00:42:52 +0000 (17:42 -0700)]
Merge branch 'ab/struct-init'
Code cleanup around struct_type_init() functions.
* ab/struct-init:
string-list.h users: change to use *_{nodup,dup}()
string-list.[ch]: add a string_list_init_{nodup,dup}()
dir.[ch]: replace dir_init() with DIR_INIT
*.c *_init(): define in terms of corresponding *_INIT macro
*.h: move some *_INIT to designated initializers
Junio C Hamano [Sat, 17 Jul 2021 00:42:49 +0000 (17:42 -0700)]
Merge branch 'ab/bundle-updates'
Code clean-up and leak plugging in "git bundle".
* ab/bundle-updates:
bundle: remove "ref_list" in favor of string-list.c API
bundle.c: use a temporary variable for OIDs and names
bundle cmd: stop leaking memory from parse_options_cmd_bundle()
Junio C Hamano [Sat, 17 Jul 2021 00:42:48 +0000 (17:42 -0700)]
Merge branch 'ab/mktag-tests'
Fill test gaps.
* ab/mktag-tests:
mktag tests: test fast-export
mktag tests: test for-each-ref
mktag tests: test update-ref and reachable fsck
mktag tests: test hash-object --literally and unreachable fsck
mktag tests: invert --no-strict test
mktag tests: parse out options in helper
Junio C Hamano [Sat, 17 Jul 2021 00:42:48 +0000 (17:42 -0700)]
Merge branch 'ab/show-branch-tests'
Fill test gaps.
* ab/show-branch-tests:
show-branch tests: add missing tests
show-branch: don't <COLOR></RESET> for space characters
show-branch tests: modernize test code
show-branch tests: rename the one "show-branch" test file
Junio C Hamano [Sat, 17 Jul 2021 00:42:48 +0000 (17:42 -0700)]
Merge branch 'ab/fetch-negotiate-segv-fix'
Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.
* ab/fetch-negotiate-segv-fix:
fetch: fix segfault in --negotiate-only without --negotiation-tip=*
fetch: document the --negotiate-only option
send-pack.c: move "no refs in common" abort earlier
Junio C Hamano [Sat, 17 Jul 2021 00:42:46 +0000 (17:42 -0700)]
Merge branch 'js/gfw-system-config-loc-fix'
Update the location of system-side configuration file on Windows.
* js/gfw-system-config-loc-fix:
config: normalize the path of the system gitconfig
cmake(windows): set correct path to the system Git config
mingw: move Git for Windows' system config where users expect it
Junio C Hamano [Sat, 17 Jul 2021 00:42:46 +0000 (17:42 -0700)]
Merge branch 'tb/midx-use-checksum'
When rebuilding the multi-pack index file reusing an existing one,
we used to blindly trust the existing file and ended up carrying
corrupted data into the updated file, which has been corrected.
* tb/midx-use-checksum:
midx: report checksum mismatches during 'verify'
midx: don't reuse corrupt MIDXs when writing
commit-graph: rewrite to use checksum_valid()
csum-file: introduce checksum_valid()
The merge code had funny interactions between content based rename
detection and directory rename detection.
* en/merge-dir-rename-corner-case-fix:
merge-recursive: handle rename-to-self case
merge-ort: ensure we consult df_conflict and path_conflicts
t6423: test directory renames causing rename-to-self
Junio C Hamano [Sat, 17 Jul 2021 00:42:45 +0000 (17:42 -0700)]
Merge branch 'en/ort-perf-batch-13'
Performance tweaks of "git merge -sort" around lazy fetching of objects.
* en/ort-perf-batch-13:
merge-ort: add prefetching for content merges
diffcore-rename: use a different prefetch for basename comparisons
diffcore-rename: allow different missing_object_cb functions
t6421: add tests checking for excessive object downloads during merge
promisor-remote: output trace2 statistics for number of objects fetched
Junio C Hamano [Sat, 17 Jul 2021 00:42:45 +0000 (17:42 -0700)]
Merge branch 'en/ort-perf-batch-12'
More fix-ups and optimization to "merge -sort".
* en/ort-perf-batch-12:
merge-ort: miscellaneous touch-ups
Fix various issues found in comments
diffcore-rename: avoid unnecessary strdup'ing in break_idx
merge-ort: replace string_list_df_name_compare with faster alternative
Technical writing seeks to convey information with minimal
friction. One way that a reader can experience friction is if they
encounter a description of "a user" that is later simplified using a
gendered pronoun. If the reader does not consider that pronoun to
apply to them, then they can experience cognitive dissonance that
removes focus from the information.
Give some basic tips to guide us avoid unnecessary uses of gendered
description.
Using a gendered pronoun is appropriate when referring to a specific
person.
There are acceptable existing uses of gendered pronouns within the
Git codebase, such as:
* References to real people (e.g. Linus Torvalds, "the Git maintainer").
Do not misgender real people. If there is any doubt to the gender of a
person, then avoid using pronouns.
* References to fictional people with clear genders (e.g. Alice and
Bob).
* Sample text used in test cases (e.g t3702, t6432).
* The official text of the GPL license contains uses of "he or she",
but using singular "they" (or modifying the text in some other
way) is not within the scope of the Git project.
* Literal email messages in Documentation/howto/ should not be edited
for grammatical concerns such as this, unless we update the entire
document to fit the standard documentation format. If such an effort is
taken on, then the authorship would change and no longer refer to the
exact mail message.
* External projects consumed in contrib/ should not deviate solely for
style reasons. Recommended edits should be contributed to those
projects directly.
Other cases within the Git project were cleaned up by the previous
changes.
Co-authored-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 16 Jul 2021 01:55:43 +0000 (01:55 +0000)]
parse-options: don't complete option aliases by default
Since 'OPT_ALIAS' was created in 5c387428f1 (parse-options: don't emit
"ambiguous option" for aliases, 2019-04-29), 'git clone
--git-completion-helper', which is used by the Bash completion script to
list options accepted by clone (via '__gitcomp_builtin'), lists both
'--recurse-submodules' and its alias '--recursive', which was not the
case before since '--recursive' had the PARSE_OPT_HIDDEN flag set, and
options with this flag are skipped by 'parse-options.c::show_gitcomp',
which implements 'git <cmd> --git-completion-helper'.
This means that typing 'git clone --recurs<TAB>' will yield both
'--recurse-submodules' and '--recursive', which is not ideal since both
do the same thing, and so the completion should directly complete the
canonical option.
At the point where 'show_gitcomp' is called in 'parse_options_step',
'preprocess_options' was already called in 'parse_options', so any
aliases are now copies of the original options with a modified help text
indicating they are aliases.
Helpfully, since 64cc539fd2 (parse-options: don't leak alias help
messages, 2021-03-21) these copies have the PARSE_OPT_FROM_ALIAS flag
set, so check that flag early in 'show_gitcomp' and do not print them,
unless the user explicitely requested that *all* completion be shown (by
setting 'GIT_COMPLETION_SHOW_ALL'). After all, if we want to encourage
the use of '--recurse-submodules' over '--recursive', we'd better just
suggest the former.
The only other options alias is 'log' and friends' '--mailmap', which is
an alias for '--use-mailmap', but the Bash completion helpers for these
commands do not use '__gitcomp_builtin', and thus are unnaffected by
this change.
Test the new behaviour in t9902-completion.sh. As a side effect, this
also tests the correct behaviour of GIT_COMPLETION_SHOW_ALL, which was
not tested before. Note that since '__gitcomp_builtin' caches the
options it shows, we need to re-source the completion script to clear
that cache for the second test.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These were last bumped in commit 92c57e5c1d29 (bump rename limit
defaults (again), 2011-02-19), and were bumped both because processors
had gotten faster, and because people were getting ugly merges that
caused problems and reporting it to the mailing list (suggesting that
folks were willing to spend more time waiting).
Since that time:
* Linus has continued recommending kernel folks to set
diff.renameLimit=0 (maps to 32767, currently)
* Folks with repositories with lots of renames were happy to set
merge.renameLimit above 32767, once the code supported that, to
get correct cherry-picks
* Processors have gotten faster
* It has been discovered that the timing methodology used last time
probably used too large example files.
The last point is probably worth explaining a bit more:
* The "average" file size used appears to have been average blob size
in the linux kernel history at the time (probably v2.6.25 or
something close to it).
* Since bigger files are modified more frequently, such a computation
weights towards larger files.
* Larger files may be more likely to be modified over time, but are
not more likely to be renamed -- the mean and median blob size
within a tree are a bit higher than the mean and median of blob
sizes in the history leading up to that version for the linux
kernel.
* The mean blob size in v2.6.25 was half the average blob size in
history leading to that point
* The median blob size in v2.6.25 was about 40% of the mean blob size
in v2.6.25.
* Since the mean blob size is more than double the median blob size,
any file as big as the mean will not be compared to any files of
median size or less (because they'd be more than 50% dissimilar).
* Since it is the number of files compared that provides the O(n^2)
behavior, median-sized files should matter more than mean-sized
ones.
The combined effect of the above is that the file size used in past
calculations was likely about 5x too large. Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by
a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated
time limits.
Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly. In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/
for details in that area.) For now, let's just bump the approximate
time limit from 10s to 1m.
(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've
put in a lot of work rather than just upfront telling the user there are
too many files involved.)
Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):
N Timing
1300 1.995s
7100 59.973s
So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.
Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/
in particular to avoid triggering the linear handling from
basename-guided rename detection):
#!/bin/bash
n=$1; shift
rm -rf repo
mkdir repo && cd repo
git init -q -b main
mkdata() {
mkdir $1
for i in `seq 1 $2`; do
(sed "s/^/$i /" <../sample
echo tag: $1
) >$1/$i
done
}
diffcore-rename: treat a rename_limit of 0 as unlimited
In commit 89973554b52c (diffcore-rename: make diff-tree -l0 mean
-l<large>, 2017-11-29), -l0 was given a special magical "large" value,
but one which was not large enough for some uses (as can be seen from
commit 9f7e4bfa3b6d (diff: remove silent clamp of renameLimit,
2017-11-13). Make 0 (or a negative value) be treated as unlimited
instead and update the documentation to mention this.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection. The following two commits each
introduced an exception to this:
9027f53cb505 (Do linear-time/space rename logic for exact renames,
2007-10-25) bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based
on basenames, 2021-02-14)
(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection. So, for copy detection, the documentation was
closer to correct.)
Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm. While at it, also note the
default values for all these settings.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff: correct warning message when renameLimit exceeded
The warning when quadratic rename detection was skipped referred to
"inexact rename detection". For years, the only linear portion of
rename detection was looking for exact renames, so "inexact rename
detection" was an accurate way to refer to the quadratic portion of
rename detection. However, that changed with commit bd24aa2f97a0
(diffcore-rename: guide inexact rename detection based on basenames,
2021-02-14). Let's instead use the term "exhaustive rename detection"
to refer to the quadratic portion.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stephen Manz [Thu, 15 Jul 2021 02:32:30 +0000 (02:32 +0000)]
worktree: teach `add` to accept --reason <string> with --lock
The default reason stored in the lock file, "added with --lock",
is unlikely to be what the user would have given in a separate
`git worktree lock` command. Allowing `--reason` to be specified
along with `--lock` when adding a working tree gives the user control
over the reason for locking without needing a second command.
Signed-off-by: Stephen Manz <smanz@alum.mit.edu> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci(check-whitespace): restrict to the intended commits
During a run of the `check-whitespace` we want to verify that the
commits introduced in the Pull Request have no whitespace issues. We
only want to look at those commits, not the upstream commits (because
the contributor cannot do anything about the latter).
However, by using the `-<count>` form in `git log --check`, we run the
risk of looking at the wrong commits. The reason is that the
`actions/checkout` step does _not_ check out the tip commit of the Pull
Request's branch: Instead, it checks out a merge commit that merges that
branch into the target branch. For that reason, we already adjust the
commit count by incrementing it, but that is not enough: if the upstream
branch has newer commits, they are traversed _first_. And obviously we
will then miss some of the commits that we _actually_ wanted to look at.
Therefore, let's be careful to stop assuming a linear, up to date commit
topology in the contributed commits, and instead specify the correct
commit range.
Unfortunately, this means that we no longer can rely on a shallow clone:
There is no way of knowing just how many commits the upstream branch
advanced after the commit from which the PR branch branched off. So
let's just go with a full clone instead, and be safe rather than sorry
(if we have "too shallow" a situation, a commit range `@{u}..` may very
well include a shallow commit itself, and the output of `git show
--check <shallow>` is _not_ pretty).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci(check-whitespace): stop requiring a read/write token
As part of some recent security tightening, GitHub introduced the
ability to configure GitHub workflows to be run with a read-only token.
This is much more secure, in particular when working in a public
repository: While the regular read/write token might be restricted to
writing to the current branch, it is not necessarily restricted to
access only the current Pull Request.
However, the `check-whitespace` workflow threw a wrench into this plan:
it _requires_ write access (because it wants to add a PR comment in case
of a whitespace issue).
Let's just skip that PR comment. The user can always click through to
the actual error, even if it is slightly less convenient.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>