Junio C Hamano [Mon, 20 Sep 2021 22:20:42 +0000 (15:20 -0700)]
Merge branch 'ab/unbundle-progress'
Add progress display to "git bundle unbundle".
* ab/unbundle-progress:
bundle: show progress on "unbundle"
index-pack: add --progress-title option
bundle API: change "flags" to be "extra_index_pack_args"
bundle API: start writing API documentation
Junio C Hamano [Mon, 20 Sep 2021 22:20:42 +0000 (15:20 -0700)]
Merge branch 'tb/pack-finalize-ordering'
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up. This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.
* tb/pack-finalize-ordering:
pack-objects: rename .idx files into place after .bitmap files
pack-write: split up finish_tmp_packfile() function
builtin/index-pack.c: move `.idx` files into place last
index-pack: refactor renaming in final()
builtin/repack.c: move `.idx` files into place last
pack-write.c: rename `.idx` files after `*.rev`
pack-write: refactor renaming in finish_tmp_packfile()
bulk-checkin.c: store checksum directly
pack.h: line-wrap the definition of finish_tmp_packfile()
Junio C Hamano [Mon, 20 Sep 2021 22:20:41 +0000 (15:20 -0700)]
Merge branch 'ab/progress-users-adjust-counters'
The code to show progress indicator in a few code paths did not
cover between 0-100%, which has been corrected.
* ab/progress-users-adjust-counters:
entry: show finer-grained counter in "Filtering content" progress line
commit-graph: fix bogus counter in "Scanning merged commits" progress line
Junio C Hamano [Mon, 20 Sep 2021 22:20:41 +0000 (15:20 -0700)]
Merge branch 'dt/submodule-diff-fixes'
"git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.
* dt/submodule-diff-fixes:
diff --submodule=diff: don't print failure message twice
diff --submodule=diff: do not fail on ever-initialied deleted submodules
t4060: remove unused variable
Junio C Hamano [Mon, 20 Sep 2021 22:20:40 +0000 (15:20 -0700)]
Merge branch 'lh/systemd-timers'
"git maintenance" scheduler learned to use systemd timers as a
possible backend.
* lh/systemd-timers:
maintenance: add support for systemd timers on Linux
maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
cache.h: Introduce a generic "xdg_config_home_for(…)" function
Junio C Hamano [Mon, 20 Sep 2021 22:20:40 +0000 (15:20 -0700)]
Merge branch 'ab/tr2-leaks-and-fixes'
The tracing of process ancestry information has been enhanced.
* ab/tr2-leaks-and-fixes:
tr2: log N parent process names on Linux
tr2: do compiler enum check in trace2_collect_process_info()
tr2: leave the parent list empty upon failure & don't leak memory
tr2: stop leaking "thread_name" memory
tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
tr2: remove NEEDSWORK comment for "non-procfs" implementations
The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).
* jt/grep-wo-submodule-odb-as-alternate:
t7814: show lack of alternate ODB-adding
submodule-config: pass repo upon blob config read
grep: add repository to OID grep sources
grep: allocate subrepos on heap
grep: read submodule entry with explicit repo
grep: typesafe versions of grep_source_init
grep: use submodule-ODB-as-alternate lazy-addition
submodule: lazily add submodule ODBs as alternates
Junio C Hamano [Mon, 20 Sep 2021 22:20:39 +0000 (15:20 -0700)]
Merge branch 'tb/multi-pack-bitmaps'
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.
* tb/multi-pack-bitmaps: (29 commits)
pack-bitmap: drop bitmap_index argument from try_partial_reuse()
pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
p5326: perf tests for MIDX bitmaps
p5310: extract full and partial bitmap tests
midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
t7700: update to work with MIDX bitmap test knob
t5319: don't write MIDX bitmaps in t5319
t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
t5326: test multi-pack bitmap behavior
t/helper/test-read-midx.c: add --checksum mode
t5310: move some tests to lib-bitmap.sh
pack-bitmap: write multi-pack bitmaps
pack-bitmap: read multi-pack bitmaps
pack-bitmap.c: avoid redundant calls to try_partial_reuse
pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
pack-bitmap.c: introduce 'bitmap_num_objects()'
midx: avoid opening multiple MIDXs when writing
midx: close linked MIDXs, avoid leaking memory
...
Junio C Hamano [Mon, 20 Sep 2021 22:20:39 +0000 (15:20 -0700)]
Merge branch 'ps/fetch-optim'
Optimize code that handles large number of refs in the "git fetch"
code path.
* ps/fetch-optim:
fetch: avoid second connectivity check if we already have all objects
fetch: merge fetching and consuming refs
fetch: refactor fetch refs to be more extendable
fetch-pack: optimize loading of refs via commit graph
connected: refactor iterator to return next object ID directly
fetch: avoid unpacking headers in object existence check
fetch: speed up lookup of want refs via commit-graph
Junio C Hamano [Wed, 15 Sep 2021 20:15:26 +0000 (13:15 -0700)]
Merge branch 'pb/test-use-user-env'
Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.
* pb/test-use-user-env:
test-lib-functions: keep user's debugger config files and TERM in 'debug'
test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
Junio C Hamano [Wed, 15 Sep 2021 20:15:26 +0000 (13:15 -0700)]
Merge branch 'jc/trivial-threeway-binary-merge'
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.
* jc/trivial-threeway-binary-merge:
apply: resolve trivial merge without hitting ll-merge with "--3way"
Junio C Hamano [Fri, 10 Sep 2021 18:46:29 +0000 (11:46 -0700)]
Merge branch 'ab/retire-advice-config'
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.
* ab/retire-advice-config:
advice: move advice.graftFileDeprecated squashing to commit.[ch]
advice: remove use of global advice_add_embedded_repo
advice: remove read uses of most global `advice_` variables
advice: add enum variants for missing advice variables
Junio C Hamano [Fri, 10 Sep 2021 18:46:29 +0000 (11:46 -0700)]
Merge branch 'mk/clone-recurse-submodules'
After "git clone --recurse-submodules", all submodules are cloned
but they are not by default recursed into by other commands. With
submodule.stickyRecursiveClone configuration set, submodule.recurse
configuration is set to true in a repository created by "clone"
with "--recurse-submodules" option.
* mk/clone-recurse-submodules:
clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabled
Junio C Hamano [Fri, 10 Sep 2021 18:46:25 +0000 (11:46 -0700)]
Merge branch 'ab/commit-graph-usage'
Fixes on usage message from "git commit-graph".
* ab/commit-graph-usage:
commit-graph: show "unexpected subcommand" error
commit-graph: show usage on "commit-graph [write|verify] garbage"
commit-graph: early exit to "usage" on !argc
multi-pack-index: refactor "goto usage" pattern
commit-graph: use parse_options_concat()
commit-graph: remove redundant handling of -h
commit-graph: define common usage with a macro
Junio C Hamano [Fri, 10 Sep 2021 18:46:25 +0000 (11:46 -0700)]
Merge branch 'mh/send-email-reset-in-reply-to'
Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.
Junio C Hamano [Fri, 10 Sep 2021 18:46:23 +0000 (11:46 -0700)]
Merge branch 'sg/set-ceiling-during-tests'
Buggy tests could damage repositories outside the throw-away test
area we created. We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.
* sg/set-ceiling-during-tests:
test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
"git rebase" by default skips changes that are equivalent to
commits that are already in the history the branch is rebased onto;
give messages when this happens to let the users be aware of
skipped commits, and also teach them how to tell "rebase" to keep
duplicated changes.
* js/advise-when-skipping-cherry-picked:
sequencer: advise if skipping cherry-picked commit
pack-objects: rename .idx files into place after .bitmap files
In preceding commits the race of renaming .idx files in place before
.rev files and other auxiliary files was fixed in pack-write.c's
finish_tmp_packfile(), builtin/repack.c's "struct exts", and
builtin/index-pack.c's final(). As noted in the change to pack-write.c
we left in place the issue of writing *.bitmap files after the *.idx,
let's fix that issue.
See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21)
for commentary at the time when *.bitmap was implemented about how
those files are written out, nothing in that commit contradicts what's
being done here.
Note that this commit and preceding ones only close any race condition
with *.idx files being written before their auxiliary files if we're
optimistic about our lack of fsync()-ing in this are not tripping us
over. See the thread at [1] for a rabbit hole of various discussions
about filesystem races in the face of doing and not doing fsync() (and
if doing fsync(), not doing it properly).
We may want to fsync the containing directory once after renaming the
*.idx file into place, but that is outside the scope of this series.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-write: split up finish_tmp_packfile() function
Split up the finish_tmp_packfile() function and use the split-up version
in pack-objects.c in preparation for moving the step of renaming the
*.idx file later as part of a function change.
Since the only other caller of finish_tmp_packfile() was in
bulk-checkin.c, and it won't be needing a change to its *.idx renaming,
provide a thin wrapper for the old function as a static function in that
file. If other callers end up needing the simpler version it could be
moved back to "pack-write.c" and "pack.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 9 Sep 2021 23:24:53 +0000 (19:24 -0400)]
builtin/index-pack.c: move `.idx` files into place last
In a similar spirit as preceding patches to `git repack` and `git
pack-objects`, fix the identical problem in `git index-pack`.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the renaming in final() into a helper function, this is
similar in spirit to a preceding refactoring of finish_tmp_packfile()
in pack-write.c.
Before e37d0b8730b (builtin/index-pack.c: write reverse indexes,
2021-01-25) it probably wasn't worth it to have this sort of helper,
due to the differing "else if" case for "pack" files v.s. "idx" files.
But since we've got "rev" as well now, let's do the renaming via a
helper, this is both a net decrease in lines, and improves the
readability, since we can easily see at a glance that the logic for
writing these three types of files is exactly the same, aside from the
obviously differing cases of "*final_name" being NULL, and
"make_read_only_if_same" being different.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 9 Sep 2021 23:24:44 +0000 (19:24 -0400)]
builtin/repack.c: move `.idx` files into place last
In a similar spirit as the previous patch, fix the identical problem
from `git repack` (which invokes `pack-objects` with a temporary
location for output, and then moves the files into their final locations
itself).
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 9 Sep 2021 23:24:41 +0000 (19:24 -0400)]
pack-write.c: rename `.idx` files after `*.rev`
We treat the presence of an `.idx` file as the indicator of whether or
not it's safe to use a packfile. But `finish_tmp_packfile()` (which is
responsible for writing the pack and moving the temporary versions of
all of its auxiliary files into place) is inconsistent about the write
order.
Specifically, it moves the `.rev` file into place after the `.idx`,
leaving open the possibility to open a pack which looks "ready" (because
the `.idx` file exists and is readable) but appears momentarily to not
have a `.rev` file. This causes Git to fall back to generating the
pack's reverse index in memory.
Though racy, this amounts to an unnecessary slow-down at worst, and
doesn't affect the correctness of the resulting reverse index.
Close this race by moving the .rev file into place before moving the
.idx file into place.
This still leaves the issue of `.idx` files being renamed into place
before the auxiliary `.bitmap` file is renamed when in pack-object.c's
write_pack_file() "write_bitmap_index" is true. That race will be
addressed in subsequent commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-write: refactor renaming in finish_tmp_packfile()
Refactor the renaming in finish_tmp_packfile() into a helper function.
The callers are now expected to pass a "name_buffer" ending in
"pack-OID." instead of the previous "pack-", we then append "pack",
"idx" or "rev" to it.
By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the
buffer and avoid the repeated allocations we'd get if that function had
its own temporary "struct strbuf".
This approach of reusing the buffer does make the last user in
pack-object.c's write_pack_file() slightly awkward, since we needlessly
do a strbuf_setlen() before calling strbuf_release() for consistency. In
subsequent changes we'll move that bitmap writing code around, so let's
not skip the strbuf_setlen() now.
The previous strbuf_reset() idiom originated with 5889271114a
(finish_tmp_packfile():use strbuf for pathname construction,
2014-03-03), which in turn was a minimal adjustment of pre-strbuf code
added in 0e990530ae (finish_tmp_packfile(): a helper function,
2011-10-28).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 9 Sep 2021 23:24:32 +0000 (19:24 -0400)]
bulk-checkin.c: store checksum directly
finish_bulk_checkin() stores the checksum from finalize_hashfile() by
writing to the `hash` member of `struct object_id`, but that hash has
nothing to do with an object id (it's just a convenient location that
happens to be sized correctly).
Store the hash directly in an unsigned char array. This behaves the same
as writing to the `hash` member, but makes the intent clearer (and
avoids allocating an extra four bytes for the `algo` member of `struct
object_id`). It likewise prevents the possibility of a segfault when
reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Sep 2021 19:57:21 +0000 (15:57 -0400)]
pack-bitmap: drop bitmap_index argument from try_partial_reuse()
Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps,
2021-08-31), we no longer look at the "struct bitmap_index" passed to
try_partial_reuse(). This is because we only handle verbatim reuse from
a single pack: either the pack whose bitmap we're looking at, or the
"preferred" pack of a midx bitmap. And thus the primary item we look at
is the "pack" parameter added by that same commit, and not the
bitmap_git->pack parameter (which would be NULL for a midx bitmap). It's
our caller, reuse_partial_packfile_from_bitmap(), which decides which
pack to use and passes it in to us.
Drop the unused parameter to prevent confusion.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 9 Sep 2021 19:56:58 +0000 (15:56 -0400)]
pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
We never look at the repository argument which is passed. This makes
sense, since the multi_pack_index struct already tells us everything we
need to access the files in its associated object directory.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack.h: line-wrap the definition of finish_tmp_packfile()
Line-wrap the definition of finish_tmp_packfile() to make subsequent
changes easier to read. See 0e990530ae (finish_tmp_packfile(): a
helper function, 2011-10-28) for the commit that introduced this
overly long line.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Thu, 9 Sep 2021 01:10:12 +0000 (03:10 +0200)]
entry: show finer-grained counter in "Filtering content" progress line
The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop. It works basically like this:
- The work done by the last filter (or the only filter if there is
only one) is never counted, so if the last filter still has some
paths to process, then the counter shown in the "done" progress
line will not match the expected total.
The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
mode[1] helps spot this issue. Under it the 'missing file in
delayed checkout' and 'invalid file in delayed checkout' tests in
't0021-conversion.sh' fail, because both use only one
filter. (The test 'delayed checkout in process filter' uses two
filters but the first one does all the work, so that test already
happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)
- The progress counter is updated only once per filter, not once per
processed path, so if a filter has a lot of paths to process, then
the counter might stay unchanged for a long while and then make a
big jump (though the user still gets a sense of progress, because
we call display_throughput() after each processed path to show the
amount of processed data).
Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.
It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all, see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.
Let's also initialize the *progress to "NULL" while we're at it. Since 7a132c628e5 (checkout: make delayed checkout respect --quiet and
--no-progress, 2021-08-26) we have had progress conditional on
"show_progress", usually we use the idiom of a "NULL" initialization
of the "*progress", rather than the more verbose ternary added in 7a132c628e5.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Thu, 9 Sep 2021 01:10:11 +0000 (03:10 +0200)]
commit-graph: fix bogus counter in "Scanning merged commits" progress line
The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:
Scanning merged commits: 83% (5/6), done.
This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.
There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 8 Sep 2021 20:30:33 +0000 (13:30 -0700)]
Merge branch 'cb/makefile-apple-clang'
Build update for Apple clang.
* cb/makefile-apple-clang:
build: catch clang that identifies itself as "$VENDOR clang"
build: clang version may not be followed by extra words
build: update detect-compiler for newer Xcode version
Junio C Hamano [Wed, 8 Sep 2021 20:30:29 +0000 (13:30 -0700)]
Merge branch 'js/maintenance-launchctl-fix'
"git maintenance" scheduler fix for macOS.
* js/maintenance-launchctl-fix:
maintenance: skip bootout/bootstrap when plist is registered
maintenance: create `launchctl` configuration using a lock file
Jonathan Tan [Mon, 16 Aug 2021 21:09:58 +0000 (14:09 -0700)]
t7814: show lack of alternate ODB-adding
The previous patches have made "git grep" no longer need to add
submodule ODBs as alternates, at least for the code paths tested in
t7814. Demonstrate this by making adding a submodule ODB as an alternate
fatal in this test.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:57 +0000 (14:09 -0700)]
submodule-config: pass repo upon blob config read
When reading the config of a submodule, if reading from a blob, read
using an explicitly specified repository instead of by adding the
submodule's ODB as an alternate and then reading an object from
the_repository.
This makes the "grep --recurse-submodules with submodules without
.gitmodules in the working tree" test in t7814 work when
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB is true.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:56 +0000 (14:09 -0700)]
grep: add repository to OID grep sources
Record the repository whenever an OID grep source is created, and teach
the worker threads to explicitly provide the repository when accessing
objects.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:55 +0000 (14:09 -0700)]
grep: allocate subrepos on heap
Currently, struct repository objects corresponding to submodules are
allocated on the stack in grep_submodule(). This currently works because
they will not be used once grep_submodule() exits, but a subsequent
patch will require these structs to be accessible for longer (perhaps
even in another thread). Allocate them on the heap and clear them only
at the very end.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:54 +0000 (14:09 -0700)]
grep: read submodule entry with explicit repo
Replace an existing parse_object_or_die() call (which implicitly works
on the_repository) with a function call that allows a repository to be
passed in. There is no such direct equivalent to parse_object_or_die(),
but we only need the type of the object, so replace with
oid_object_info().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:53 +0000 (14:09 -0700)]
grep: typesafe versions of grep_source_init
grep_source_init() can create "struct grep_source" objects and,
depending on the value of the type passed, some void-pointer parameters have
different meanings. Because one of these types (GREP_SOURCE_OID) will
require an additional parameter in a subsequent patch, take the
opportunity to increase clarity and type safety by replacing this
function with individual functions for each type.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:52 +0000 (14:09 -0700)]
grep: use submodule-ODB-as-alternate lazy-addition
In the parent commit, Git was taught to add submodule ODBs as alternates
lazily, but grep does not use this because it computes the path to add
directly, not going through add_submodule_odb(). Add an equivalent to
add_submodule_odb() that takes the exact ODB path and teach grep to use
it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Mon, 16 Aug 2021 21:09:51 +0000 (14:09 -0700)]
submodule: lazily add submodule ODBs as alternates
Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.
This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.
[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]
This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-write: skip *.rev work when not writing *.rev
Fix a performance regression introduced in a587b5a786 (pack-write.c:
extract 'write_rev_file_order', 2021-03-30) and stop needlessly
allocating the "pack_order" array and sorting it with
"pack_order_cmp()", only to throw that work away when we discover that
we're not writing *.rev files after all.
This redundant work was not present in the original version of this
code added in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev'
files, 2021-01-25). There we'd call write_rev_file() from
e.g. finish_tmp_packfile(), but we'd "return NULL" early in
write_rev_file() if not doing a "WRITE_REV" or "WRITE_REV_VERIFY".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 7 Sep 2021 22:10:22 +0000 (15:10 -0700)]
hash-object: prefix_filename() returns allocated memory these days
Back when a1be47e4 (hash-object: fix buffer reuse with --path in a
subdirectory, 2017-03-20) was written, the prefix_filename() helper
used a static piece of memory to the caller, making the caller
responsible for copying it, if it wants to keep it across another
call to the same function. Two callers of the prefix_filename() in
hash-object were made to xstrdup() the value obtained from it.
But in the same series, when e4da43b1 (prefix_filename: return newly
allocated string, 2017-03-20) changed the rule to gave the caller
possession of the memory, we forgot to revert one of the xstrdup()
changes, allowing the returned value to leak.
This script was added in f28ac70f48 (Move all dashed-form commands to
libexecdir, 2007-11-28) when commands such as "git-add" lived in the
bin directory, instead of the git exec directory.
This notice helped someone incorrectly installing version v1.6.0 and
later into a directory built for a pre-v1.6.0 git version.
We're now long past the point where anyone who'd be helped by this
warning is likely to be doing that, so let's just remove this check
and warning to simplify the Makefile.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: fix a "first config key wins" regression in v2.33.0
Fix a regression in my c95e3a3f0b8 (send-email: move trivial config
handling to Perl, 2021-05-28) where we'd pick the first config key out
of multiple defined ones, instead of using the normal "last key wins"
semantics of "git config --get".
This broke e.g. cases where a .git/config would have a different
sendemail.smtpServer than ~/.gitconfig. We'd pick the ~/.gitconfig
over .git/config, instead of preferring the repository-local
version. The same would go for /etc/gitconfig etc.
The full list of impacted config keys (the %config_settings values
which are references to scalars, not arrays) is:
I.e. having any of these set in say ~/.gitconfig and in-repo
.git/config regressed in v2.33.0 to prefer the --global one over the
--local.
To test this add a test of config priority to one of these config
variables, most don't have tests at all, but there was an existing one
for sendemail.8bitEncoding.
The "git config" (instead of "test_config") is somewhat of an
anti-pattern, but follows established conventions in
t9001-send-email.sh, likewise with any other pattern or idiom in this
test.
The populating of home/.gitconfig and setting of HOME= is copied from
a test in t0017-env-helper.sh added in 1ff750b128e (tests: make
GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21). This test fails
without this bugfix, but now it works.
Reported-by: Eli Schwartz <eschwartz@archlinux.org> Tested-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 4 Sep 2021 07:50:58 +0000 (09:50 +0200)]
range-diff: avoid segfault with -I
output() reuses the same struct diff_options for multiple calls of
diff_flush(). Set the option no_free to instruct it to keep the
ignore regexes between calls and release them explicitly at the end.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes 19b2517f (diff-merges: move specific diff-index "-m"
handling to diff-index, 2021-05-21).
That commit disabled handling of all diff for merges options in
diff-index on an assumption that they are unused. However, it later
appeared that -c and --cc, even though undocumented and not being
covered by tests, happen to have had particular effect on diff-index
output.
Restore original -c/--cc options handling by diff-index.
Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started
logging parent process names, but only logged all parents on Windows.
on Linux only the name of the immediate parent process was logged.
Extend the functionality added there to also log full parent chain on
Linux.
This requires us to lookup "/proc/<getppid()>/stat" instead of
"/proc/<getppid()>/comm". The "comm" file just contains the name of the
process, but the "stat" file has both that information, and the parent
PID of that process, see procfs(5). We parse out the parent PID of our
own parent, and recursively walk the chain of "/proc/*/stat" files all
the way up the chain. A parent PID of 0 indicates the end of the
chain.
It's possible given the semantics of Linux's PID files that we end up
getting an entirely nonsensical chain of processes. It could happen if
e.g. we have a chain of processes like:
1 (init) => 321 (bash) => 123 (git)
Let's assume that "bash" was started a while ago, and that as shown
the OS has already cycled back to using a lower PID for us than our
parent process. In the time it takes us to start up and get to
trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
process might exit, and be replaced by an entirely different process!
We'd racily look up our own getppid(), but in the meantime our parent
would exit, and Linux would have cycled all the way back to starting
an entirely unrelated process as PID 321.
If that happens we'll just silently log incorrect data in our ancestry
chain. Luckily we don't need to worry about this except in this
specific cycling scenario, as Linux does not have PID
randomization. It appears it once did through a third-party feature,
but that it was removed around 2006[1]. For anyone worried about this
edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
it, but not eliminate it.
One thing we don't need to worry about is getting into an infinite
loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect
Windows-specific process information, 2019-02-22) for the related
Windows code that needs to deal with that, and [2] for an explanation
of that edge case.
Aside from potential race conditions it's also a bit painful to
correctly parse the process name out of "/proc/*/stat". A simpler
approach is to use fscanf(), see [3] for an implementation of that,
but as noted in the comment being added here it would fail in the face
of some weird process names, so we need our own parse_proc_stat() to
parse it out.
With this patch the "ancestry" chain for a trace2 event might look
like this:
And in the case of naughty process names like the following. This uses
perl's ability to use prctl(PR_SET_NAME, ...). See
Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
assignment to $0 on Linux, 2010-04-15)[4]:
tr2: do compiler enum check in trace2_collect_process_info()
Change code added in 2f732bf15e6 (tr2: log parent process name,
2021-07-21) to use a switch statement without a "default" branch to
have the compiler error if this code ever drifts out of sync with the
members of the "enum trace2_process_info_reason".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tr2: leave the parent list empty upon failure & don't leak memory
In a subsequent commit I'll be replacing most of this code to log N
parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21).
It was using the strbuf_read_file() in the wrong way, its return value
is either a length or a negative value on error. If we didn't have a
procfs, or otherwise couldn't access it we'd end up pushing an empty
string to the trace2 ancestry array.
It was also using the strvec_push() API the wrong way. That API always
does an xstrdup(), so by detaching the strbuf here we'd leak
memory. Let's instead pass in our pointer for strvec_push() to
xstrdup(), and then free our own strbuf. I do have some WIP changes to
make strvec_push_nodup() non-static, which makes this and some other
callsites nicer, but let's just follow the prevailing pattern of using
strvec_push() for now.
We'll also need to free that "procfs_path" strbuf whether or not
strbuf_read_file() succeeds, which was another source of memory leaks
in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a
system where we could read the file from procfs.
Let's move all the freeing of the memory to the end of the
function. If we're still at STRBUF_INIT with "name" due to not having
taken the branch where the strbuf_read_file() succeeds freeing it is
redundant. So we could move it into the body of the "if", but just
handling freeing the same way for all branches of the function makes
it more readable.
In combination with the preceding commit this makes all of
t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in ee4512ed481 (trace2: create new
combined trace facility, 2019-02-22), we were doing a free() of other
memory allocated in tr2tls_create_self(), but not the "thread_name"
"struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name,
2021-07-21) to describe what we might do under
TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring
to "something extra".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>