Elijah Newren [Sun, 24 Jan 2021 06:01:12 +0000 (22:01 -0800)]
merge-ort: begin performance work; instrument with trace2_region_* calls
Add some timing instrumentation for both merge-ort and diffcore-rename;
I used these to measure and optimize performance in both, and several
future patch series will build on these to reduce the timings of some
select testcases.
=== Setup ===
The primary testcase I used involved rebasing a random topic in the
linux kernel (consisting of 35 patches) against an older version. I
added two variants, one where I rename a toplevel directory, and another
where I only rebase one patch instead of the whole topic. The setup is
as follows:
Now with REBASE standing for either "git rebase [--merge]" (using
merge-recursive) or "test-tool fast-rebase" (using merge-ort), the
testcases are:
Testcase #1: no-renames
$ git checkout v5.4^0
$ REBASE --onto HEAD base hwmon-updates
Note: technically the name is misleading; there are some renames, but
very few. Rename detection only takes about half the overall time.
Testcase #2: mega-renames
$ git checkout 5.4-renames^0
$ REBASE --onto HEAD base hwmon-updates
Testcase #3: just-one-mega
$ git checkout 5.4-renames^0
$ REBASE --onto HEAD base hwmon-just-one
=== Timing results ===
Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames,
10 runs for the other two cases):
merge-recursive merge-ort
no-renames: 18.912 s ± 0.174 s 14.263 s ± 0.053 s
mega-renames: 5964.031 s ± 10.459 s 5504.231 s ± 5.150 s
just-one-mega: 149.583 s ± 0.751 s 158.534 s ± 0.498 s
A single re-run of each with some breakdowns:
--- no-renames ---
merge-recursive merge-ort
overall runtime: 19.302 s 14.257 s
inexact rename detection: 7.603 s 7.906 s
everything else: 11.699 s 6.351 s
--- mega-renames ---
merge-recursive merge-ort
overall runtime: 5950.195 s 5499.672 s
inexact rename detection: 5746.309 s 5487.120 s
everything else: 203.886 s 17.552 s
--- just-one-mega ---
merge-recursive merge-ort
overall runtime: 151.001 s 158.582 s
inexact rename detection: 143.448 s 157.835 s
everything else: 7.553 s 0.747 s
=== Timing observations ===
0) Maximum speedup
The "everything else" row represents the maximum speedup we could
achieve if we were to somehow infinitely parallelize inexact rename
detection, but leave everything else alone. The fact that this is so
much smaller than the real runtime (even in the case with virtually no
renames) makes it clear just how overwhelmingly large the time spent on
rename detection can be.
1) no-renames
1a) merge-ort is faster than merge-recursive, which is nice. However,
this still should not be considered good enough. Although the "merge"
backend to rebase (merge-recursive) is sometimes faster than the "apply"
backend, this is one of those cases where it is not. In fact, even
merge-ort is slower. The "apply" backend can complete this testcase in
6.940 s ± 0.485 s
which is about 2x faster than merge-ort and 3x faster than
merge-recursive. One goal of the merge-ort performance work will be to
make it faster than git-am on this (and similar) testcases.
2) mega-renames
2a) Obviously rename detection is a huge cost; it's where most the time
is spent. We need to cut that down. If we could somehow infinitely
parallelize it and drive its time to 0, the merge-recursive time would
drop to about 204s, and the merge-ort time would drop to about 17s. I
think this particular stat shows I've subtly baked a couple performance
improvements into merge-ort and into fast-rebase already.
3) just-one-mega
3a) not much to say here, it just gives some flavor for how rebasing
only one patch compares to rebasing 35.
=== Goals ===
This patch is obviously just the beginning. Here are some of my goals
that this measurement will help us achieve:
* Drive the cost of rename detection down considerably for merges
* After the above has been achieved, see if there are other slowness
factors (which would have previously been overshadowed by rename
detection costs) which we can then focus on and also optimize.
* Ensure our rebase testcase that requires little rename detection
is noticeably faster with merge-ort than with apply-based rebase.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Taylor Blau <ttaylorr@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Sun, 24 Jan 2021 06:01:11 +0000 (22:01 -0800)]
merge-ort: ignore the directory rename split conflict for now
get_provisional_directory_renames() has code to detect directories being
evenly split between different locations. However, as noted previously,
if there are no new files added to that directory that was split evenly,
our inability to determine where the directory was renamed to doesn't
matter since there are no new files to try to move into the new
location. Unfortunately, that code is unaware of whether there are new
files under the directory in question and we just ignore that, causing
us to fail t6423 test 2b but pass test 2a; turn off the error for now,
swapping which tests pass and fail.
The motivating reason for switching this off as a temporary measure is
that as we add optimizations, we'll start looking at only subsets of
renames, and subsets of renames can start switching the result we get
when this error is (wrongly) on. Once we get enough optimizations,
however, we can prevent that code from even running when there are no
new files added to the relevant directory, at which point we can revert
this commit and then both testcases 2a and 2b will pass simultaneously.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Sun, 24 Jan 2021 06:01:10 +0000 (22:01 -0800)]
merge-ort: fix massive leak
When a series of merges was performed (such as for a rebase or series of
cherry-picks), only the data structures allocated by the final merge
operation were being freed. The problem was that while picking out
pieces of merge-ort to upstream, I previously misread a certain section
of merge_start() and assumed it was associated with a later
optimization. Include that section now, which ensures that if there was
a previous merge operation, that we clear out result->priv and then
re-use it for opt->priv, and otherwise we allocate opt->priv.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 21 Jan 2021 06:52:50 +0000 (22:52 -0800)]
Merge branch 'en/ort-directory-rename' into en/merge-ort-perf
* en/ort-directory-rename: (28 commits)
merge-ort: fix a directory rename detection bug
merge-ort: process_renames() now needs more defensiveness
merge-ort: implement apply_directory_rename_modifications()
merge-ort: add a new toplevel_dir field
merge-ort: implement handle_path_level_conflicts()
merge-ort: implement check_for_directory_rename()
merge-ort: implement apply_dir_rename() and check_dir_renamed()
merge-ort: implement compute_collisions()
merge-ort: modify collect_renames() for directory rename handling
merge-ort: implement handle_directory_level_conflicts()
merge-ort: implement compute_rename_counts()
merge-ort: copy get_renamed_dir_portion() from merge-recursive.c
merge-ort: add outline of get_provisional_directory_renames()
merge-ort: add outline for computing directory renames
merge-ort: collect which directories are removed in dirs_removed
merge-ort: initialize and free new directory rename data structures
merge-ort: add new data structures for directory rename detection
merge-ort: add implementation of type-changed rename handling
merge-ort: add implementation of normal rename handling
merge-ort: add implementation of rename collisions
...
Elijah Newren [Tue, 19 Jan 2021 19:53:53 +0000 (19:53 +0000)]
merge-ort: fix a directory rename detection bug
As noted in commit 902c521a35 ("t6423: more involved directory rename
test", 2020-10-15), when we have a case where
* dir/subdir/ has several files
* almost all files in dir/subdir/ are renamed to folder/subdir/
* one of the files in dir/subdir/ is renamed to folder/subdir/newsubdir/
* the other side of history (that doesn't do the renames) adds a
new file to dir/subdir/
Then for the majority of the file renames, the directory rename of
dir/subdir/ -> folder/subdir/
is actually not represented that way but as
dir/ -> folder/
We also had one rename that was represented as
dir/subdir/ -> folder/subdir/newsubdir/
Now, since there's a new file in dir/subdir/, where does it go? Well,
there's only one rule for dir/subdir/, so the code previously noted that
this rule had the "majority" of the one "relevant" rename and thus
erroneously used it to place the file in folder/subdir/newsubdir/. We
really want the heavy weight associated with dir/ -> folder/ to also be
treated as dir/subdir/ -> folder/subdir/, so that we correctly place the
file in folder/subdir/.
Add a bunch of logic to make sure that we use all relevant renamings in
directory rename detection.
Note that testcase 12f of t6423 still fails after this, but it gets
further than merge-recursive does. There are some performance related
bits in that testcase (the region_enter messages) that do not yet
succeed, but the rest of the testcase works after this patch.
Subsequent patch series will fix up the performance side.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:52 +0000 (19:53 +0000)]
merge-ort: process_renames() now needs more defensiveness
Since directory rename detection adds new paths to opt->priv->paths and
removes old ones, process_renames() needs to now check whether
pair->one->path actually exists in opt->priv->paths instead of just
assuming it does.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function roughly follows the same outline as the function of the
same name from merge-recursive.c, but the code diverges in multiple
ways due to some special considerations:
* merge-ort's version needs to update opt->priv->paths with any new
paths (and opt->priv->paths points to struct conflict_infos which
track quite a bit of metadata for each path); merge-recursive's
version would directly update the index
* merge-ort requires that opt->priv->paths has any leading directories
of any relevant files also be included in the set of paths. And
due to pointer equality requirements on merged_info.directory_name,
we have to be careful how we compute and insert these.
* due to the above requirements on opt->priv->paths, merge-ort's
version starts with a long comment to explain all the special
considerations that need to be handled
* merge-ort can use the full data stored in opt->priv->paths to avoid
making expensive get_tree_entry() calls to regather the necessary
data.
* due to messages being deferred automatically in merge-ort, this is
the best place to handle conflict messages whereas in
merge-recursive.c they are deferred manually so that processing of
entries does all the printing
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:50 +0000 (19:53 +0000)]
merge-ort: add a new toplevel_dir field
Due to the string-equality-iff-pointer-equality requirements placed on
merged_info.directory_name, apply_directory_rename_modifications() will
need to have access to the exact toplevel directory name string pointer
and can't just use a new empty string. Store it in a field that we can
use.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is copied from merge-recursive.c, with minor tweaks due to:
* using strmap API
* merge-ort not using the non_unique_new_dir field, since it'll
obviate its need entirely later with performance improvements
* adding a new path_in_way() function that uses opt->priv->paths
instead of doing an expensive tree_has_path() lookup to see if
a tree has a given path.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:48 +0000 (19:53 +0000)]
merge-ort: implement check_for_directory_rename()
This is copied from merge-recursive.c, with minor tweaks due to using strmap
API and the fact that it can use opt->priv->paths to get all pathnames that
exist instead of taking a tree object.
This depends on a new function, handle_path_level_conflicts(), which
just has a placeholder die-not-yet-implemented implementation for now; a
subsequent patch will implement it.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:46 +0000 (19:53 +0000)]
merge-ort: implement compute_collisions()
This is nearly a wholesale copy of compute_collisions() from
merge-recursive.c, and the logic remains the same, but it has been
tweaked slightly due to:
* using strmap.h API (instead of direct hashmaps)
* allocation/freeing of data structures were done separately in
merge_start() and clear_or_reinit_internal_opts() in an earlier
patch in this series
* there is no non_unique_new_dir data field in merge-ort; that will
be handled a different way
It does depend on two new functions, apply_dir_rename() and
check_dir_renamed() which were introduced with simple
die-not-yet-implemented shells and will be implemented in subsequent
patches.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:45 +0000 (19:53 +0000)]
merge-ort: modify collect_renames() for directory rename handling
collect_renames() is similar to merge-recursive.c's get_renames(), but
lacks the directory rename handling found in the latter. Port that code
structure over to merge-ort. This introduces three new
die-not-yet-implemented functions that will be defined in future
commits.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is modelled on the version of handle_directory_level_conflicts()
from merge-recursive.c, but is massively simplified due to the following
factors:
* strmap API provides simplifications over using direct hashmap
* we have a dirs_removed field in struct rename_info that we have an
easy way to populate from collect_merge_info(); this was already
used in compute_rename_counts() and thus we do not need to check
for condition #2.
* The removal of condition #2 by handling it earlier in the code also
obviates the need to check for condition #3 -- if both sides renamed
a directory, meaning that the directory no longer exists on either
side, then neither side could have added any new files to that
directory, and thus there are no files whose locations we need to
move due to such a directory rename.
In fact, the same logic that makes condition #3 irrelevant means
condition #1 is also irrelevant so we could drop this function.
However, it is cheap to check if both sides rename the same directory,
and doing so can save future computation. So, simply remove any
directories that both sides renamed from the list of directory renames.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:43 +0000 (19:53 +0000)]
merge-ort: implement compute_rename_counts()
This function is based on the first half of get_directory_renames() from
merge-recursive.c; as part of the implementation, factor out a routine,
increment_count(), to update the bookkeeping to track the number of
items renamed into new directories.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:41 +0000 (19:53 +0000)]
merge-ort: add outline of get_provisional_directory_renames()
This function is based on merge-recursive.c's get_directory_renames(),
except that the first half has been split out into a not-yet-implemented
compute_rename_counts(). The primary difference here is our lack of the
non_unique_new_dir boolean in our strmap. The lack of that field will
at first cause us to fail testcase 2b of t6423; however, future
optimizations will obviate the need for that ugly field so we have just
left it out.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Tue, 19 Jan 2021 19:53:40 +0000 (19:53 +0000)]
merge-ort: add outline for computing directory renames
Port some directory rename handling changes from merge-recursive.c's
detect_and_process_renames() to the same-named function of merge-ort.c.
This does not yet add any use or handling of directory renames, just the
outline for where we start to compute them. Thus, a future patch will
add port additional changes to merge-ort's detect_and_process_renames().
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 14 Jan 2021 20:41:54 +0000 (12:41 -0800)]
Merge branch 'en/ort-conflict-handling' into en/merge-ort-perf
* en/ort-conflict-handling:
merge-ort: add handling for different types of files at same path
merge-ort: copy find_first_merges() implementation from merge-recursive.c
merge-ort: implement format_commit()
merge-ort: copy and adapt merge_submodule() from merge-recursive.c
merge-ort: copy and adapt merge_3way() from merge-recursive.c
merge-ort: flesh out implementation of handle_content_merge()
merge-ort: handle book-keeping around two- and three-way content merge
merge-ort: implement unique_path() helper
merge-ort: handle directory/file conflicts that remain
merge-ort: handle D/F conflict where directory disappears due to merge
Junio C Hamano [Thu, 14 Jan 2021 20:41:45 +0000 (12:41 -0800)]
Merge branch 'en/diffcore-rename' into en/merge-ort-perf
* en/diffcore-rename:
diffcore-rename: remove unnecessary duplicate entry checks
diffcore-rename: accelerate rename_dst setup
diffcore-rename: simplify and accelerate register_rename_src()
t4058: explore duplicate tree entry handling in a bit more detail
t4058: add more tests and documentation for duplicate tree entry handling
diffcore-rename: reduce jumpiness in progress counters
diffcore-rename: simplify limit check
diffcore-rename: avoid usage of global in too_many_rename_candidates()
diffcore-rename: rename num_create to num_destinations
Junio C Hamano [Thu, 7 Jan 2021 07:33:44 +0000 (23:33 -0800)]
Merge branch 'en/merge-ort-recursive'
The ORT merge strategy learned to synthesize virtual ancestor tree
by recursively merging multiple merge bases together, just like the
recursive backend has done for years.
* en/merge-ort-recursive:
merge-ort: implement merge_incore_recursive()
merge-ort: make clear_internal_opts() aware of partial clearing
merge-ort: copy a few small helper functions from merge-recursive.c
commit: move reverse_commit_list() from merge-recursive
Junio C Hamano [Thu, 7 Jan 2021 07:33:44 +0000 (23:33 -0800)]
Merge branch 'fc/pull-merge-rebase'
When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation. Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.
* fc/pull-merge-rebase:
pull: display default warning only when non-ff
pull: correct condition to trigger non-ff advice
pull: get rid of unnecessary global variable
pull: give the advice for choosing rebase/merge much later
pull: refactor fast-forward check
Junio C Hamano [Thu, 7 Jan 2021 07:33:44 +0000 (23:33 -0800)]
Merge branch 'en/merge-ort-2'
More "ORT" merge strategy.
* en/merge-ort-2:
merge-ort: add modify/delete handling and delayed output processing
merge-ort: add die-not-implemented stub handle_content_merge() function
merge-ort: add function grouping comments
merge-ort: add a paths_to_free field to merge_options_internal
merge-ort: add a path_conflict field to merge_options_internal
merge-ort: add a clear_internal_opts helper
merge-ort: add a few includes
Junio C Hamano [Thu, 7 Jan 2021 07:33:43 +0000 (23:33 -0800)]
Merge branch 'en/merge-ort-impl'
The merge backend "done right" starts to emerge.
* en/merge-ort-impl:
merge-ort: free data structures in merge_finalize()
merge-ort: add implementation of record_conflicted_index_entries()
tree: enable cmp_cache_name_compare() to be used elsewhere
merge-ort: add implementation of checkout()
merge-ort: basic outline for merge_switch_to_result()
merge-ort: step 3 of tree writing -- handling subdirectories as we go
merge-ort: step 2 of tree writing -- function to create tree object
merge-ort: step 1 of tree writing -- record basenames, modes, and oids
merge-ort: have process_entries operate in a defined order
merge-ort: add a preliminary simple process_entries() implementation
merge-ort: avoid recursing into identical trees
merge-ort: record stage and auxiliary info for every path
merge-ort: compute a few more useful fields for collect_merge_info
merge-ort: avoid repeating fill_tree_descriptor() on the same tree
merge-ort: implement a very basic collect_merge_info()
merge-ort: add an err() function similar to one from merge-recursive
merge-ort: use histogram diff
merge-ort: port merge_start() from merge-recursive
merge-ort: add some high-level algorithm structure
merge-ort: setup basic internal data structures
Junio C Hamano [Thu, 7 Jan 2021 07:33:43 +0000 (23:33 -0800)]
Merge branch 'ab/trailers-extra-format'
The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.
* ab/trailers-extra-format:
pretty format %(trailers): add a "key_value_separator"
pretty format %(trailers): add a "keyonly"
pretty-format %(trailers): fix broken standalone "valueonly"
pretty format %(trailers) doc: avoid repetition
pretty format %(trailers) test: split a long line
Commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a duplicate entry check on rename_src in order to
avoid segfaults; the code at the time was prone to double free()s and an
easy way to avoid it was just to turn off rename detection for any
duplicate entries. Note that the form of the check was modified two
commits ago in this series.
Similarly, commit 4d6be03b95 ("diffcore-rename: avoid processing
duplicate destinations", 2015-02-26) added a duplicate entry check
on rename_dst for the exact same reason -- the code was prone to double
free()s, and an easy way to avoid it was just to turn off rename
detection entirely. Note that the form of the check was modified in the
commit just before this one.
In the original code in both places, the code was dealing with
individual diff_filespecs and trying to match things up, instead of just
keeping the original diff_filepairs around as we do now. The
intervening change in structure has fixed the accounting problems and
the associated double free()s that used to occur, and thus we already
have a better fix. As such, we can remove the band-aid checks for
duplicate entries.
Due to the last two patches, the diffcore_rename() setup is no longer a
sizeable chunk of overall runtime. Thus, in a large rebase of many
commits with lots of renames and several optimizations to inexact rename
detection, this patch only speeds up the overall code by about half a
percent or so and is pretty close to the run-to-run variability making
it hard to get an exact measurement. However, with some trace2 regions
around the setup code in diffcore_rename() so that I can focus on just
it, I measure that this patch consistently saves almost a third of the
remaining time spent in diffcore_rename() setup.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:48 +0000 (02:34 +0000)]
merge-ort: add handling for different types of files at same path
Add some handling that explicitly considers collisions of the following
types:
* file/submodule
* file/symlink
* submodule/symlink
Leaving them as conflicts at the same path are hard for users to
resolve, so move one or both of them aside so that they each get their
own path.
Note that in the case of recursive handling (i.e. call_depth > 0), we
can just use the merge base of the two merge bases as the merge result
much like we do with modify/delete conflicts, binary files, conflicting
submodule values, and so on.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:47 +0000 (02:34 +0000)]
merge-ort: copy find_first_merges() implementation from merge-recursive.c
Code is identical for the function body in the two files, the call
signature is just slightly different in merge-ort than merge-recursive
as noted a couple commits ago.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:46 +0000 (02:34 +0000)]
merge-ort: implement format_commit()
This implementation is based on a mixture of print_commit() and
output_commit_title() from merge-recursive.c so that it can be used to
take over both functions.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:45 +0000 (02:34 +0000)]
merge-ort: copy and adapt merge_submodule() from merge-recursive.c
Take merge_submodule() from merge-recursive.c and make slight
adjustments, predominantly around deferring output using path_msg()
instead of using merge-recursive's output() and show() functions.
There's also a fix for recursive cases (when call_depth > 0) and a
slight change to argument order for find_first_merges().
find_first_merges() and format_commit() are left unimplemented for
now, but will be added by subsequent commits.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:44 +0000 (02:34 +0000)]
merge-ort: copy and adapt merge_3way() from merge-recursive.c
Take merge_3way() from merge-recursive.c and make slight adjustments
based on different data structures (direct usage of object_id
rather diff_filespec, separate pathnames which based on our careful
interning of pathnames in opt->priv->paths can be compared with '!='
rather than 'strcmp').
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:43 +0000 (02:34 +0000)]
merge-ort: flesh out implementation of handle_content_merge()
This implementation is based heavily on merge_mode_and_contents() from
merge-recursive.c, though it has some fixes for recursive merges (i.e.
when call_depth > 0), and has a number of changes throughout based on
slight differences in data structures and in how the functions are
called.
It is, however, based on two new helper functions -- merge_3way() and
merge_submodule -- for which we only provide die-not-implemented stubs
at this point. Future commits will add implementations of these
functions.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:42 +0000 (02:34 +0000)]
merge-ort: handle book-keeping around two- and three-way content merge
In addition to the content merge (which will go in a subsequent commit),
we need to worry about conflict messages, placing results in higher
order stages in case of a df_conflict, and making sure the results are
placed in ci->merged.result so that they will show up in the working
tree. Take care of all that external book-keeping, moving the
simplistic just-take-HEAD code into the barebones handle_content_merge()
function for now. Subsequent commits will flesh out
handle_content_merge().
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:41 +0000 (02:34 +0000)]
merge-ort: implement unique_path() helper
Implement unique_path(), based on the one from merge-recursive.c. It is
simplified, however, due to: (1) using strmaps, and (2) the fact that
merge-ort lets the checkout codepath handle possible collisions with the
working tree means that other code locations don't have to.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:40 +0000 (02:34 +0000)]
merge-ort: handle directory/file conflicts that remain
When a directory/file conflict remains, we can leave the directory where
it is, but need to move the information about the file to a different
pathname. After moving the file to a different pathname, we allow
subsequent process_entry() logic to handle any additional details that
might be relevant.
This depends on a new helper function, unique_path(), that dies with an
unimplemented error currently but will be implemented in a subsequent
commit.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 1 Jan 2021 02:34:39 +0000 (02:34 +0000)]
merge-ort: handle D/F conflict where directory disappears due to merge
When one side has a directory at a given path and the other side of
history has a file at the path, but the merge resolves the directory
away (e.g. because no path within that directory was modified and the
other side deleted it, or because renaming moved all the files
elsewhere), then we don't actually have a conflict anymore. We just
need to clear away any information related to the relevant directory,
and then the subsequent process_entry() handling can handle the given
path.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 23 Dec 2020 21:59:46 +0000 (13:59 -0800)]
Merge branch 'ma/maintenance-crontab-fix'
Hotfix for a topic of this cycle.
* ma/maintenance-crontab-fix:
t7900-maintenance: test for magic markers
gc: fix handling of crontab magic markers
git-maintenance.txt: add missing word
* js/no-more-prepare-for-main-in-test:
tests: drop the `PREPARE_FOR_MAIN_BRANCH` prereq
t9902: use `main` as initial branch name
t6302: use `main` as initial branch name
t5703: use `main` as initial branch name
t5510: use `main` as initial branch name
t5505: finalize transitioning to using the branch name `main`
t3205: finalize transitioning to using the branch name `main`
t3203: complete the transition to using the branch name `main`
t3201: finalize transitioning to using the branch name `main`
t3200: finish transitioning to the initial branch name `main`
t1400: use `main` as initial branch name
Eric Sunshine [Sun, 20 Dec 2020 21:27:40 +0000 (16:27 -0500)]
t/perf: avoid unnecessary test_export() recursion
test_export() has been self-recursive since its inception even though a
simple for-loop would have served just as well to append its arguments
to the `test_export_` variable separated by the pipe character "|".
Recently `test_export_` was changed instead to a space-separated list of
tokens to be exported, an operation which can be accomplished via a
single simple assignment, with no need for looping or recursion.
Therefore, simplify the implementation.
While at it, take advantage of the fact that variable names to be
exported are shell identifiers, thus won't be composed of special
characters or whitespace, thus simple a `$*` can be used rather than
magical `"$@"`.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Tue, 22 Dec 2020 15:44:42 +0000 (15:44 +0000)]
git.txt: fix typos in 'linkgit' macro invocation
The 'linkgit' Asciidoc macro is misspelled as 'linkit' in the
description of 'GIT_SEQUENCE_EDITOR' since the addition of that variable
to git(1) in 902a126eca (doc: mention GIT_SEQUENCE_EDITOR and
'sequence.editor' more, 2020-08-31). Also, it uses two colons instead of
one.
Fix that.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Nipunn Koorapati [Tue, 22 Dec 2020 03:58:16 +0000 (03:58 +0000)]
negative-refspec: fix segfault on : refspec
The logic added to check for negative pathspec match by c0192df630
(refspec: add support for negative refspecs, 2020-09-30) looks at
refspec->src assuming it is never NULL, however when
remote.origin.push is set to ":", then refspec->src is NULL,
causing a segfault within strcmp.
Tell git to handle matching refspec by adding the needle to the
set of positively matched refspecs, since matching ":" refspecs
match anything as src.
Add test for matching refspec pushes fetch-negative-refspec
both individually and in combination with a negative refspec.
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren [Mon, 21 Dec 2020 21:26:33 +0000 (22:26 +0100)]
t7900-maintenance: test for magic markers
When we insert our "BEGIN" and "END" markers into the cron table, it's
so that a Git version from many years into the future would be able to
identify this region in the cron table. Let's add a test to make sure
that these markers don't ever change.
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren [Mon, 21 Dec 2020 21:26:32 +0000 (22:26 +0100)]
gc: fix handling of crontab magic markers
On `git maintenance start`, we add a few entries to the user's cron
table. We wrap our entries using two magic markers, "# BEGIN GIT
MAINTENANCE SCHEDULE" and "# END GIT MAINTENANCE SCHEDULE". At a later
`git maintenance stop`, we will go through the table and remove these
lines. Or rather, we will remove the "BEGIN" marker, the "END" marker
and everything between them.
Alas, we have a bug in how we detect the "END" marker: we don't. As we
loop through all the lines of the crontab, if we are in the "old
region", i.e., the region we're aiming to remove, we make an early
`continue` and don't get as far as checking for the "END" marker. Thus,
once we've seen our "BEGIN", we remove everything until the end of the
file.
Rewrite the logic for identifying these markers. There are four cases
that are mutually exclusive: The current line starts a region or it ends
it, or it's firmly within the region, or it's outside of it (and should
be printed).
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren [Mon, 21 Dec 2020 21:26:31 +0000 (22:26 +0100)]
git-maintenance.txt: add missing word
Add a missing "a" before "bunch".
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout -p: handle tree arguments correctly again
This fixes a segmentation fault.
The bug is caused by dereferencing `new_branch_info->commit` when it is
`NULL`, which is the case when the tree-ish argument is actually a tree,
not a commit-ish. This was introduced in 5602b500c3c (builtin/checkout:
fix `git checkout -p HEAD...` bug, 2020-10-07), where we tried to ensure
that the special tree-ish `HEAD...` is handled correctly.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Mon, 21 Dec 2020 08:16:01 +0000 (03:16 -0500)]
worktree: teach `repair` to fix multi-directional breakage
`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.
However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.
Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed. The inference is intentionally simple-minded.
For each worktree path specified as an argument, `git worktree repair`
manually reads the ".git" gitfile at that location and, if it is
well-formed, extracts the <id>. It then searches for a corresponding
<id> in <repo>/worktrees/ and, if found, concludes that there is a
reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
the specified worktree path. In order for <repo> to be known, `git
worktree repair` must be run in the main worktree or bare repository.
`git worktree repair` first attempts to repair each incoming
/path/to/worktree/.git gitfile to point at the repository, and then
attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
at the worktrees. This sequence was chosen arbitrarily when originally
implemented since the order of fixes is immaterial as long as one side
of the two-way link between the repository and a worktree is sound.
However, for this new repair technique to work, the order must be
reversed. This is because the new inference mechanism, when it is
successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
repaired, thus fixing one side of the two-way link. Once that side is
fixed, the other side can be fixed by the existing repair mechanism,
hence the order of repairs is now significant.
Two safeguards are employed to avoid hijacking a worktree from a
different repository if the user accidentally specifies a foreign
worktree as an argument. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Felipe Contreras [Fri, 18 Dec 2020 15:14:06 +0000 (09:14 -0600)]
test: bisect-porcelain: fix location of files
Commit ba7eafe146 (t6030: explicitly test for bisection cleanup,
2017-09-29) introduced checks for files in the $GIT_DIR directory, but
that variable is not always defined, and in this test file it's not.
Therefore these checks always passed regardless of the presence of these
files (unless the user has some /BISECT_LOG file, for some reason).
Let's check the files in the correct location.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jiang Xin [Sun, 20 Dec 2020 23:10:19 +0000 (07:10 +0800)]
Merge remote-tracking branch 'github/master' into git-po-master
* github/master: (42 commits)
Git 2.30-rc1
git-gui: use gray background for inactive text widgets
Another batch before 2.30-rc1
git-gui: Fix selected text colors
Makefile: conditionally include GIT-VERSION-FILE
git-gui: fix colored label backgrounds when using themed widgets
config.mak.uname: remove old NonStop compatibility settings
diff: correct interaction between --exit-code and -I<pattern>
t/perf: fix test_export() failure with BSD `sed`
style: do not "break" in switch() after "return"
compat-util: pretend that stub setitimer() always succeeds
strmap: make callers of strmap_remove() to call it in void context
doc: mention Python 3.x supports
index-format.txt: document v2 format of file system monitor extension
docs: multi-pack-index: remove note about future 'verify' work
init: provide useful advice about init.defaultBranch
get_default_branch_name(): prepare for showing some advice
branch -m: allow renaming a yet-unborn branch
init: document `init.defaultBranch` better
t7900: use --fixed-value in git-maintenance tests
...
Junio C Hamano [Fri, 18 Dec 2020 23:15:17 +0000 (15:15 -0800)]
Merge branch 'js/init-defaultbranch-advice'
Our users are going to be trained to prepare for future change of
init.defaultBranch configuration variable.
* js/init-defaultbranch-advice:
init: provide useful advice about init.defaultBranch
get_default_branch_name(): prepare for showing some advice
branch -m: allow renaming a yet-unborn branch
init: document `init.defaultBranch` better
Junio C Hamano [Fri, 18 Dec 2020 23:07:10 +0000 (15:07 -0800)]
Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
git-gui: use gray background for inactive text widgets
git-gui: Fix selected text colors
Makefile: conditionally include GIT-VERSION-FILE
git-gui: fix colored label backgrounds when using themed widgets
git-gui: ssh-askpass: add a checkbox to show the input text
git-gui: update Russian translation
git-gui: use commit message template
git-gui: Only touch GITGUI_MSG when needed
Pratyush Yadav [Fri, 18 Dec 2020 19:32:34 +0000 (01:02 +0530)]
Merge branch 'sh/inactive-background'
Set a different background color for selections in inactive widgets.
This inactive color is calculated from the current theme colors to make
sure it works for all themes.
* sh/inactive-background:
git-gui: use gray background for inactive text widgets
Junio C Hamano [Thu, 17 Dec 2020 23:06:40 +0000 (15:06 -0800)]
Merge branch 'rj/make-clean'
Build optimization.
* rj/make-clean:
Makefile: don't use a versioned temp distribution directory
Makefile: don't try to clean old debian build product
gitweb/Makefile: conditionally include ../GIT-VERSION-FILE
Documentation/Makefile: conditionally include ../GIT-VERSION-FILE
Documentation/Makefile: conditionally include doc.dep