Jeff King [Thu, 30 May 2024 06:44:22 +0000 (02:44 -0400)]
mv: move src_dir cleanup to end of cmd_mv()
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.
But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.
Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.
Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 30 May 2024 06:39:32 +0000 (02:39 -0400)]
t-strvec: use va_end() to match va_start()
Our check_strvec_loc() helper uses a variable argument list. When we
va_start(), we must be sure to va_end() before leaving the function.
This is required by the standard (though the effect of forgetting will
vary between platforms).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 30 May 2024 15:54:58 +0000 (08:54 -0700)]
Merge branch 'ps/leakfixes' into jk/leakfixes
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
Chandra Pratap [Wed, 29 May 2024 16:59:31 +0000 (22:29 +0530)]
t: improve the test-case for parse_names()
In the existing test-case for parse_names(), the fact that empty
lines should be ignored is not obvious because the empty line is
immediately followed by end-of-string. This can be mistaken as the
empty line getting replaced by NULL. Improve this by adding a
non-empty line after the empty one to demonstrate the intended behavior.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Chandra Pratap [Wed, 29 May 2024 16:59:30 +0000 (22:29 +0530)]
t: add test for put_be16()
put_be16() is a function defined in reftable/basics.{c, h} for which
there are no tests in the current setup. Add a test for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Chandra Pratap [Wed, 29 May 2024 16:59:29 +0000 (22:29 +0530)]
t: move tests from reftable/record_test.c to the new unit test
common_prefix_size(), get_be24() and put_be24() are functions defined
in reftable/basics.{c, h}. Move the tests for these functions from
reftable/record_test.c to the newly ported test.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Chandra Pratap [Wed, 29 May 2024 16:59:28 +0000 (22:29 +0530)]
t: move tests from reftable/stack_test.c to the new unit test
parse_names() and names_equal() are functions defined in
reftable/basics.{c, h}. Move the tests for these functions from
reftable/stack_test.c to the newly ported test.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Chandra Pratap [Wed, 29 May 2024 16:59:27 +0000 (22:29 +0530)]
t: move reftable/basics_test.c to the unit testing framework
reftable/basics_test.c exercise the functions defined in
reftable/basics.{c, h}. Migrate reftable/basics_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 29 May 2024 16:02:16 +0000 (09:02 -0700)]
safe.directory: allow "lead/ing/path/*" match
When safe.directory was introduced in v2.30.3 timeframe, 8959555c
(setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02), it only allowed specific opt-out
directories. Immediately after an embargoed release that included
the change, 0f85c4a3 (setup: opt-out of check with safe.directory=*,
2022-04-13) was done as a response to loosen the check so that a
single '*' can be used to say "I trust all repositories" for folks
who host too many repositories to list individually.
Let's further loosen the check to allow people to say "everything
under this hierarchy is deemed safe" by specifying such a leading
directory with "/*" appended to it.
Junio C Hamano [Wed, 29 May 2024 16:32:24 +0000 (09:32 -0700)]
Merge branch 'ps/leakfixes' into ps/no-writable-strings
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
t/helper/test-{sha1, sha256} and t/t0015-hash.sh test the hash
implementation of SHA-1 and SHA-256 in Git with basic hash values.
Migrate them to the new unit testing framework for better debugging
and runtime performance.
The 'sha1' and 'sha256' subcommands are still not removed due to
pack_trailer():lib-pack.sh's reliance on them. The 'sha1' subcommand
is also relied upon by t0013-sha1dc (which requires 'test-tool
sha1' dying when it is used on a file created to contain the
known sha1 attack).
Helped-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Co-authored-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf: introduce strbuf_addstrings() to repeatedly add a string
In a following commit we are going to port code from
"t/helper/test-sha256.c", t/helper/test-hash.c and "t/t0015-hash.sh" to
a new "t/unit-tests/t-hash.c" file using the recently added unit test
framework.
To port code like: perl -e "$| = 1; print q{aaaaaaaaaa} for 1..100000;"
we are going to need a new strbuf_addstrings() function that repeatedly
adds the same string a number of times to a buffer.
Such a strbuf_addstrings() function would already be useful in
"json-writer.c" and "builtin/submodule-helper.c" as both of these files
already have code that repeatedly adds the same string. So let's
introduce such a strbuf_addstrings() function in "strbuf.{c,h}" and use
it in both "json-writer.c" and "builtin/submodule-helper.c".
We use the "strbuf_addstrings" name as this way strbuf_addstr() and
strbuf_addstrings() would be similar for strings as strbuf_addch() and
strbuf_addchars() for characters.
Helped-by: Junio C Hamano <gitster@pobox.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Co-authored-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/: migrate helper/test-example-decorate to the unit testing framework
helper/test-example-decorate.c along with t9004-example.sh provide
an example of how to use the functions in decorate.h (which provides
a data structure that associates Git objects to void pointers) and
also test their output.
Migrate them to the new unit testing framework for better debugging
and runtime performance.
Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The credential helper that talks with osx keychain learned to avoid
storing back the authentication material it just got received from
the keychain.
* kn/osxkeychain-skip-idempotent-store:
osxkeychain: state to skip unnecessary store operations
osxkeychain: exclusive lock to serialize execution of operations
Junio C Hamano [Tue, 28 May 2024 18:17:07 +0000 (11:17 -0700)]
Merge branch 'ps/builtin-config-cleanup'
Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.
* ps/builtin-config-cleanup: (21 commits)
builtin/config: pass data between callbacks via local variables
builtin/config: convert flags to a local variable
builtin/config: track "fixed value" option via flags only
builtin/config: convert `key` to a local variable
builtin/config: convert `key_regexp` to a local variable
builtin/config: convert `regexp` to a local variable
builtin/config: convert `value_pattern` to a local variable
builtin/config: convert `do_not_match` to a local variable
builtin/config: move `respect_includes_opt` into location options
builtin/config: move default value into display options
builtin/config: move type options into display options
builtin/config: move display options into local variables
builtin/config: move location options into local variables
builtin/config: refactor functions to have common exit paths
config: make the config source const
builtin/config: check for writeability after source is set up
builtin/config: move actions into `cmd_config_actions()`
builtin/config: move legacy options into `cmd_config()`
builtin/config: move subcommand options into `cmd_config()`
builtin/config: move legacy mode into its own function
...
Junio C Hamano [Tue, 28 May 2024 18:17:06 +0000 (11:17 -0700)]
Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.
* ps/pseudo-ref-terminology:
refs: refuse to write pseudorefs
ref-filter: properly distinuish pseudo and root refs
refs: pseudorefs are no refs
refs: classify HEAD as a root ref
refs: do not check ref existence in `is_root_ref()`
refs: rename `is_special_ref()` to `is_pseudo_ref()`
refs: rename `is_pseudoref()` to `is_root_ref()`
Documentation/glossary: define root refs as refs
Documentation/glossary: clarify limitations of pseudorefs
Documentation/glossary: redefine pseudorefs as special refs
Similar to the preceding commit, we have effectively given tracking
memory ownership of submodule gitfile paths. Refactor the code to start
tracking allocated strings in a separate `struct strvec` such that we
can easily plug those leaks. Mark now-passing tests as leak free.
Note that ideally, we wouldn't require two separate data structures to
track those paths. But we do need to store `NULL` pointers for the
gitfile paths such that we can indicate that its corresponding entries
in the other arrays do not have such a path at all. And given that
`struct strvec`s cannot store `NULL` pointers we cannot use them to
store this information.
There is another small gotcha that is easy to miss: you may be wondering
why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This
is because this is a mere sentinel value and not actually a string at
all.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Memory allocation patterns in git-mv(1) are extremely hard to follow:
We copy around string pointers into manually-managed arrays, some of
which alias each other, but only sometimes, while we also drop some of
those strings at other times without ever daring to free them.
While this may be my own subjective feeling, it seems like others have
given up as the code has multiple calls to `UNLEAK()`. These are not
sufficient though, and git-mv(1) is still leaking all over the place
even with them.
Refactor the code to instead track strings in `struct strvec`. While
this has the effect of effectively duplicating some of the strings
without an actual need, it is way easier to reason about and fixes all
of the aliasing of memory that has been going on. It allows us to get
rid of the `UNLEAK()` calls and also fixes leaks that those calls did
not paper over.
Mark tests which are now leak-free accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
makes the next patch easier, where we will migrate to the paths being
owned by a strvec. given that we are talking about command line
parameters here it's also not like we have tons of allocations that this
would save
while at it, fix a memory leak
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/mv: refactor `add_slash()` to always return allocated strings
The `add_slash()` function will only conditionally return an allocated
string when the passed-in string did not yet have a trailing slash. This
makes the memory ownership harder to track than really necessary.
It's dubious whether this optimization really buys us all that much. The
number of times we execute this function is bounded by the number of
arguments to git-mv(1), so in the typical case we may end up saving an
allocation or two.
Simplify the code to unconditionally return allocated strings.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: fix leaking memory for submodule entries
In `free_one_config()` we never end up freeing the `url` and `ignore`
fields and thus leak memory. Fix those leaks and mark now-passing tests
as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a priority queue in `ahead_behind()` to compute the ahead/behind
count for commits. We may not iterate through all commits part of that
queue though in case all of its entries are stale. Consequently, as we
never make the effort to release the remaining commits, we end up
leaking bit arrays that we have allocated for each of the contained
commits.
Plug this leak and mark the corresponding test as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that memory ownership rules around `git_config_string()` and
`git_config_pathname()` are clearer, it also got easier to spot that
the returned memory needs to be free'd. Plug a subset of those cases and
mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: clarify memory ownership in `git_config_string()`
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using global variables to store the log configuration. Many of
these can be set both via the command line and via the config, and
depending on how they are being set, they may contain allocated strings.
This leads to hard-to-track memory ownership and memory leaks.
Refactor the code to instead use a `struct log_config` that is being
allocated on the stack. This allows us to more clearly scope the
variables, track memory ownership and ultimately release the memory.
This also prepares us for a change to `git_config_string()`, which will
be adapted to have a `char **` out parameter instead of `const char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
convert: refactor code to clarify ownership of check_roundtrip_encoding
The `check_roundtrip_encoding` variable is tracked in a `const char *`
even though it may contain allocated strings at times. The result is
that those strings may be leaking because we never free them.
Refactor the code to always store allocated strings in this variable.
The default value is handled in `check_roundtrip()` now, which is the
only user of the variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff: refactor code to clarify memory ownership of prefixes
The source and destination prefixes are tracked in a `const char *`
array, but may at times contain allocated strings. The result is that
those strings may be leaking because we never free them.
Refactor the code to always store allocated strings in those variables,
freeing them as required. This requires us to handle the default values
a bit different compared to before. But given that there is only a
single callsite where we use the variables to `struct diff_options` it's
easy to handle the defaults there.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: clarify memory ownership in `git_config_pathname()`
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various variables assigned via `git_config_string()` and
`git_config_pathname()` which are never free'd. This bug is relatable
because the out parameter of those functions are a `const char **`, even
though memory ownership is transferred to the caller.
We're about to adapt the functions to instead use `char **`. Prepare the
code accordingly. Note that the `(const char **)` casts will go away
once we have adapted the functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout: clarify memory ownership in `unique_tracking_name()`
The function `unique_tracking_name()` returns an allocated string, but
does not clearly indicate this because its return type is `const char *`
instead of `char *`. This has led to various callsites where we never
free its returned memory at all, which causes memory leaks.
Plug those leaks and mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf: fix leak when `appendwholeline()` fails with EOF
In `strbuf_appendwholeline()` we call `strbuf_getwholeline()` with a
temporary buffer. In case the call returns an error we indicate this by
returning EOF, but never release the temporary buffer. This can cause a
leak though because `strbuf_getwholeline()` calls getline(3). Quoting
its documentation:
If *lineptr was set to NULL before the call, then the buffer
should be freed by the user program even on failure.
Consequently, the temporary buffer may hold allocated memory even when
the call to `strbuf_getwholeline()` fails.
Fix this by releasing the temporary buffer on error.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- t7423: Introduced via b20c10fd9b (t7423: add tests for symlinked
submodule directories, 2024-01-28), passes since e8d0608944
(submodule: require the submodule path to contain directories only,
2024-03-26). The fix is not obviously related, but probably works
because we now die early in many code paths.
- t9xxx: All of these are exercising CVS-related tooling and pass
since at least Git v2.40. It's likely that these pass for a long
time already, but nobody ever noticed because Git developers do not
tend to have CVS on their machines.
Mark all of these tests as passing.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing the transport helper in `transport_get()`, we
allocate the name of the helper. We neither end up transferring
ownership of the name, nor do we free it. The associated memory thus
leaks.
Fix this memory leak by freeing the string at the calling side in
`transport_get()`. `transport_helper_init()` now creates its own copy of
the string and thus can free it as required.
An alterantive way to fix this would be to transfer ownership of the
string passed into `transport_helper_init()`, which would avoid the call
to xstrdup(1). But it does make for a more surprising calling convention
as we do not typically transfer ownership of strings like this.
Mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "t/lib-terminal.sh", we declare a lazy prerequisite for tests that
require a TTY. The prerequisite uses a Perl script to figure out whether
we do have a usable TTY or not and thus implicitly depends on the PERL
prerequisite, as well. Furthermore though, the script requires another
dependency that is easy to miss, namely on the IO::Pty module. If that
module is not installed, then the script will exit early due to an
reason unrelated to missing TTYs.
This easily leads to missing test coverage. But most importantly, our CI
systems are missing this dependency and thus don't execute those tests
at all. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ian Wienand [Mon, 27 May 2024 00:30:47 +0000 (10:30 +1000)]
Documentation: alias: rework notes into points
There are a number of caveats when using aliases. Rather than
stuffing them all together in a paragraph, let's separate them out
into individual points to make it clearer what's going on.
Signed-off-by: Ian Wienand <iwienand@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
difftool: add env vars directly in run_file_diff()
Add the environment variables of the child process directly using
strvec_push() instead of building an array out of them and then adding
that using strvec_pushv(). The new code is shorter and avoids magic
array index values and fragile array padding.
Add a configuration option to allow output from the promisor
fetching objects to be suppressed.
This allows us to stop commands like 'git blame' being swamped
with progress messages and gc notifications from the promisor
when used in a partial clone.
Signed-off-by: Tom Hughes <tom@compton.nu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 24 May 2024 23:59:12 +0000 (16:59 -0700)]
Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint
* fixes/2.45.1/2.44:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 24 May 2024 23:58:35 +0000 (16:58 -0700)]
Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44
* fixes/2.45.1/2.43:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 24 May 2024 23:58:11 +0000 (16:58 -0700)]
Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43
* fixes/2.45.1/2.42:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 24 May 2024 23:57:43 +0000 (16:57 -0700)]
Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42
* fixes/2.45.1/2.41:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 24 May 2024 23:57:01 +0000 (16:57 -0700)]
Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41
* fixes/2.45.1/2.40:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Junio C Hamano [Fri, 24 May 2024 23:02:55 +0000 (16:02 -0700)]
format-patch: move range/inter diff at the end of a single patch output
When running "format-patch" on a multiple patch series, the output
coming from "--interdiff" and "--range-diff" options is inserted
after the "shortlog" list of commits and the overall diffstat.
The idea is that shortlog/diffstat are shorter and with denser
information content, which gives a better overview before the
readers dive into more details of range/inter diff.
When working on a single patch, however, we stuff the inter/range
diff output before the actual patch, next to the diffstat. This
pushes down the patch text way down with inter/range diff output,
distracting readers.
Move the inter/range diff output to the very end of the output,
after all the patch text is shown.
As the inter/range diff is no longer part of the commentary block
(i.e., what comes after the log message and "---", but before the
patch text), stop producing "---" in the function that generates
them. But to separate it out visually (note: this is not needed
to help tools like "git apply" that pay attention to the hunk
headers to figure out the length of the hunks), add an extra blank
line between the end of the patch text and the inter/range diff.
Junio C Hamano [Fri, 24 May 2024 19:29:35 +0000 (12:29 -0700)]
Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.
* jc/fix-2.45.1-and-friends-for-2.39:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Taylor Blau [Thu, 23 May 2024 21:27:24 +0000 (17:27 -0400)]
t/perf: implement performance tests for pseudo-merge bitmaps
Implement a straightforward performance test demonstrating the benefit
of pseudo-merge bitmaps by measuring how long it takes to count
reachable objects in a few different scenarios:
- without bitmaps, to demonstrate a reasonable baseline
- with bitmaps, but without pseudo-merges
- with bitmaps and pseudo-merges
Results from running this test on git.git are as follows:
Test this tree
-----------------------------------------------------------------------------------
5333.2: git rev-list --count --all --objects (no bitmaps) 3.54(3.45+0.08)
5333.3: git rev-list --count --all --objects (no pseudo-merges) 0.43(0.40+0.03)
5333.4: git rev-list --count --all --objects (with pseudo-merges) 0.12(0.11+0.01)
On a private repository which is much larger, and has many spikey parts
of history that aren't merged into the 'master' branch, the results are
as follows:
Test this tree
---------------------------------------------------------------------------------------
5333.1: git rev-list --count --all --objects (no bitmaps) 122.29(121.31+0.97)
5333.2: git rev-list --count --all --objects (no pseudo-merges) 21.88(21.30+0.58)
5333.3: git rev-list --count --all --objects (with pseudo-merges) 5.05(4.77+0.28)
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:21 +0000 (17:27 -0400)]
pseudo-merge: implement support for finding existing merges
This patch implements support for reusing existing pseudo-merge commits
when writing bitmaps when there is an existing pseudo-merge bitmap which
has exactly the same set of parents as one that we are about to write.
Note that unstable pseudo-merges are likely to change between
consecutive repacks, and so are generally poor candidates for reuse.
However, stable pseudo-merges (see the configuration option
'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely
to change between runs (as they represent long-running branches).
Because there is no index from a *set* of pseudo-merge parents to a
matching pseudo-merge bitmap, we have to construct the bitmap
corresponding to the set of parents for each pending pseudo-merge commit
and see if a matching bitmap exists.
This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a couple of reasons:
- non-matching pseudo-merge bitmaps are rejected quickly as soon as
they differ in a single bit
- already-matched pseudo-merge bitmaps are discarded from subsequent
rounds of search
- the number of pseudo-merges is generally small, even for large
repositories
In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) call that function before computing the set of reachable
objects for some pending pseudo-merge.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:18 +0000 (17:27 -0400)]
ewah: `bitmap_equals_ewah()`
Prepare to reuse existing pseudo-merge bitmaps by implementing a
`bitmap_equals_ewah()` helper.
This helper will be used to see if a raw bitmap (containing the set of
parents for some pseudo-merge) is equal to any existing pseudo-merge's
commits bitmap (which are stored as EWAH-compressed bitmaps on disk).
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:15 +0000 (17:27 -0400)]
pack-bitmap: extra trace2 information
Add some extra trace2 lines to capture the number of bitmap lookups that
are hits versus misses, as well as the number of reachability roots that
have bitmap coverage (versus those that do not).
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:11 +0000 (17:27 -0400)]
pack-bitmap.c: use pseudo-merges during traversal
Now that all of the groundwork has been laid to support reading and
using pseudo-merges, make use of that work in this commit by teaching
the pack-bitmap machinery to use pseudo-merge(s) when available during
traversal.
The basic operation is as follows:
- When enumerating objects on either side of a reachability query,
first see if any subset of the roots satisfies some pseudo-merge
bitmap. If it does, apply that pseudo-merge bitmap.
- If any pseudo-merge bitmap(s) were applied in the previous step, OR
them into the result[^1]. Then repeat the process over all
pseudo-merge bitmaps (we'll refer to this as "cascading"
pseudo-merges). Once this is done, OR in the resulting bitmap.
- If there is no fill-in traversal to be done, return the bitmap for
that side of the reachability query. If there is fill-in traversal,
then for each commit we encounter via show_commit(), check to see if
any unsatisfied pseudo-merges containing that commit as one of its
parents has been made satisfied by the presence of that commit.
If so, OR in the object set from that pseudo-merge bitmap, and then
cascade. If not, continue traversal.
A similar implementation is present in the boundary-based bitmap
traversal routines.
[^1]: Importantly, we cannot OR in the entire set of roots along with
the objects reachable from whatever pseudo-merge bitmaps were
satisfied. This may leave some dangling bits corresponding to any
unsatisfied root(s) getting OR'd into the resulting bitmap, tricking
other parts of the traversal into thinking we already have a
reachability closure over those commit(s) when we do not.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:08 +0000 (17:27 -0400)]
t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()`
One of the tests we'll want to add for pseudo-merge bitmaps needs to be
able to generate a large number of commits at a specific date.
Support the `--notick` option (with identical semantics to the
`--notick` option for `test_commit()`) within `test_commit_bulk` as a
prerequisite for that. Callers can then set the various _DATE variables
themselves.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These three helpers dump the list of pseudo merges, the "parents" of the
nth pseudo-merges, and the set of objects reachable from those parents,
respectively.
These helpers will be useful in subsequent patches when we add test
coverage for pseudo-merge bitmaps.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:27:02 +0000 (17:27 -0400)]
ewah: implement `ewah_bitmap_popcount()`
Some of the pseudo-merge test helpers (which will be introduced in the
following commit) will want to indicate the total number of commits in
or objects reachable from a pseudo-merge.
Implement a popcount() function that operates on EWAH bitmaps to quickly
determine how many bits are set in each of the respective bitmaps.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions are all documented in pseudo-merge.h, but their rough
descriptions are as follows:
- pseudo_merge_bitmap() reads and inflates the objects EWAH bitmap for
a given pseudo-merge
- use_pseudo_merge() does the same as pseudo_merge_bitmap(), but on
the commits EWAH bitmap, not the objects bitmap
- apply_pseudo_merges_for_commit() applies all satisfied pseudo-merge
commits for a given result set, and cascades any yet-unsatisfied
pseudo-merges if any were applied in the previous step
- cascade_pseudo_merges() applies all pseudo-merges which are
satisfied but have not been previously applied, repeating this
process until no more pseudo-merges can be applied
The core of the API is the latter two functions, which are responsible
for applying pseudo-merges during the object traversal implemented in
the pack-bitmap machinery.
The other two functions (pseudo_merge_bitmap(), and use_pseudo_merge())
are low-level ways to interact with the pseudo-merge machinery, which
will be useful in future commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:55 +0000 (17:26 -0400)]
pack-bitmap.c: read pseudo-merge extension
Now that the scaffolding for reading the pseudo-merge extension has been
laid, teach the pack-bitmap machinery to read the pseudo-merge extension
when present.
Note that pseudo-merges themselves are not yet used during traversal,
this step will be taken by a future commit.
In the meantime, read the table and initialize the pseudo_merge_map
structure introduced by a previous commit. When the pseudo-merge
extension is present, `load_bitmap_header()` performs basic sanity
checks to make sure that the table is well-formed.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:52 +0000 (17:26 -0400)]
pseudo-merge: scaffolding for reads
Implement scaffolding within the new pseudo-merge compilation unit
necessary to use the pseudo-merge API from within the pack-bitmap.c
machinery.
The core of this scaffolding is two-fold:
- The `pseudo_merge` structure itself, which represents an individual
pseudo-merge bitmap. It has fields for both bitmaps, as well as
metadata about its position within the memory-mapped region, and
a few extra bits indicating whether or not it is satisfied, and
which bitmaps(s, if any) have been read, since they are initialized
lazily.
- The `pseudo_merge_map` structure, which holds an array of
pseudo_merges, as well as a pointer to the memory-mapped region
containing the pseudo-merge serialization from within a .bitmap
file.
Note that the `bitmap_index` structure is defined statically within the
pack-bitmap.o compilation unit, so we can't take in a `struct
bitmap_index *`. Instead, wrap the primary components necessary to read
the pseudo-merges in this new structure to avoid exposing the
implementation details of the `bitmap_index` structure.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:49 +0000 (17:26 -0400)]
pack-bitmap: extract `read_bitmap()` function
The pack-bitmap machinery uses the `read_bitmap_1()` function to read a
bitmap from within the mmap'd region corresponding to the .bitmap file.
As as side-effect of calling this function, `read_bitmap_1()` increments
the `index->map_pos` variable to reflect the number of bytes read.
Extract the core of this routine to a separate function (that operates
over a `const unsigned char *`, a `size_t` and a `size_t *` pointer)
instead of a `struct bitmap_index *` pointer.
This function (called `read_bitmap()`) is part of the pack-bitmap.h API
so that it can be used within the upcoming portion of the implementation
in pseduo-merge.ch.
Rewrite the existing function, `read_bitmap_1()`, in terms of its more
generic counterpart.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:46 +0000 (17:26 -0400)]
pack-bitmap-write.c: write pseudo-merge table
Now that the pack-bitmap writer machinery understands how to select and
store pseudo-merge commits, teach it how to write the new optional
pseudo-merge .bitmap extension.
No readers yet exist for this new extension to the .bitmap format. The
following commits will take any preparatory step(s) necessary before
then implementing the routines necessary to read this new table.
In the meantime, the new `write_pseudo_merges()` function implements
writing this new format as described by a previous commit in
Documentation/technical/bitmap-format.txt.
Writing this table is fairly straightforward and consists of a few
sub-components:
- a pair of bitmaps for each pseudo-merge (one for the pseudo-merge
"parents", and another for the objects reachable from those parents)
- for each commit, the offset of either (a) the pseudo-merge it
belongs to, or (b) an extended lookup table if it belongs to >1
pseudo-merge groups
- if there are any commits belonging to >1 pseudo-merge group, the
extended lookup tables (which each consist of the number of
pseudo-merge groups a commit appears in, and then that many 4-byte
unsigned )
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:42 +0000 (17:26 -0400)]
pseudo-merge: implement support for selecting pseudo-merge commits
Teach the new pseudo-merge machinery how to select non-bitmapped commits
for inclusion in different pseudo-merge group(s) based on a handful of
criteria.
Note that the selected pseudo-merge commits aren't actually used or
written anywhere yet. This will be done in the following commit.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:39 +0000 (17:26 -0400)]
config: introduce `git_config_double()`
Future commits will want to parse a double-precision floating point
value from configuration, but we have no way to parse such a value prior
to this patch.
The core of the routine is implemented in git_parse_double(). Unlike
git_parse_unsigned() and git_parse_signed(), however, the function
implemented here only works on type "double", and not related types like
"float", or "long double".
This is because "float" and "long double" use different functions to
convert from ASCII strings to floating point values (strtof() and
strtold(), respectively). Likewise, there is no pointer type that can
assign to any of these values (except for "void *"), so the only way to
define this trio of functions would be with a macro expansion that is
parameterized over the floating point type and conversion function.
That is all doable, but likely to be overkill given our current needs,
which is only to parse double-precision floats.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:36 +0000 (17:26 -0400)]
pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public
The pseudo-merge selection code will be added in a subsequent commit,
and will need a way to push the allocated commit structures into the
bitmap writer from a separate compilation unit.
Make the `bitmap_writer_push_bitmapped_commit()` function part of the
pack-bitmap.h header in order to make this possible.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare to implement pseudo-merge bitmap selection by implementing a
necessary new function, `bitmap_writer_has_bitmapped_object_id()`.
This function returns whether or not the bitmap_writer selected the
given object ID for bitmapping. This will allow the pseudo-merge
machinery to reject candidates for pseudo-merges if they have already
been selected as an ordinary bitmap tip.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:29 +0000 (17:26 -0400)]
pack-bitmap-write: support storing pseudo-merge commits
Prepare to write pseudo-merge bitmaps by annotating individual bitmapped
commits (which are represented by the `bitmapped_commit` structure) with
an extra bit indicating whether or not they are a pseudo-merge.
In subsequent commits, pseudo-merge bitmaps will be generated by
allocating a fake commit node with parents covering the full set of
commits represented by the pseudo-merge bitmap. These commits will be
added to the set of "selected" commits as usual, but will be written
specially instead of being included with the rest of the selected
commits.
Mechanically speaking, there are two parts of this change:
- The bitmapped_commit struct gets a new bit indicating whether it is
a pseudo-merge, or an ordinary commit selected for bitmaps.
- A handful of changes to only write out the non-pseudo-merge commits
when enumerating through the selected array (see the new
`bitmap_writer_selected_nr()` function). Pseudo-merge commits appear
after all non-pseudo-merge commits, so it is safe to enumerate
through the selected array like so:
for (i = 0; i < bitmap_writer_selected_nr(); i++)
if (writer.selected[i].pseudo_merge)
BUG("unexpected pseudo-merge");
without encountering the BUG().
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:23 +0000 (17:26 -0400)]
pack-bitmap: move some initialization to `bitmap_writer_init()`
The pack-bitmap-writer machinery uses a oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.)
This map was initialized at the end of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).
Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus storing it in a pseudo-merge would be redundant).
The changes are as follows:
- Introduce a new `bitmap_writer_init()` function which initializes
the `writer.bitmaps` field (instead of waiting until the end of
`bitmap_writer_build()`).
- Add map entries in `push_bitmapped_commit()` (which is called via
`bitmap_writer_select_commits()`) with OID keys and NULL values to
track whether or not we *expect* to write a bitmap for some given
commit.
- Validate that a NULL entry is found matching the given key when we
store a selected bitmap.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:20 +0000 (17:26 -0400)]
ewah: implement `ewah_bitmap_is_subset()`
In order to know whether a given pseudo-merge (comprised of a "parents"
and "objects" bitmaps) is "satisfied" and can be OR'd into the bitmap
result, we need to be able to quickly determine whether the "parents"
bitmap is a subset of the current set of objects reachable on either
side of a traversal.
Implement a helper function to prepare for that, which determines
whether an EWAH bitmap (the parents bitmap from the pseudo-merge) is a
subset of a non-EWAH bitmap (in this case, the results bitmap from
either side of the traversal).
This function makes use of the EWAH iterator to avoid inflating any part
of the EWAH bitmap after we determine it is not a subset of the non-EWAH
bitmap. This "fail-fast" allows us to avoid a potentially large amount
of wasted effort.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:16 +0000 (17:26 -0400)]
Documentation/technical: describe pseudo-merge bitmaps format
Prepare to implement pseudo-merge bitmaps over the next several commits
by first describing the serialization format which will store the new
pseudo-merge bitmaps themselves.
This format is implemented as an optional extension within the bitmap v1
format, making it compatible with previous versions of Git, as well as
the original .bitmap implementation within JGit.
The format is described in detail in the patch contents below, but the
high-level description is as follows:
- An array of pseudo-merge bitmaps, each containing a pair of EWAH
bitmaps: one describing the set of pseudo-merge "parents", and
another describing the set of object(s) reachable from those
parents.
- A lookup table to determine which pseudo-merge(s) a given commit
appears in. An optional extended lookup table follows when there is
at least one commit which appears in multiple pseudo-merge groups.
- Trailing metadata, including the number of pseudo-merge(s), number
of unique parents, the offset within the .bitmap file for the
pseudo-merge commit lookup table, and the size of the optional
extension itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 May 2024 21:26:10 +0000 (17:26 -0400)]
Documentation/gitpacking.txt: initial commit
Introduce a new manual page, gitpacking(7) to collect useful information
about advanced packing concepts in Git.
In future commits in this series, this manual page will expand to
describe the new pseudo-merge bitmaps feature, as well as include
examples, relevant configuration bits, use-cases, and so on.
Outside of this series, this manual page may absorb similar pieces from
other parts of Git's documentation about packing.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 23 May 2024 22:50:06 +0000 (15:50 -0700)]
show_log: factor out interdiff/range-diff generation
The integration of "git range-diff" with "git format-patch" for a
single patch (i.e., not generating "range-diff" into the cover
letter) hooks into log-tree.c:show_log(), which is responsible for
writing the log message out and other stuff. Essentially,
everything you see before the diffstat and the patch is generated
there.
Split out the code that spits out the interdiff/range-diff into a
separate helper function show_diff_of_diff(). Hopefully this will
make it easier to move things around in the output stream in the
future patches.
Junio C Hamano [Thu, 23 May 2024 18:04:29 +0000 (11:04 -0700)]
Merge branch 'mt/openindiana-portability'
Portability updates to various uses of grep and sed.
* mt/openindiana-portability:
t/t9001-send-email.sh: sed - remove the i flag for s
t/t9118-git-svn-funky-branch-names.sh: sed needs semicolon
t/t1700-split-index.sh: mv -v is not portable
t/t4202-log.sh: fix misspelled variable
t/t0600-reffiles-backend.sh: rm -v is not portable
t/t9902-completion.sh: backslashes in echo
Switch grep from non-portable BRE to portable ERE
Junio C Hamano [Thu, 23 May 2024 18:04:26 +0000 (11:04 -0700)]
Merge branch 'la/hide-trailer-info'
The trailer API has been reshuffled a bit.
* la/hide-trailer-info:
trailer unit tests: inspect iterator contents
trailer: document parse_trailers() usage
trailer: retire trailer_info_get() from API
trailer: make trailer_info struct private
trailer: make parse_trailers() return trailer_info pointer
interpret-trailers: access trailer_info with new helpers
sequencer: use the trailer iterator
trailer: teach iterator about non-trailer lines
trailer: add unit tests for trailer iterator
Makefile: sort UNIT_TEST_PROGRAMS
Junio C Hamano [Thu, 23 May 2024 16:38:59 +0000 (09:38 -0700)]
Merge branch 'kn/ref-transaction-symref' into kn/update-ref-symref
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
Junio C Hamano [Thu, 23 May 2024 16:14:32 +0000 (09:14 -0700)]
Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration
* ps/pseudo-ref-terminology:
refs: refuse to write pseudorefs
ref-filter: properly distinuish pseudo and root refs
refs: pseudorefs are no refs
refs: classify HEAD as a root ref
refs: do not check ref existence in `is_root_ref()`
refs: rename `is_special_ref()` to `is_pseudo_ref()`
refs: rename `is_pseudoref()` to `is_root_ref()`
Documentation/glossary: define root refs as refs
Documentation/glossary: clarify limitations of pseudorefs
Documentation/glossary: redefine pseudorefs as special refs
Junio C Hamano [Thu, 23 May 2024 04:55:31 +0000 (21:55 -0700)]
Revert "fsck: warn about symlink pointing inside a gitdir"
This reverts commit a33fea08 (fsck: warn about symlink pointing
inside a gitdir, 2024-04-10), which warns against symbolic links
commonly created by git-annex.
The same error can also be triggered when re-initializing an already
existing repository.
The bug has been introduced in 173761e21b (setup: start tracking ref
storage format, 2023-12-29), which wired up the ref storage format. The
root cause is in `init_db()`, which tries to read the config before we
have initialized `the_repository` and most importantly its ref storage
format. We eventually end up calling `include_by_branch()` and execute
`refs_resolve_ref_unsafe()`, but because we have not initialized the ref
storage format yet this will trigger the above bug.
Interestingly, `include_by_branch()` has a mechanism that will only
cause us to resolve the ref when `the_repository->gitdir` is set. This
is also the reason why this only happens when we initialize an already
existing directory or repository: `gitdir` is set in those cases, but
not when creating a new directory.
Now there are two ways to address the issue:
- We can adapt `include_by_branch()` to also make the code conditional
on whether `the_repository->ref_storage_format` is set.
- We can shift around code such that we initialize the repository
format before we read the config.
While the first approach would be safe, it may also cause us to paper
over issues where a ref store should have been set up. In our case for
example, it may be reasonable to expect that re-initializing the repo
will cause the "onbranch:" condition to trigger, but we would not do
that if the ref storage format was not set up yet. This also used to
work before the above commit that introduced this bug.
Rearrange the code such that we set up the repository format before
reading the config. This fixes the bug and ensures that "onbranch:"
conditions can trigger.
Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Tested-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
[jc: fixed a test and backported to v2.44.0 codebase] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 22 May 2024 21:45:48 +0000 (14:45 -0700)]
add-patch: enforce only one-letter response to prompts
In a "git add -p" session, especially when we are not using the
single-key mode, we may see 'qa' as a response to a prompt
(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?
and then just do the 'q' thing (i.e. quit the session), ignoring
everything other than the first byte.
If 'q' and 'a' are next to each other on the user's keyboard, there
is a plausible chance that we see 'qa' when the user who wanted to
say 'a' fat-fingered and we ended up doing the 'q' thing instead.
As we didn't think of a good reason during the review discussion why
we want to accept excess letters only to ignore them, it appears to
be a safe change to simply reject input that is longer than just one
byte.
The two exceptions are the 'g' command that takes a hunk number, and
the '/' command that takes a regular expression. They have to be
accompanied by their operands (this makes me wonder how users who
set the interactive.singlekey configuration feed these operands---it
turns out that we notice there is no operand and give them another
chance to type the operand separately, without using single key
input this time), so we accept a string that is more than one byte
long.
Keep the "use only the first byte, downcased" behaviour when we ask
yes/no question, though. Neither on Qwerty or on Dvorak, 'y' and
'n' are not close to each other.
Tom Hughes [Wed, 22 May 2024 20:15:40 +0000 (21:15 +0100)]
push: don't fetch commit object when checking existence
If we're checking to see whether to tell the user to do a fetch
before pushing there's no need for us to actually fetch the object
from the remote if the clone is partial.
Because the promisor doesn't do negotiation actually trying to do
the fetch of the new head can be very expensive as it will try and
include history that we already have and it just results in rejecting
the push with a different message, and in behavior that is different
to a clone that is not partial.
Signed-off-by: Tom Hughes <tom@compton.nu> Signed-off-by: Junio C Hamano <gitster@pobox.com>