Junio C Hamano [Fri, 13 Dec 2024 15:33:41 +0000 (07:33 -0800)]
Merge branch 'es/oss-fuzz'
Backport oss-fuzz tests for us to our codebase.
* es/oss-fuzz:
fuzz: port fuzz-url-decode-mem from OSS-Fuzz
fuzz: port fuzz-parse-attr-line from OSS-Fuzz
fuzz: port fuzz-credential-from-url-gently from OSS-Fuzz
Junio C Hamano [Fri, 13 Dec 2024 15:33:40 +0000 (07:33 -0800)]
Merge branch 'en/fast-import-verify-path'
"git fast-import" learned to reject paths with ".." and "." as
their components to avoid creating invalid tree objects.
* en/fast-import-verify-path:
t9300: test verification of renamed paths
fast-import: disallow more path components
fast-import: disallow "." and ".." path components
Junio C Hamano [Tue, 10 Dec 2024 01:04:57 +0000 (10:04 +0900)]
Merge branch 'ps/reftable-iterator-reuse'
Optimize reading random references out of the reftable backend by
allowing reuse of iterator objects.
* ps/reftable-iterator-reuse:
refs/reftable: reuse iterators when reading refs
reftable/merged: drain priority queue on reseek
reftable/stack: add mechanism to notify callers on reload
refs/reftable: refactor reflog expiry to use reftable backend
refs/reftable: refactor reading symbolic refs to use reftable backend
refs/reftable: read references via `struct reftable_backend`
refs/reftable: figure out hash via `reftable_stack`
reftable/stack: add accessor for the hash ID
refs/reftable: handle reloading stacks in the reftable backend
refs/reftable: encapsulate reftable stack
Junio C Hamano [Tue, 10 Dec 2024 01:04:56 +0000 (10:04 +0900)]
Merge branch 'ps/reftable-detach'
Isolates the reftable subsystem from the rest of Git's codebase by
using fewer pieces of Git's infrastructure.
* ps/reftable-detach:
reftable/system: provide thin wrapper for lockfile subsystem
reftable/stack: drop only use of `get_locked_file_path()`
reftable/system: provide thin wrapper for tempfile subsystem
reftable/stack: stop using `fsync_component()` directly
reftable/system: stop depending on "hash.h"
reftable: explicitly handle hash format IDs
reftable/system: move "dir.h" to its only user
Loosen overly strict ownership check introduced in the recent past,
to keep the promise "cloning a suspicious repository is a safe
first step to inspect it".
* bc/allow-upload-pack-from-other-people:
Allow cloning from repositories owned by another user
Junio C Hamano [Wed, 4 Dec 2024 01:14:49 +0000 (10:14 +0900)]
Merge branch 'ja/git-diff-doc-markup'
Documentation mark-up updates.
* ja/git-diff-doc-markup:
doc: git-diff: apply format changes to config part
doc: git-diff: apply format changes to diff-generate-patch
doc: git-diff: apply format changes to diff-format
doc: git-diff: apply format changes to diff-options
doc: git-diff: apply new documentation guidelines
Junio C Hamano [Wed, 4 Dec 2024 01:14:42 +0000 (10:14 +0900)]
Merge branch 'sj/ref-contents-check'
"git fsck" learned to issue warnings on "curiously formatted" ref
contents that have always been taken valid but something Git
wouldn't have written itself (e.g., missing terminating end-of-line
after the full object name).
* sj/ref-contents-check:
ref: add symlink ref content check for files backend
ref: check whether the target of the symref is a ref
ref: add basic symref content check for files backend
ref: add more strict checks for regular refs
ref: port git-fsck(1) regular refs check for files backend
ref: support multiple worktrees check for refs
ref: initialize ref name outside of check functions
ref: check the full refname instead of basename
ref: initialize "fsck_ref_report" with zero
Junio C Hamano [Wed, 4 Dec 2024 01:14:41 +0000 (10:14 +0900)]
Merge branch 'ps/ref-backend-migration-optim'
The migration procedure between two ref backends has been optimized.
* ps/ref-backend-migration-optim:
reftable: rename scratch buffer
refs: adapt `initial_transaction` flag to be unsigned
reftable/block: optimize allocations by using scratch buffer
reftable/block: rename `block_writer::buf` variable
reftable/writer: optimize allocations by using a scratch buffer
refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG`
refs: skip collision checks in initial transactions
refs: use "initial" transaction semantics to migrate refs
refs/files: support symbolic and root refs in initial transaction
refs: introduce "initial" transaction flag
refs/files: move logic to commit initial transaction
refs: allow passing flags when setting up a transaction
Junio C Hamano [Wed, 4 Dec 2024 01:14:38 +0000 (10:14 +0900)]
Merge branch 'ps/leakfixes-part-10'
Leakfixes.
* ps/leakfixes-part-10: (27 commits)
t: remove TEST_PASSES_SANITIZE_LEAK annotations
test-lib: unconditionally enable leak checking
t: remove unneeded !SANITIZE_LEAK prerequisites
t: mark some tests as leak free
t5601: work around leak sanitizer issue
git-compat-util: drop now-unused `UNLEAK()` macro
global: drop `UNLEAK()` annotation
t/helper: fix leaking commit graph in "read-graph" subcommand
builtin/branch: fix leaking sorting options
builtin/init-db: fix leaking directory paths
builtin/help: fix leaks in `check_git_cmd()`
help: fix leaking return value from `help_unknown_cmd()`
help: fix leaking `struct cmdnames`
help: refactor to not use globals for reading config
builtin/sparse-checkout: fix leaking sanitized patterns
split-index: fix memory leak in `move_cache_to_base_index()`
git: refactor builtin handling to use a `struct strvec`
git: refactor alias handling to use a `struct strvec`
strvec: introduce new `strvec_splice()` function
line-log: fix leak when rewriting commit parents
...
Junio C Hamano [Wed, 4 Dec 2024 01:14:37 +0000 (10:14 +0900)]
Merge branch 'ps/gc-stale-lock-warning'
Give a bit of advice/hint message when "git maintenance" stops finding a
lock file left by another instance that still is potentially running.
* ps/gc-stale-lock-warning:
t7900: fix host-dependent behaviour when testing git-maintenance(1)
builtin/gc: provide hint when maintenance hits a stale schedule lock
Jeff King [Tue, 3 Dec 2024 21:06:52 +0000 (16:06 -0500)]
t9300: test verification of renamed paths
Commit da91a90c2f (fast-import: disallow more path components,
2024-11-30) added two separate verify_path() calls (one for
added/modified files, and one for renames/copies). But our tests only
exercise the first one. Let's protect ourselves against regressions by
tweaking one of the tests to rename into the bad path. There are
adjacent tests that will stay as additions, so now both calls are
covered.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/git-update-ref.txt: add missing word
Add missing word “that” in the phrase “after verifying that”, like
what was done in 1b2dfb70504 (Documentation/git-update-ref.txt: drop
“flag”, 2024-10-21)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Sat, 30 Nov 2024 01:09:29 +0000 (01:09 +0000)]
fast-import: disallow more path components
Instead of just disallowing '.' and '..', make use of verify_path() to
ensure that fast-import will disallow anything we wouldn't allow into
the index, such as anything under .git/, .gitmodules as a symlink, or
a dos drive prefix on Windows.
Since a few fast-export and fast-import tests that tried to stress-test
the correct handling of quoting relied on filenames that fail
is_valid_win32_path(), such as spaces or periods at the end of filenames
or backslashes within the filename, turn off core.protectNTFS for those
tests to ensure they keep passing.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 27 Nov 2024 13:23:45 +0000 (22:23 +0900)]
CodingGuidelines: a handful of error message guidelines
It is more efficient to have something in the coding guidelines
document to point at, when we want to review and comment on a new
message in the codebase to make sure it "fits" in the set of
existing messages.
Let's write down established best practice we are aware of.
Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Wed, 27 Nov 2024 23:33:12 +0000 (17:33 -0600)]
transport: propagate fsck configuration during bundle fetch
When fetching directly from a bundle, fsck message severity
configuration is not propagated to the underlying git-index-pack(1). It
is only capable of enabling or disabling fsck checks entirely. This does
not align with the fsck behavior for fetches through git-fetch-pack(1).
Use the fsck config parsing from fetch-pack to populate fsck message
severity configuration and wire it through to `unbundle()` to enable the
same fsck verification as done through fetch-pack.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Wed, 27 Nov 2024 23:33:11 +0000 (17:33 -0600)]
fetch-pack: split out fsck config parsing
When `fetch_pack_config()` is invoked, fetch-pack configuration is
parsed from the config. As part of this operation, fsck message severity
configuration is assigned to the `fsck_msg_types` global variable. This
is optionally used to configure the downstream git-index-pack(1) when
the `--strict` option is specified.
The same parsed fsck message severity configuration is also needed
outside of fetch-pack. Instead of exposing/relying on the existing
global state, split out the fsck config parsing logic into
`fetch_pack_fsck_config()` and expose it. In a subsequent commit, this
is used to provide fsck configuration when invoking `unbundle()`.
For `fetch_pack_fsck_config()` to discern between errors and unhandled
config variables, the return code when `git_config_path()` errors is
changed to a different value also indicating success. This frees up the
previous return code to now indicate the provided config variable
was unhandled. The behavior remains functionally the same.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Wed, 27 Nov 2024 23:33:10 +0000 (17:33 -0600)]
bundle: support fsck message configuration
If the `VERIFY_BUNDLE_FLAG` is set during `unbundle()`, the
git-index-pack(1) spawned is configured with the `--fsck-options` flag
to perform fsck verification. With this flag enabled, there is not a way
to configure fsck message severity though.
Extend the `unbundle_opts` type to store fsck message severity
configuration and update `unbundle()` to conditionally append it to the
`--fsck-objects` flag if provided. This enables `unbundle()` call sites
to support optionally setting the severity for specific fsck messages.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler [Wed, 27 Nov 2024 23:33:09 +0000 (17:33 -0600)]
bundle: add bundle verification options type
When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.
Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 26 Nov 2024 22:57:06 +0000 (07:57 +0900)]
Merge branch 'jk/gcc15'
GCC 15 compatibility updates.
* jk/gcc15:
object-file: inline empty tree and blob literals
object-file: treat cached_object values as const
object-file: drop oid field from find_cached_object() return value
object-file: move empty_tree struct into find_cached_object()
object-file: drop confusing oid initializer of empty_tree struct
object-file: prefer array-of-bytes initializer for hash literals
Junio C Hamano [Tue, 26 Nov 2024 22:57:04 +0000 (07:57 +0900)]
Merge branch 'ps/clar-build-improvement'
Fix for clar unit tests to support CMake build.
* ps/clar-build-improvement:
Makefile: let clar header targets depend on their scripts
cmake: use verbatim arguments when invoking clar commands
cmake: use SH_EXE to execute clar scripts
t/unit-tests: convert "clar-generate.awk" into a shell script
Junio C Hamano [Tue, 26 Nov 2024 22:57:03 +0000 (07:57 +0900)]
Merge branch 'kh/bundle-docs'
Documentation for "git bundle" saw improvements to more prominently
call out the use of '--all' when creating bundles.
* kh/bundle-docs:
Documentation/git-bundle.txt: discuss naïve backups
Documentation/git-bundle.txt: mention --all in spec. refs
Documentation/git-bundle.txt: remove old `--all` example
Documentation/git-bundle.txt: mention full backup example
shejialuo [Tue, 26 Nov 2024 14:40:57 +0000 (22:40 +0800)]
ref-cache: fix invalid free operation in `free_ref_entry`
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.
However, this is a bad usage. Because when ref entry is either directory
or loose ref, we will always execute the following statement:
free(entry->u.value.referent);
This does not make sense. We should never access the "entry->u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analysis why.
The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.
To fix this issue, we now ensure that "free(entry->u.value.referent)" is
only called when "entry->flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading references the reftable backend has to:
1. Create a new ref iterator.
2. Seek the iterator to the record we're searching for.
3. Read the record.
We cannot really avoid the last two steps, but re-creating the iterator
every single time we want to read a reference is kind of expensive and a
waste of resources. We couldn't help it in the past though because it
was not possible to reuse iterators. But starting with 5bf96e0c39
(reftable/generic: move seeking of records into the iterator,
2024-05-13) we have split up the iterator lifecycle such that creating
the iterator and seeking are two different concerns.
Refactor the code such that we cache iterators in the reftable backend.
This cache is invalidated whenever the respective stack is reloaded such
that we know to recreate the iterator in that case. This leads to a
sizeable speedup when creating many refs, which requires a lot of random
reference reads:
Benchmark 1: update-ref: create many refs (refcount = 100000, revision = master)
Time (mean ± σ): 1.793 s ± 0.010 s [User: 0.954 s, System: 0.835 s]
Range (min … max): 1.781 s … 1.811 s 10 runs
Benchmark 2: update-ref: create many refs (refcount = 100000, revision = HEAD)
Time (mean ± σ): 1.680 s ± 0.013 s [User: 0.846 s, System: 0.831 s]
Range (min … max): 1.664 s … 1.702 s 10 runs
Summary
update-ref: create many refs (refcount = 100000, revision = HEAD) ran
1.07 ± 0.01 times faster than update-ref: create many refs (refcount = 100000, revision = master)
While 7% is not a huge win, you have to consider that the benchmark is
_writing_ data, so _reading_ references is only one part of what we do.
Flame graphs show that we spend around 40% of our time reading refs, so
the speedup when reading refs is approximately ~2.5x that. I could not
find better benchmarks where we perform a lot of random ref reads.
You can also see a sizeable impact on memory usage when creating 100k
references. Before this change:
HEAP SUMMARY:
in use at exit: 19,112,538 bytes in 200,170 blocks
total heap usage: 8,400,426 allocs, 8,200,256 frees, 454,367,048 bytes allocated
After this change:
HEAP SUMMARY:
in use at exit: 674,416 bytes in 169 blocks
total heap usage: 7,929,872 allocs, 7,929,703 frees, 281,509,985 bytes allocated
As an additional factor, this refactoring opens up the possibility for
more performance optimizations in how we re-seek iterators. Any change
that allows us to optimize re-seeking by e.g. reusing data structures
would thus also directly speed up random reads.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5bf96e0c39 (reftable/generic: move seeking of records into the
iterator, 2024-05-13) we have refactored the reftable codebase such that
iterators can be initialized once and then re-seeked multiple times.
This feature is used by 1869525066 (refs/reftable: wire up support for
exclude patterns, 2024-09-16) in order to skip records based on exclude
patterns provided by the caller.
The logic to re-seek the merged iterator is insufficient though because
we don't drain the priority queue on a re-seek. This means that the
queue may contain stale entries and thus reading the next record in the
queue will return the wrong entry. While this is an obvious bug, it is
harmless in the context of above exclude patterns:
- If the queue contained stale entries that match the pattern then the
caller would already know to filter out such refs. This is because
our codebase is prepared to handle backends that don't have a way to
efficiently implement exclude patterns.
- If the queue contained stale entries that don't match the pattern
we'd eventually filter out any duplicates. This is because the
reftable code discards items with the same ref name and sorts any
remaining entries properly.
So things happen to work in this context regardless of the bug, and
there is no other use case yet where we re-seek iterators. We're about
to introduce a caching mechanism though where iterators are reused by
the reftable backend, and that will expose the bug.
Fix the issue by draining the priority queue when seeking and add a
testcase that surfaces the issue.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: add mechanism to notify callers on reload
Reftable stacks are reloaded in two cases:
- When calling `reftable_stack_reload()`, if the stat-cache tells us
that the stack has been modified.
- When committing a reftable addition.
While callers can figure out the second case, they do not have a
mechanism to figure out whether `reftable_stack_reload()` led to an
actual reload of the on-disk data. All they can do is thus to assume
that data is always being reloaded in that case.
Improve the situation by introducing a new `on_reload()` callback to the
reftable options. If provided, the function will be invoked every time
the stack has indeed been reloaded. This allows callers to invalidate
data that depends on the current stack data.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: refactor reflog expiry to use reftable backend
Refactor the callback function that expires reflog entries in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: refactor reading symbolic refs to use reftable backend
Refactor the callback function that reads symbolic references in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: read references via `struct reftable_backend`
Refactor `read_ref_without_reload()` to accept `struct reftable_backend`
as parameter instead of `struct reftable_stack`. Rename the function to
`reftable_backend_read_ref()` to clarify its scope and move it close to
other functions operating on `struct reftable_backend`.
This change allows us to implement an additional caching layer when
reading refs where we can reuse reftable iterators.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: figure out hash via `reftable_stack`
The function `read_ref_without_reload()` accepts a ref store as input
only so that we can figure out the hash function used by it. This is
duplicate information though because the reftable stack knows about its
hash function, too.
Drop the superfluous parameter to simplify the calling convention a bit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: handle reloading stacks in the reftable backend
When accessing a stack we almost always have to reload the stack before
reading data from it. This is mostly because Git does not have a
notification mechanism for when underlying data has been changed, and
thus we are forced to opportunistically reload the stack every single
time to account for any changes that may have happened concurrently.
Handle the reload internally in `backend_for()`. For one this forces
callsites to think about whether or not they need to reload the stack.
But second this makes the logic to access stacks more self-contained by
letting the `struct reftable_backend` manage themselves.
Update callsites where we don't reload the stack to document why we
don't. In some cases it's unclear whether it is the right thing to do in
the first place, but fixing that is outside of the scope of this patch
series.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable ref store needs to keep track of multiple stacks, one for
the main worktree and an arbitrary number of stacks for worktrees. This
is done by storing pointers to `struct reftable_stack`, which we then
access directly.
Wrap the stack in a new `struct reftable_backend`. This will allow us to
attach more data to each respective stack in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Mon, 25 Nov 2024 14:55:30 +0000 (15:55 +0100)]
builtin: pass repository to sub commands
In 9b1cb5070f (builtin: add a repository parameter for builtin
functions, 2024-09-13) the repository was passed down to all builtin
commands. This allowed the repository to be passed down to lower layers
without depending on the global `the_repository` variable.
Continue this work by also passing down the repository parameter from
the command to sub-commands. This will help pass down the repository to
other subsystems and cleanup usage of global variables like
'the_repository' and 'the_hash_algo'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Mon, 25 Nov 2024 19:00:48 +0000 (19:00 +0000)]
fast-import: disallow "." and ".." path components
If a user specified e.g.
M 100644 :1 ../some-file
then fast-import previously would happily create a git history where
there is a tree in the top-level directory named "..", and with a file
inside that directory named "some-file". The top-level ".." directory
causes problems. While git checkout will die with errors and fsck will
report hasDotdot problems, the user is going to have problems trying to
remove the problematic file. Simply avoid creating this bad history in
the first place.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bisect: address Coverity warning about potential double free
Coverity has started to warn about a potential double-free in
`find_bisection()`. This warning is triggered because we may modify the
list head of the passed-in `commit_list` in case it is an UNINTERESTING
commit, but still call `free_commit_list()` on the original variable
that points to the now-freed head in case where `do_find_bisection()`
returns a `NULL` pointer.
As far as I can see, this double free cannot happen in practice, as
`do_find_bisection()` only returns a `NULL` pointer when it was passed a
`NULL` input. So in order to trigger the double free we would have to
call `find_bisection()` with a commit list that only consists of
UNINTERESTING commits, but I have not been able to construct a case
where that happens.
Drop the `else` branch entirely as it seems to be a no-op anyway.
Another option might be to instead call `free_commit_list()` on `list`,
which is the modified version of `commit_list` and thus wouldn't cause a
double free. But as mentioned, I couldn't come up with any case where a
passed-in non-NULL list becomes empty, so this shouldn't be necessary.
And if it ever does become necessary we'd notice anyway via the leak
sanitizer.
Interestingly enough we did not have a single test exercising this
branch: all tests pass just fine even when replacing it with a call to
`BUG()`. Add a test that exercises it.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 26 Nov 2024 01:21:58 +0000 (10:21 +0900)]
Merge branch 'ps/leakfixes-part-10' into ps/bisect-double-free-fix
* ps/leakfixes-part-10: (27 commits)
t: remove TEST_PASSES_SANITIZE_LEAK annotations
test-lib: unconditionally enable leak checking
t: remove unneeded !SANITIZE_LEAK prerequisites
t: mark some tests as leak free
t5601: work around leak sanitizer issue
git-compat-util: drop now-unused `UNLEAK()` macro
global: drop `UNLEAK()` annotation
t/helper: fix leaking commit graph in "read-graph" subcommand
builtin/branch: fix leaking sorting options
builtin/init-db: fix leaking directory paths
builtin/help: fix leaks in `check_git_cmd()`
help: fix leaking return value from `help_unknown_cmd()`
help: fix leaking `struct cmdnames`
help: refactor to not use globals for reading config
builtin/sparse-checkout: fix leaking sanitized patterns
split-index: fix memory leak in `move_cache_to_base_index()`
git: refactor builtin handling to use a `struct strvec`
git: refactor alias handling to use a `struct strvec`
strvec: introduce new `strvec_splice()` function
line-log: fix leak when rewriting commit parents
...
This says that hash2 and hash3 should be squashed into hash1 and
that hash3’s commit message should be used for the resulting commit.
So the user is presented with an editor where the two first commit
messages are commented out and the third is not. However this does
not work if `core.commentChar`/`core.commentString` is in use since
the comment char is hardcoded (#) in this `sequencer.c` function.
As a result the first commit message will not be commented out.
sequencer: comment `--reference` subject line properly
`git revert --reference <commit>` leaves behind a comment in the
first line:[1]
# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***
Meaning that the commit will just consist of the next line if the user
exits the editor directly:
This reverts commit <--format=reference commit>
But the comment char here is hardcoded (#). Which means that the
comment line will inadvertently be included in the commit message if
`core.commentChar`/`core.commentString` is in use.
† 1: See 43966ab3156 (revert: optionally refer to commit in the
"reference" format, 2022-05-26)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git rebase --update-ref` does not insert commands for dependent/sub-
branches which are checked out.[1] Instead it leaves a comment about
that fact. The comment char is hardcoded (#). In turn the comment
line gets interpreted as an invalid command when `core.commentChar`/
`core.commentString` is in use.
† 1: See 900b50c242 (rebase: add --update-refs option, 2022-07-19)
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both `struct block_writer` and `struct reftable_writer` have a `buf`
member that is being reused to optimize the number of allocations.
Rename the variable to `scratch` to clarify its intend and provide a
comment explaining why it exists.
Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: adapt `initial_transaction` flag to be unsigned
The `initial_transaction` flag is tracked as a signed integer, but we
typically pass around flags via unsigned integers. Adapt the type
accordingly.
Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7900: fix host-dependent behaviour when testing git-maintenance(1)
We have recently added a new test to t7900 that exercises whether
git-maintenance(1) fails as expected when the "schedule.lock" file
exists. The test depends on whether or not the host has the required
executables present to schedule maintenance tasks in the first place,
like systemd or launchctl -- if not, the test fails with an unrelated
error before even checking for the lock file. This fails for example in
our CI systems, where macOS images do not have launchctl available.
Fix this issue by creating a stub systemctl(1) binary and using the
systemd scheduler.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 25 Nov 2024 03:14:01 +0000 (12:14 +0900)]
doc: option value may be separate for valid reasons
Even though `git help cli` recommends users to prefer using
"--option=value" over "--option value", there can be reasons why
giving them separately is a good idea. One reason is that shells do
not perform tilde expansion for `--option=~/path/name` but they
expand `--options ~/path/name` just fine.
This is not a problem for many options whose option parsing is
properly written using OPT_FILENAME(), because the value given to
OPT_FILENAME() is tilde-expanded internally by us, but some commands
take a pathname as a mere string, which needs this trick to have the
shell help us.
I think the reason we originally decided to recommend the stuck form
was because an option that takes an optional value requires you to
use it in the stuck form, and it is one less thing for users to
worry about if they get into the habit to always use the stuck form.
But we should be discouraging ourselves from adding an option with
an optional value in the first place, and we might want to weaken
the current recommendation.
In any case, let's describe this one case where it is necessary to
use the separate form, with an example.
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 25 Nov 2024 03:29:39 +0000 (12:29 +0900)]
Merge branch 'jk/output-prefix-cleanup' into maint-2.47
Code clean-up.
* jk/output-prefix-cleanup:
diff: store graph prefix buf in git_graph struct
diff: return line_prefix directly when possible
diff: return const char from output_prefix callback
diff: drop line_prefix_length field
line-log: use diff_line_prefix() instead of custom helper
Junio C Hamano [Mon, 25 Nov 2024 03:20:42 +0000 (12:20 +0900)]
Merge branch 'master' of https://github.com/j6t/gitk into maint-2.47
* 'master' of https://github.com/j6t/gitk:
Makefile(s): avoid recipe prefix in conditional statements
doc: switch links to https
doc: update links to current pages
Philippe Blain [Fri, 22 Nov 2024 19:50:22 +0000 (19:50 +0000)]
git-difftool--helper.sh: exit upon initialize_merge_tool errors
Since the introduction of 'initialize_merge_tool' in de8dafbada
(mergetool: break setup_tool out into separate initialization function,
2021-02-09), any errors from this function are ignored in
git-difftool--helper.sh::launch_merge_tool, which is not the case for
its call in git-mergetool.sh::merge_file.
Despite the in-code comment, initialize_merge_tool (via its call to
setup_tool) does different checks than run_merge_tool, so it makes sense
to abort early if it encounters errors. Add exit calls if
initialize_merge_tool fails.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 22 Nov 2024 19:50:21 +0000 (19:50 +0000)]
git-mergetool--lib.sh: add error message for unknown tool variant
In setup_tool, we check if the given tool is a known variant of a tool,
and quietly return with an error if not. This leads to the following
invocation quietly failing:
git mergetool --tool=vimdiff4
Add an error message before returning in this case.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 22 Nov 2024 19:50:20 +0000 (19:50 +0000)]
git-mergetool--lib.sh: add error message if 'setup_user_tool' fails
In git-mergetool--lib.sh::setup_tool, we check if the given tool is a
known builtin tool, a known variant, or a user-defined tool by calling
setup_user_tool, and we return with the exit code from setup_user_tool
if it was called. setup_user_tool checks if {diff,merge}tool.$tool.cmd
is set and quietly returns with an error if not.
This leads to the following invocation quietly failing:
git mergetool --tool=unknown
which is not very user-friendly. Adjust setup_tool to output an error
message before returning if setup_user_tool returned with an error.
Note that we do not check the result of the second call to
setup_user_tool in setup_tool, as this call is only meant to allow users
to redefine 'cmd' for a builtin tool; it is not an error if they have
not done so.
Note that this behaviour of quietly failing is a regression dating back
to de8dafbada (mergetool: break setup_tool out into separate
initialization function, 2021-02-09), as before this commit an unknown
mergetool would be diagnosed in get_merge_tool_path when called from
run_merge_tool.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 22 Nov 2024 19:50:19 +0000 (19:50 +0000)]
git-mergetool--lib.sh: use TOOL_MODE when erroring about unknown tool
In git-mergetool--lib.sh::get_merge_tool_path, we check if the chosen
tool is valid via valid_tool and exit with an error message if not. This
error message mentions "Unknown merge tool", even if the command the
user tried was 'git difftool --tool=unknown'. Use the global 'TOOL_MODE'
variable for a more correct error message.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 8 Apr 2024 15:51:44 +0000 (11:51 -0400)]
Makefile(s): avoid recipe prefix in conditional statements
In GNU Make commit 07fcee35 ([SV 64815] Recipe lines cannot contain
conditional statements, 2023-05-22) and following, conditional
statements may no longer be preceded by a tab character (which Make
refers to as the recipe prefix).
There are a handful of spots in our various Makefile(s) which will break
in a future release of Make containing 07fcee35. For instance, trying to
compile the pre-image of this patch with the tip of make.git results in
the following:
$ make -v | head -1 && make
GNU Make 4.4.90
config.mak.uname:842: *** missing 'endif'. Stop.
The kernel addressed this issue in 82175d1f9430 (kbuild: Replace tabs
with spaces when followed by conditionals, 2024-01-28). Address the
issues in Git's tree by applying the same strategy.
When a conditional word (ifeq, ifneq, ifdef, etc.) is preceded by one or
more tab characters, replace each tab character with 8 space characters
with the following:
The "unless /\\$/" removes any false-positives (like "\telse \"
appearing within a shell script as part of a recipe).
After doing so, Git compiles on newer versions of Make:
$ make -v | head -1 && make
GNU Make 4.4.90
GIT_VERSION = 2.44.0.414.gfac1dc44ca9
[...]
$ echo $?
0
Reported-by: Dario Gjorgjevski <dario.gjorgjevski@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 728b9ac0c3b93aaa4ea80280c591deb198051785 Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Josh Soref [Fri, 24 Nov 2023 03:35:13 +0000 (03:35 +0000)]
doc: switch links to https
These sites offer https versions of their content.
Using the https versions provides some protection for users.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: d05b08cd52cfda627f1d865bdfe6040a2c9521b5 Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Josh Soref [Fri, 24 Nov 2023 03:35:12 +0000 (03:35 +0000)]
doc: update links to current pages
It's somewhat traditional to respect sites' self-identification.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 65175d9ea26bebeb9d69977d0e75efc0e88dbced Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Junio C Hamano [Fri, 22 Nov 2024 05:34:17 +0000 (14:34 +0900)]
Merge branch 'jk/fetch-prefetch-double-free-fix'
Double-free fix.
* jk/fetch-prefetch-double-free-fix:
refspec: store raw refspecs inside refspec_item
refspec: drop separate raw_nr count
fetch: adjust refspec->raw_nr when filtering prefetch refspecs
Taylor Blau [Thu, 21 Nov 2024 20:29:24 +0000 (15:29 -0500)]
t/perf: use 'test_file_size' in more places
The perf test suite prefers to use test_file_size over 'wc -c' when
inside of a test_size block. One advantage is that accidentally writign
"wc -c file" (instead of "wc -c <file") does not inadvertently break the
tests (since the former will include the filename in the output of wc).
Both of the two uses of test_size use "wc -c", but let's convert those
to the more conventional test_file_size helper instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 21 Nov 2024 22:50:43 +0000 (17:50 -0500)]
pack-bitmap.c: typofix in `find_boundary_objects()`
In the boundary-based bitmap traversal, we use the given 'rev_info'
structure to first do a commit-only walk in order to determine the
boundary between interesting and uninteresting objects.
That walk only looks at commit objects, regardless of the state of
revs->blob_objects, revs->tree_objects, and so on. In order to do this,
we store the state of these variables in temporary fields before
setting them back to zero, performing the traversal, and then setting
them back.
But there is a typo here that dates back to b0afdce5da (pack-bitmap.c:
use commit boundary during bitmap traversal, 2023-05-08), where we
incorrectly store the value of the "tags" field as "revs->blob_objects".
This could lead to problems later on if, say, the caller wants tag
objects but *not* blob objects. In the pre-image behavior, we'd set
revs->tag_objects back to the old value of revs->blob_objects, thus
emitting fewer objects than expected back to the caller.
Fix that by correctly assigning the value of 'revs->tag_objects' to the
'tmp_tags' field.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there
is no longer a need to have that variable declared in all of our tests.
Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over the last two releases we have plugged a couple hundred of memory
leaks exposed by the Git test suite. With the preceding commits we have
finally fixed the last leak exposed by our test suite, which means that
we are now basically leak free wherever we have branch coverage.
From hereon, the Git test suite should ideally stay free of memory
leaks. Most importantly, any test suite that is being added should
automatically be subject to the leak checker, and if that test does not
pass it is a strong signal that the added code introduced new memory
leaks and should not be accepted without further changes.
Drop the infrastructure around TEST_PASSES_SANITIZE_LEAK to reflect this
new requirement. Like this, all test suites will be subject to the leak
checker by default.
This is being intentionally strict, but we still have an escape hatch:
the SANITIZE_LEAK prerequisite. There is one known case in t5601 where
the leak sanitizer itself is buggy, so adding this prereq in such cases
is acceptable. Another acceptable situation is when a newly added test
uncovers preexisting memory leaks: when fixing that memory leak would be
sufficiently complicated it is fine to annotate and document the leak
accordingly. But in any case, the burden is now on the patch author to
explain why exactly they have to add the SANITIZE_LEAK prerequisite.
The TEST_PASSES_SANITIZE_LEAK annotations will be dropped in the next
patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a couple of !SANITIZE_LEAK prerequisites for tests that used to
fail due to memory leaks. These have all been fixed by now, so let's
drop the prerequisite.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>