* ps/reftable-reflog-iteration-perf:
refs/reftable: track last log record name via strbuf
reftable/record: use scratch buffer when decoding records
reftable/record: reuse message when decoding log records
reftable/record: reuse refnames when decoding log records
reftable/record: avoid copying author info
reftable/record: convert old and new object IDs to arrays
refs/reftable: reload correct stack when creating reflog iter
Junio C Hamano [Thu, 21 Mar 2024 21:55:13 +0000 (14:55 -0700)]
Merge branch 'jc/safe-implicit-bare'
Users with safe.bareRepository=explicit can still work from within
$GIT_DIR of a seconary worktree (which resides at .git/worktrees/$name/)
of the primary worktree without explicitly specifying the $GIT_DIR
environment variable or the --git-dir=<path> option.
* jc/safe-implicit-bare:
setup: notice more types of implicit bare repositories
Junio C Hamano [Thu, 21 Mar 2024 21:55:12 +0000 (14:55 -0700)]
Merge branch 'ps/reftable-block-search-fix'
The reftable code has its own custom binary search function whose
comparison callback has an unusual interface, which caused the
binary search to degenerate into a linear search, which has been
corrected.
* ps/reftable-block-search-fix:
reftable/block: fix binary search over restart counter
reftable/record: fix memory leak when decoding object records
Junio C Hamano [Thu, 21 Mar 2024 21:55:12 +0000 (14:55 -0700)]
Merge branch 'ps/reftable-stack-tempfile'
The code in reftable backend that creates new table files works
better with the tempfile framework to avoid leaving cruft after a
failure.
* ps/reftable-stack-tempfile:
reftable/stack: register compacted tables as tempfiles
reftable/stack: register lockfiles during compaction
reftable/stack: register new tables as tempfiles
lockfile: report when rollback fails
* jh/trace2-missing-def-param-fix:
trace2: emit 'def_param' set with 'cmd_name' event
trace2: avoid emitting 'def_param' set more than once
t0211: demonstrate missing 'def_param' events for certain commands
Beat Bolli [Fri, 15 Mar 2024 19:45:58 +0000 (20:45 +0100)]
doc: avoid redundant use of cat
The update-hook-example.txt script uses this anti-pattern twice. Call grep
with the input file name directy. While at it, merge the two consecutive
grep calls.
Signed-off-by: Beat Bolli <dev+git@drbeat.li> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jiamu Sun [Thu, 14 Mar 2024 04:00:16 +0000 (04:00 +0000)]
bugreport.c: fix a crash in `git bugreport` with `--no-suffix` option
`git bugreport` does not complain when `--no-suffix` is given, but
it leads to a segmentation fault as the it is not prepared to see a
NULL assigned to the option_suffix variable.
Signed-off-by: Jiamu Sun <barroit@linux.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 15 Mar 2024 23:05:59 +0000 (16:05 -0700)]
Merge branch 'kh/branch-ref-syntax-advice'
When git refuses to create a branch because the proposed branch
name is not a valid refname, an advice message is given to refer
the user to exact naming rules.
* kh/branch-ref-syntax-advice:
branch: advise about ref syntax rules
advice: use double quotes for regular quoting
advice: use backticks for verbatim
advice: make all entries stylistically consistent
t3200: improve test style
John Cai [Fri, 15 Mar 2024 04:57:26 +0000 (04:57 +0000)]
t5300: fix test_with_bad_commit()
0f8edf7317 (index-pack: --fsck-objects to take an optional argument for
fsck msgs, 2024-02-01) added a test function test_with_bad_commit() that
contained two bugs. test_expect_fail was used instead of test_must_fail,
and a && was not included at the end of the line.
Fix these two issues in the test.
Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 14 Mar 2024 21:05:25 +0000 (14:05 -0700)]
Merge branch 'rj/complete-worktree-paths-fix'
The logic to complete the command line arguments to "git worktree"
subcommand (in contrib/) has been updated to correctly honor things
like "git -C dir" etc.
Junio C Hamano [Thu, 14 Mar 2024 21:05:24 +0000 (14:05 -0700)]
Merge branch 'jc/test-i18ngrep'
With release 2.44 we got rid of all uses of test_i18ngrep and there
is no in-flight topic that adds a new use of it. Make a call to
test_i18ngrep a hard failure, so that we can remove it at the end
of this release cycle.
* jc/test-i18ngrep:
test_i18ngrep: hard deprecate and forbid its use
Junio C Hamano [Thu, 14 Mar 2024 21:05:24 +0000 (14:05 -0700)]
Merge branch 'la/trailer-api'
Trailer API updates.
Acked-by: Christian Couder <christian.couder@gmail.com>
cf. <CAP8UFD1Zd+9q0z1JmfOf60S2vn5-sD3SafDvAJUzRFwHJKcb8A@mail.gmail.com>
* la/trailer-api:
format_trailers_from_commit(): indirectly call trailer_info_get()
format_trailer_info(): move "fast path" to caller
format_trailers(): use strbuf instead of FILE
trailer_info_get(): reorder parameters
trailer: move interpret_trailers() to interpret-trailers.c
trailer: reorder format_trailers_from_commit() parameters
trailer: rename functions to use 'trailer'
shortlog: add test for de-duplicating folded trailers
trailer: free trailer_info _after_ all related usage
Junio C Hamano [Thu, 14 Mar 2024 21:05:23 +0000 (14:05 -0700)]
Merge branch 'ps/reftable-iteration-perf-part2'
The code to iterate over refs with the reftable backend has seen
some optimization.
* ps/reftable-iteration-perf-part2:
refs/reftable: precompute prefix length
reftable: allow inlining of a few functions
reftable/record: decode keys in place
reftable/record: reuse refname when copying
reftable/record: reuse refname when decoding
reftable/merged: avoid duplicate pqueue emptiness check
reftable/merged: circumvent pqueue with single subiter
reftable/merged: handle subiter cleanup on close only
reftable/merged: remove unnecessary null check for subiters
reftable/merged: make subiters own their records
reftable/merged: advance subiter on subsequent iteration
reftable/merged: make `merged_iter` structure private
reftable/pq: use `size_t` to track iterator index
Rubén Justo [Thu, 14 Mar 2024 18:08:58 +0000 (19:08 +0100)]
checkout: plug some leaks in git-restore
In git-restore we need to free the pathspec and pathspec_from_file
values from the struct checkout_opts.
A simple fix could be to free them in cmd_restore, after the call to
checkout_main returns, like we are doing [1][2] in the sibling function
cmd_checkout.
However, we can do even better.
We have git-switch and git-restore, both of them spin-offs[3][4] of
git-checkout. All three are implemented as thin wrappers around
checkout_main. Considering this, it makes a lot of sense to do the
cleanup closer to checkout_main.
Move the cleanups, including the new_branch_info variable, to
checkout_main.
As a consequence, mark: t2070, t2071, t2072 and t6418 as leak-free.
Beat Bolli [Wed, 13 Mar 2024 22:54:23 +0000 (23:54 +0100)]
date: make "iso-strict" conforming for the UTC timezone
ISO 8601-1:2020-12 specifies that a zero timezone offset must be denoted
with a "Z" suffix instead of the numeric "+00:00". Add the correponding
special case to show_date() and a new test.
Changing an established output format which might be depended on by
scripts is always problematic, but here we choose to adhere more closely
to the published standard.
Reported-by: Michael Osipov <michael.osipov@innomotics.com> Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonas Wunderlich [Tue, 12 Mar 2024 21:34:11 +0000 (21:34 +0000)]
doc: status.showUntrackedFiles does not take "false"
The `status.showUntrackedFiles` config option only accepts the
values "no", "normal" or "all", but not as this part of the man page
suggested "false". While we are at it, camel-case the name of the
variable.
Signed-off-by: Jonas Wunderlich <git@03j.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 11 Mar 2024 21:12:30 +0000 (14:12 -0700)]
Merge branch 'rs/t-ctype-simplify'
Code simplification to one unit-test program.
* rs/t-ctype-simplify:
t-ctype: avoid duplicating class names
t-ctype: align output of i
t-ctype: simplify EOF check
t-ctype: allow NUL anywhere in the specification string
Junio C Hamano [Sat, 9 Mar 2024 23:27:09 +0000 (15:27 -0800)]
setup: notice more types of implicit bare repositories
Setting the safe.bareRepository configuration variable to explicit
stops git from using a bare repository, unless the repository is
explicitly specified, either by the "--git-dir=<path>" command line
option, or by exporting $GIT_DIR environment variable. This may be
a reasonable measure to safeguard users from accidentally straying
into a bare repository in unexpected places, but often gets in the
way of users who need valid accesses to the repository.
Earlier, 45bb9162 (setup: allow cwd=.git w/ bareRepository=explicit,
2024-01-20) loosened the rule such that being inside the ".git"
directory of a non-bare repository does not really count as
accessing a "bare" repository. The reason why such a loosening is
needed is because often hooks and third-party tools run from within
$GIT_DIR while working with a non-bare repository.
More importantly, the reason why this is safe is because a directory
whose contents look like that of a "bare" repository cannot be a
bare repository that came embedded within a checkout of a malicious
project, as long as its directory name is ".git", because ".git" is
not a name allowed for a directory in payload.
There are at least two other cases where tools have to work in a
bare-repository looking directory that is not an embedded bare
repository, and accesses to them are still not allowed by the recent
change.
- A secondary worktree (whose name is $name) has its $GIT_DIR
inside "worktrees/$name/" subdirectory of the $GIT_DIR of the
primary worktree of the same repository.
- A submodule worktree (whose name is $name) has its $GIT_DIR
inside "modules/$name/" subdirectory of the $GIT_DIR of its
superproject.
As long as the primary worktree or the superproject in these cases
are not bare, the pathname of these "looks like bare but not really"
directories will have "/.git/worktrees/" and "/.git/modules/" as a
substring in its leading part, and we can take advantage of the same
security guarantee allow git to work from these places.
Extend the earlier "in a directory called '.git' we are OK" logic
used for the primary worktree to also cover the secondary worktree's
and non-embedded submodule's $GIT_DIR, by moving the logic to a
helper function "is_implicit_bare_repo()". We deliberately exclude
secondary worktrees and submodules of a bare repository, as these
are exactly what safe.bareRepository=explicit setting is designed to
forbid accesses to without an explicit GIT_DIR/--git-dir=<path>
Helped-by: Kyle Lippincott <spectral@google.com> Helped-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Sun, 10 Mar 2024 20:04:56 +0000 (20:04 +0000)]
ci(github): make Windows test artifacts name unique
If several jobs in the windows-test or vs-test matrices fail, the
upload-artifact action in each job tries to upload the test directories
of the failed tests as "failed-tests-windows.zip", which fails for all
jobs except the one which finishes first with the following error:
Error: Failed to CreateArtifact: Received non-retryable error:
Failed request: (409) Conflict: an artifact with this name
already exists on the workflow run
Make the artifacts name unique by using the 'matrix.nr' token, and
disambiguate the vs-test artifacts from the windows-test ones.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-ort/merge-recursive: do report errors in `merge_submodule()`
In 24876ebf68b (commit-reach(repo_in_merge_bases_many): report missing
commits, 2024-02-28), I taught `merge_submodule()` to handle errors
reported by `repo_in_merge_bases_many()`.
However, those errors were not passed through to the callers. That was
unintentional, and this commit remedies that.
Note that `find_first_merges()` can now also return -1 (because it
passes through that return value from `repo_in_merge_bases()`), and this
commit also adds the forgotten handling for that scenario.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive: prepare for `merge_submodule()` to report errors
The `merge_submodule()` function returns an integer that indicates
whether the merge was clean (returning 1) or unclean (returning 0).
Like the version in `merge-ort.c`, the version in `merge-recursive.c`
does not report any errors (such as repository corruption) by returning
-1 as of time of writing, even if the callers in `merge-ort.c` are
prepared for exactly such errors.
However, we want to teach (both variants of) the `merge_submodule()`
function that trick: to report errors by returning -1. Therefore,
prepare the caller in `merge-recursive.c` to handle that scenario.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The upload-pack program, when talking over v2, accepted the
packfile-uris protocol extension from the client, even if it did
not advertise the capability, which has been corrected.
* jk/upload-pack-v2-capability-cleanup:
upload-pack: only accept packfile-uris if we advertised it
upload-pack: use existing config mechanism for advertisement
upload-pack: centralize setup of sideband-all config
upload-pack: use repository struct to get config
Junio C Hamano [Thu, 7 Mar 2024 23:59:42 +0000 (15:59 -0800)]
Merge branch 'jk/upload-pack-bounded-resources'
Various parts of upload-pack has been updated to bound the resource
consumption relative to the size of the repository to protect from
abusive clients.
* jk/upload-pack-bounded-resources:
upload-pack: free tree buffers after parsing
upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
upload-pack: always turn off save_commit_buffer
upload-pack: disallow object-info capability by default
upload-pack: accept only a single packfile-uri line
upload-pack: use a strmap for want-ref lines
upload-pack: use oidset for deepen_not list
upload-pack: switch deepen-not list to an oid_array
upload-pack: drop separate v2 "haves" array
A custom remote helper no longer cannot access the newly created
repository during "git clone", which is a regression in Git 2.44.
This has been corrected.
* ps/remote-helper-repo-initialization-fix:
builtin/clone: allow remote helpers to detect repo
"git log --merge" learned to pay attention to CHERRY_PICK_HEAD and
other kinds of *_HEAD pseudorefs.
* ml/log-merge-with-cherry-pick-and-other-pseudo-heads:
revision: implement `git log --merge` also for rebase/cherry-pick/revert
revision: ensure MERGE_HEAD is a ref in prepare_show_merge
Junio C Hamano [Thu, 7 Mar 2024 23:59:41 +0000 (15:59 -0800)]
Merge branch 'jt/commit-redundant-scissors-fix'
"git commit -v --cleanup=scissors" used to add the scissors line
twice in the log message buffer, which has been corrected.
* jt/commit-redundant-scissors-fix:
commit: unify logic to avoid multiple scissors lines when merging
commit: avoid redundant scissor line with --cleanup=scissors -v
Junio C Hamano [Thu, 7 Mar 2024 23:59:41 +0000 (15:59 -0800)]
Merge branch 'js/merge-tree-3-trees'
"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.
* js/merge-tree-3-trees:
fill_tree_descriptor(): mark error message for translation
cache-tree: avoid an unnecessary check
Always check `parse_tree*()`'s return value
t4301: verify that merge-tree fails on missing blob objects
merge-ort: do check `parse_tree()`'s return value
merge-tree: fail with a non-zero exit code on missing tree objects
merge-tree: accept 3 trees as arguments
reftable/block: fix binary search over restart counter
Records store their keys prefix-compressed. As many records will share a
common prefix (e.g. "refs/heads/"), this can end up saving quite a bit
of disk space. The downside of this is that it is not possible to just
seek into the middle of a block and consume the corresponding record
because it may depend on prefixes read from preceding records.
To help with this usecase, the reftable format writes every n'th record
without using prefix compression, which is called a "restart". The list
of restarts is stored at the end of each block so that a reader can
figure out entry points at which to read a full record without having to
read all preceding records.
This allows us to do a binary search over the records in a block when
searching for a particular key by iterating through the restarts until
we have found the section in which our record must be located. From
thereon we perform a linear search to locate the desired record.
This mechanism is broken though. In `block_reader_seek()` we call
`binsearch()` over the count of restarts in the current block. The
function we pass to compare records with each other computes the key at
the current index and then compares it to our search key by calling
`strbuf_cmp()`, returning its result directly. But `binsearch()` expects
us to return a truish value that indicates whether the current index is
smaller than the searched-for key. And unless our key exactly matches
the value at the restart counter we always end up returning a truish
value.
The consequence is that `binsearch()` essentially always returns 0,
indicacting to us that we must start searching right at the beginning of
the block. This works by chance because we now always do a linear scan
from the start of the block, and thus we would still end up finding the
desired record. But needless to say, this makes the optimization quite
useless.
Fix this bug by returning whether the current key is smaller than the
searched key. As the current behaviour was correct it is not possible to
write a test. Furthermore it is also not really possible to demonstrate
in a benchmark that this fix speeds up seeking records.
This may cause the reader to question whether this binary search makes
sense in the first place if it doesn't even help with performance. But
it would end up helping if we were to read a reftable with a much larger
block size. Blocks can be up to 16MB in size, in which case it will
become much more important to avoid the linear scan. We are not yet
ready to read or write such larger blocks though, so we have to live
without a benchmark demonstrating this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: fix memory leak when decoding object records
When decoding records it is customary to reuse a `struct
reftable_ref_record` across calls. Thus, it may happen that the record
already holds some allocated memory. When decoding ref and log records
we handle this by releasing or reallocating held memory. But we fail to
do this for object records, which causes us to leak memory.
Fix this memory leak by releasing object records before we decode into
them. We may eventually want to reuse memory instead to avoid needless
reallocations. But for now, let's just plug the leak and be done.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Florian Schmidt [Thu, 7 Mar 2024 18:37:38 +0000 (18:37 +0000)]
wt-status: don't find scissors line beyond buf len
If
(a) There is a "---" divider in a commit message,
(b) At some point beyond that divider, there is a cut-line (that is,
"# ------------------------ >8 ------------------------") in the
commit message,
(c) the user does not explicitly set the "no-divider" option,
then "git interpret-trailers" will hang indefinitively.
This is because when (a) is true, find_end_of_log_message() will invoke
ignored_log_message_bytes() with a len that is intended to make it
ignore the part of the commit message beyond the divider. However,
ignored_log_message_bytes() calls wt_status_locate_end(), and that
function ignores the length restriction when it tries to locate the cut
line. If it manages to find one, the returned cutoff value is greater
than len. At this point, ignored_log_message_bytes() goes into an
infinite loop, because it won't advance the string parsing beyond len,
but the exit condition expects to reach cutoff.
Make wt_status_locate_end() honor the length parameter passed in, to
fix this issue.
In general, if wt_status_locate_end() is given a piece of the memory
that lacks NUL at all, strstr() may continue across page boundaries
and run into an unmapped page. For our current callers, this is not
a problem, as all of them except one uses a memory owned by a strbuf
(which guarantees an implicit NUL-termination after its payload),
and the one exception in trailer.c:find_end_of_log_message() uses
strlen() to compute the length before calling this function.
Signed-off-by: Florian Schmidt <flosch@nutanix.com> Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
[jc: tweaked the commit log message and the implementation a bit] Signed-off-by: Junio C Hamano <gitster@pobox.com>