In order to follow the common manpage usage, the synopsis of the
commands needs to be heavily typeset. A first try was performed with
using native markup, but it turned out to make the document source
almost unreadable, difficult to write and prone to mistakes with
unwanted Asciidoc's role attributes.
In order to both simplify the writer's task and obtain a consistant
typesetting in the synopsis, a custom 'synopsis' paragraph type is
created and the processor for backticked text are modified. The
backends of asciidoc and asciidoctor take in charge to correctly add
the required typesetting.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: reload locked stack when preparing transaction
When starting a reftable transaction we lock all stacks we are about to
modify. While it may happen that the stack is out-of-date at this point
in time we don't really care: transactional updates encode the expected
state of a certain reference, so all that we really want to verify is
that the _current_ value matches that expected state.
Pass `REFTABLE_STACK_NEW_ADDITION_RELOAD` when locking the stack such
that an out-of-date stack will be reloaded after having been locked.
This change is safe because all verifications of the expected state
happen after this step anyway.
Add a testcase that verifies that many writers are now able to write to
the stack concurrently without failures and with a deterministic end
result.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `reftable_stack_new_addition()` we first lock the stack and then
check whether it is still up-to-date. If it is not we return an error to
the caller indicating that the stack is outdated.
This is overly restrictive in our ref transaction interface though: we
lock the stack right before we start to verify the transaction, so we do
not really care whether it is outdated or not. What we really want is
that the stack is up-to-date after it has been locked so that we can
verify queued updates against its current state while we know that it is
locked for concurrent modification.
Introduce a new flag `REFTABLE_STACK_NEW_ADDITION_RELOAD` that alters
the behaviour of `reftable_stack_init_addition()` in this case: when we
notice that it is out-of-date we reload it instead of returning an error
to the caller.
This logic will be wired up in the reftable backend in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.
The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.
Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.
Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: fix evaluating "onbranch" with nonexistent git dir
The `include_by_branch()` function is responsible for evaluating whether
or not a specific include should be pulled in based on the currently
checked out branch. Naturally, his condition can only be evaluated when
we have a properly initialized repository with a ref store in the first
place. This is why the function guards against the case when either
`data->repo` or `data->repo->gitdir` are `NULL` pointers.
But the second check is insufficient: the `gitdir` may be set even
though the repository has not been initialized. Quoting "setup.c":
NEEDSWORK: currently we allow bogus GIT_DIR values to be set in some
code paths so we also need to explicitly setup the environment if the
user has set GIT_DIR. It may be beneficial to disallow bogus GIT_DIR
values at some point in the future.
So when either the GIT_DIR environment variable or the `--git-dir`
global option are set by the user then `the_repository` may end up with
an initialized `gitdir` variable. And this happens even when the dir is
invalid, like for example when it doesn't exist. It follows that only
checking for whether or not `gitdir` is `NULL` is not sufficient for us
to determine whether the repository has been properly initialized.
This issue can lead to us triggering a BUG: when using a config with an
"includeIf.onbranch:" condition outside of a repository while using the
`--git-dir` option pointing to an invalid Git directory we may end up
trying to evaluate the condition even though the ref storage format has
not been set up.
This bisects to 173761e21b (setup: start tracking ref storage format,
2023-12-29), but that commit really only starts to surface the issue
that has already existed beforehand. The code to check for `gitdir` was
introduced via 85fe0e800c (config: work around bug with
includeif:onbranch and early config, 2019-07-31), which tried to fix
similar issues when we didn't yet have a repository set up. But the fix
was incomplete as it missed the described scenario.
As the quoted comment mentions, we'd ideally refactor the code to not
set up `gitdir` with an invalid value in the first place, but that may
be a bigger undertaking. Instead, refactor the code to use the ref
storage format as an indicator of whether or not the ref store has been
set up to fix the bug.
Reported-by: Ronan Pigott <ronan@rjp.ie> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a couple more tests for "onbranch" includes for several edge cases.
All tests except for the last one pass, so for the most part this change
really only aims to nail down behaviour of include conditionals further.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git sparse-checkout disable' with the sparse index
enabled, Git is expected to expand the index into a full index. However,
it currently outputs the advice message saying that that is unexpected
and likely due to an issue with the working directory.
Disable this advice message when in this code path. Establish a pattern
for doing a similar removal in the future.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 23 Sep 2024 17:35:09 +0000 (10:35 -0700)]
Merge branch 'jc/pass-repo-to-builtins'
The convention to calling into built-in command implementation has
been updated to pass the repository, if known, together with the
prefix value.
* jc/pass-repo-to-builtins:
add: pass in repo variable instead of global the_repository
builtin: remove USE_THE_REPOSITORY for those without the_repository
builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h
builtin: add a repository parameter for builtin functions
When a remote-helper dies before Git writes to it, SIGPIPE killed
Git silently. We now explain the situation a bit better to the end
user in our error message.
* jk/diag-unexpected-remote-helper-death:
print an error when remote helpers die during capabilities
Junio C Hamano [Mon, 23 Sep 2024 17:35:04 +0000 (10:35 -0700)]
Merge branch 'ps/environ-wo-the-repository'
Code clean-up.
* ps/environ-wo-the-repository: (21 commits)
environment: stop storing "core.notesRef" globally
environment: stop storing "core.warnAmbiguousRefs" globally
environment: stop storing "core.preferSymlinkRefs" globally
environment: stop storing "core.logAllRefUpdates" globally
refs: stop modifying global `log_all_ref_updates` variable
branch: stop modifying `log_all_ref_updates` variable
repo-settings: track defaults close to `struct repo_settings`
repo-settings: split out declarations into a standalone header
environment: guard state depending on a repository
environment: reorder header to split out `the_repository`-free section
environment: move `set_git_dir()` and related into setup layer
environment: make `get_git_namespace()` self-contained
environment: move object database functions into object layer
config: make dependency on repo in `read_early_config()` explicit
config: document `read_early_config()` and `read_very_early_config()`
environment: make `get_git_work_tree()` accept a repository
environment: make `get_graft_file()` accept a repository
environment: make `get_index_file()` accept a repository
environment: make `get_object_directory()` accept a repository
environment: make `get_git_common_dir()` accept a repository
...
Eric Sunshine [Mon, 23 Sep 2024 07:54:16 +0000 (03:54 -0400)]
worktree: repair copied repository and linked worktrees
For each linked worktree, Git maintains two pointers: (1)
<repo>/worktrees/<id>/gitdir which points at the linked worktree, and
(2) <worktree>/.git which points back at <repo>/worktrees/<id>. Both
pointers are absolute pathnames.
Aside from manually manipulating those raw files, it is possible to
easily "break" one or both pointers by ignoring the "git worktree move"
command and instead manually moving a linked worktree, moving the
repository, or moving both. The "git worktree repair" command was
invented to handle this case by restoring these pointers to sane values.
For the "repair" command, the "git worktree" manual page states:
Repair worktree administrative files, if possible, if they have
become corrupted or outdated due to external factors.
The "if possible" clause was chosen deliberately to convey that the
existing implementation may not be able to fix every possible breakage,
and to imply that improvements may be made to handle other types of
breakage.
A recent problem report[*] illustrates a case in which "git worktree
repair" not only fails to fix breakage, but actually causes breakage.
Specifically, if a repository / main-worktree and linked worktrees are
*copied* as a unit (rather than *moved*), then "git worktree repair" run
in the copy leaves the copy untouched but botches the pointers in the
original repository and the original worktrees.
if "orig" is copied (not moved) to "dup", then immediately after the
manual copy operation:
* orig/main/.git/worktrees/linked/gitdir points at orig/linked/.git
* orig/linked/.git points at orig/main/.git/worktrees/linked
* dup/main/.git/worktrees/linked/gitdir points at orig/linked/.git
* dup/linked/.git points at orig/main/.git/worktrees/linked
So, dup/main thinks its linked worktree is orig/linked, and worktree
dup/linked thinks its repository / main-worktree is orig/main.
"git worktree repair" is reasonably simple-minded; it wants to trust
valid-looking pointers, hence doesn't try to second-guess them. In this
case, when validating dup/linked/.git, it finds a legitimate repository
pointer, orig/main/.git/worktrees/linked, thus trusts that is correct,
but does notice that gitdir in that directory doesn't point at
dup/linked/.git, so it (incorrectly) _fixes_
orig/main/.git/worktrees/linked/gitdir to point at dup/linked/.git.
Similarly, when validating dup/main/.git/worktrees/linked/gitdir, it
finds a legitimate worktree pointer, orig/linked/.git, but notices that
its .git file doesn't point back at dup/main, thus (incorrectly) _fixes_
orig/linked/.git to point at dup/main/.git/worktrees/linked. Hence, it
has modified and broken the linkage between orig/main and orig/linked
rather than fixing dup/main and dup/linked as expected.
Fix this problem by also checking if a plausible .git/worktrees/<id>
exists in the *current* repository -- not just in the repository pointed
at by the worktree's .git file -- and comparing whether they are the
same. If not, then it is likely because the repository / main-worktree
and linked worktrees were copied, so prefer the discovered plausible
pointer rather than the one from the existing .git file.
Reported-by: Russell Stuart <russell+git.vger.kernel.org@stuart.id.au> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 22 Sep 2024 13:40:42 +0000 (15:40 +0200)]
commit-graph: remove unnecessary UNLEAK
When f4dbdfc4d5 (commit-graph: clean up leaked memory during write,
2018-10-03) added the UNLEAK, it was right before a call to die_errno(). e103f7276f (commit-graph: return with errors during write, 2019-06-12)
made it unnecessary, as it was then followed by a free() call for the
allocated string.
The code moved to write_commit_graph_file() in the meantime and the
string pointer is now part of a struct, but the function's only caller
still cleans up the allocation. Drop the superfluous UNLEAK.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 21 Sep 2024 20:23:39 +0000 (22:23 +0200)]
archive: load index before pathspec checks
git archive checks whether pathspec arguments match anything to avoid
surprises due to typos and later loads the index to get attributes.
This order was OK when these features were introduced by ba053ea96c
(archive: do not read .gitattributes in working directory, 2009-04-18)
and d5f53d6d6f (archive: complain about path specs that don't match
anything, 2009-12-12).
But when attribute matching was added to pathspec in b0db704652
(pathspec: allow querying for attributes, 2017-03-13), the pathspec
checker in git archive did not support it fully, because it lacks the
attributes from the index.
Load the index earlier, before the pathspec check, to support attr
pathspecs.
Reported-by: Ronan Pigott <ronan@rjp.ie> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 21 Sep 2024 15:09:54 +0000 (17:09 +0200)]
diff: report modified binary files as changes in builtin_diff()
The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents. The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines. It's enabled by the diff_options flag
diff_from_contents.
The code for handling binary files added by 1aaf69e669 (diff: shortcut
for diff'ing two binary SHA-1 objects, 2014-08-16) always uses a quick
hash-only comparison, even if the slow way is taken. We need it to
report a hash difference as a change for the purpose of setting the
exit code, though, but it never did. Fix that.
d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed. This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.
Reported-by: Kohei Shibata <shiba200712@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
scalar: configure maintenance during 'reconfigure'
The 'scalar reconfigure' command is intended to update registered repos
with the latest settings available. However, up to now we were not
reregistering the repos with background maintenance.
In particular, this meant that the background maintenance schedule would
not be updated if there are improvements between versions.
Be sure to register repos for maintenance during the reconfigure step.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the moment, some background jobs are getting blocked on credentials
during the 'prefetch' task. This leads to other tasks, such as
incremental repacks, getting blocked. Further, if a user manages to fix
their credentials, then they still need to cancel the background process
before their background maintenance can continue working.
Update the background schedules for our four scheduler integrations to
include these config options via '-c' options:
* 'credential.interactive=false' will stop Git and some credential
helpers from prompting in the UI (assuming the '-c' parameters are
carried through and respected by GCM).
* 'core.askPass=true' will replace the text fallback for a username
and password into the 'true' command, which will return a success in
its exit code, but Git will treat the empty string returned as an
invalid password and move on.
We can do some testing that the credentials are passed, at least in the
systemd case due to writing the service files.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When scripts or background maintenance wish to perform HTTP(S) requests,
there is a risk that our stored credentials might be invalid. At the
moment, this causes the credential helper to ping the user and block the
process. Even if the credential helper does not ping the user, Git falls
back to the 'askpass' method, which includes a direct ping to the user
via the terminal.
Even setting the 'core.askPass' config as something like 'echo' will
causes Git to fallback to a terminal prompt. It uses
git_terminal_prompt(), which finds the terminal from the environment and
ignores whether stdin has been redirected. This can also block the
process awaiting input.
Create a new config option to prevent user interaction, favoring a
failure to a blocked process.
The chosen name, 'credential.interactive', is taken from the config
option used by Git Credential Manager to already avoid user
interactivity, so there is already one credential helper that integrates
with this option. However, older versions of Git Credential Manager also
accepted other string values, including 'auto', 'never', and 'always'.
The modern use is to use a boolean value, but we should still be
careful that some users could have these non-booleans. Further, we
should respect 'never' the same as 'false'. This is respected by the
implementation and test, but not mentioned in the documentation.
The implementation for the Git interactions takes place within
credential_getpass(). The method prototype is modified to return an
'int' instead of 'void'. This allows us to detect that no attempt was
made to fill the given credential, changing the single caller slightly.
Also, a new trace2 region is added around the interactive portion of the
credential request. This provides a way to measure the amount of time
spent in that region for commands that _are_ interactive. It also makes
a conventient way to test that the config option works with
'test_region'.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fatal: failed to recurse into submodule $submodule
When "git submodule--helper" recurses into a submodule it creates a
child process. If that process fails then the error message above is
displayed by the parent. In the case above the child is killed by
SIGPIPE as "grep -q" exits as soon as it sees the first match. Fix this
by propagating SIGPIPE so that it is visible to the process running
git. We could propagate other signals but I'm not sure there is much
value in doing that. In the common case of the user pressing Ctrl-C or
Ctrl-\ then SIGINT or SIGQUIT will be sent to the foreground process
group and so the parent process will receive the same signal as the
child.
Reported-by: Matt Liberty <mliberty@precisioninno.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 20 Sep 2024 15:59:27 +0000 (08:59 -0700)]
The 19th batch
Merge the topics that have been cooking since 2024-09-13 or so in
'next'.
Let's try a new workflow to update the maintenance track by removing
the "merge ... later to maint" comments from the draft release notes
on the 'master' track.
Junio C Hamano [Fri, 20 Sep 2024 18:16:29 +0000 (11:16 -0700)]
Merge branch 'pw/rebase-autostash-fix'
"git rebase --autostash" failed to resurrect the autostashed
changes when the command gets aborted after giving back control
asking for hlep in conflict resolution.
* pw/rebase-autostash-fix:
rebase: apply and cleanup autostash when rebase fails to start
With the recent effort to make the test suite free of memory leaks we
now run a lot more of test suites with the leak-sanitizer enabled. While
we were originally only executing around 23000 tests, we're now at 30000
tests. Naturally, this has a significant impact on the runtime of such a
test run.
Naturally, this impact can also be felt for our leak-checking CI jobs.
While macOS used to be the slowest-executing job on GitLab CI with ~15
minutes of runtime, nowadays it is our leak checks which take around 45
to 55 minutes.
Our Linux runners for GitLab CI are untagged, which means that they
default to the "small" machine type with two CPU cores [1]. Upgrade
these to the "medium" runner, which provide four CPU cores and which
should thus provide a noticeable speedup.
In theory, we could upgrade to an ever larger machine than that. The
official mirror [2] has an Ultimate license, so we could get up to 128
cores. But anybody running a fork of the Git project without such a
license wouldn't be able to use those beefier machines and thus their
pipelines would fail.
cmake: generalize the handling of the `UNIT_TEST_OBJS` list
In a15d4465a991 (cmake: also build unit tests, 2023-09-25), I
accommodated the CMake definition. Seeing that a `UNIT_TEST_OBJS` list
was introduced that was built by transforming the `UNIT_TEST_PROGRAMS`
list and then adding a single, hard-coded file
("t/unit-tests/test-lib.c"), I decided to hard-code that in the CMake
definition, too.
The reason why I hard-coded it instead of imitating the
`parse_makefile_for_sources()` paradigm that was used elsewhere when
using the `Makefile` as source of truth for given lists of files: This
function expects _only_ hard-coded values, and that transformed
`UNIT_TEST_PROGRAMS` list complicated everything.
In 872721538c26 (cmake: fix build of `t-oidtree`, 2024-07-12), I
accommodated the CMake definition again, after seeing that the
`UNIT_TEST_OBJS` was still defined via that transformed list but now
appending _two_ hard-coded files ("t/unit-tests/lib-oid.c" joined the
fray).
In 428672a3b16 (Makefile: stop listing test library objects twice,
2024-09-16), the `Makefile` was changed so that `UNIT_TEST_OBJS` is
finally only constructed using hard-coded file names just like the other
`*_OBJS` variables. I missed that and therefore did not adjust the CMake
definition. Besides, the code was working, so there was no real need to
adjust it.
With a4f50bb1e9b (t/unit-tests: introduce reftable library, 2024-09-16),
however, the `UNIT_TEST_OBJS` list became a trio, and the CMake
definition has to be adjusted again. Now that we can use the
`parse_makefile_for_sources()` function without many complications,
let's do that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmake: stop looking for `REFTABLE_TEST_OBJS` in the Makefile
As of 15e29ea1c648 (t: move reftable/stack_test.c to the unit testing
framework, 2024-09-08), the reftable tests are no longer part of
`test-tool.exe`, so let's stop looking for those lines that are no
longer in the `Makefile`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmake: rename clar-related variables to avoid confusion
In c3de556a841f (Makefile: rename clar-related variables to avoid
confusion, 2024-09-10) some `Makefile` variables were renamed that were
partially used by the CMake definition. Adapt the latter to the new lay
of the land.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 19 Sep 2024 01:02:05 +0000 (18:02 -0700)]
Merge branch 'es/chainlint-message-updates'
The error messages from the test script checker have been improved.
* es/chainlint-message-updates:
chainlint: reduce annotation noise-factor
chainlint: make error messages self-explanatory
chainlint: don't be fooled by "?!...?!" in test body
Junio C Hamano [Thu, 19 Sep 2024 01:02:05 +0000 (18:02 -0700)]
Merge branch 'ps/clar-unit-test'
Import clar unit tests framework libgit2 folks invented for our
use.
* ps/clar-unit-test:
Makefile: rename clar-related variables to avoid confusion
clar: add CMake support
t/unit-tests: convert ctype tests to use clar
t/unit-tests: convert strvec tests to use clar
t/unit-tests: implement test driver
Makefile: wire up the clar unit testing framework
Makefile: do not use sparse on third-party sources
Makefile: make hdr-check depend on generated headers
Makefile: fix sparse dependency on GENERATED_H
clar: stop including `shellapi.h` unnecessarily
clar(win32): avoid compile error due to unused `fs_copy()`
clar: avoid compile error with mingw-w64
t/clar: fix compatibility with NonStop
t: import the clar unit testing framework
t: do not pass GIT_TEST_OPTS to unit tests with prove
apply: refactor `struct image` to use a `struct strbuf`
The `struct image` uses a character array to track the pre- or postimage
of a patch operation. This has multiple downsides:
- It is somewhat hard to track memory ownership. In fact, we have
several memory leaks in git-apply(1) because we do not (and cannot
easily) free the buffer in all situations.
- We have to reinvent the wheel and manually implement a lot of
functionality that would already be provided by `struct strbuf`.
- We have to carefully track whether `update_pre_post_images()` can do
an in-place update of the postimage or whether it has to allocate a
new buffer for it.
This is all rather cumbersome, and especially `update_pre_post_images()`
is really hard to understand as a consequence even though what it is
doing is rather trivial.
Refactor the code to use a `struct strbuf` instead, addressing all of
the above. Like this we can easily perform in-place updates in all
situations, the logic to perform those updates becomes way simpler and
the lifetime of the buffer becomes a ton easier to track.
This refactoring also plugs some leaking buffers as a side effect.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply: rename members that track line count and allocation length
The `struct image` has two members `nr` and `alloc` that track the
number of lines as well as how large its array is. It is somewhat easy
to confuse these members with `len` though, which tracks the length of
the `buf` member.
Rename these members to `line_nr` and `line_alloc` respectively to avoid
confusion. This is in line with how we typically name variables that
track an array in this way.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct image` has two members `line` and `line_allocated`. The
former member is the one that should be used throughout the code,
whereas the latter one is used to track whether the lines have been
allocated or not.
In practice, the array of lines is always allocated. The reason why we
have `line_allocated` is that `remove_first_line()` will advance the
array pointer to drop the first entry, and thus it points into the array
instead of to the array header.
Refactor the function to use memmove(3P) instead, which allows us to get
rid of this double bookkeeping. This is less efficient, but I doubt that
this matters much in practice. If this judgement call is found to be
wrong at a later point in time we can likely refactor the surrounding
loop such that we first calculate the number of leading context lines to
remove and then remove them in a single call to memmove(3P).
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply: introduce macro and function to init images
We're about to convert the `struct image` to gain a `struct strbuf`
member, which requires more careful initialization than just memsetting
it to zeros. Introduce the `IMAGE_INIT` macro and `image_init()`
function to prepare for this change.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply: reorder functions to move image-related things together
While most of the functions relating to `struct image` are relatively
close to one another, `fuzzy_matchlines()` sits in between those even
though it is rather unrelated.
Reorder functions such that `struct image`-related functions are next to
each other. While at it, move `clear_image()` to the top such that it is
close to the struct definition itself. This makes this lifecycle-related
thing easy to discover.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 16 Sep 2024 22:27:08 +0000 (15:27 -0700)]
Merge branch 'jk/ci-linux32-update'
CI updates
* jk/ci-linux32-update:
ci: add Ubuntu 16.04 job to GitLab CI
ci: use regular action versions for linux32 job
ci: use more recent linux32 image
ci: unify ubuntu and ubuntu32 dependencies
ci: drop run-docker scripts
Junio C Hamano [Mon, 16 Sep 2024 22:27:08 +0000 (15:27 -0700)]
Merge branch 'jc/ci-upload-artifact-and-linux32'
CI started failing completely for linux32 jobs, as the step to
upload failed test directory uses GitHub actions that is deprecated
and is now disabled. Remove the step so at least we will know if
the tests are passing.
* jc/ci-upload-artifact-and-linux32:
ci: remove 'Upload failed tests' directories' step from linux32 jobs
Junio C Hamano [Mon, 16 Sep 2024 22:13:24 +0000 (15:13 -0700)]
Merge branch 'jk/ci-linux32-update' into maint-2.46
CI updates
* jk/ci-linux32-update:
ci: add Ubuntu 16.04 job to GitLab CI
ci: use regular action versions for linux32 job
ci: use more recent linux32 image
ci: unify ubuntu and ubuntu32 dependencies
ci: drop run-docker scripts
Junio C Hamano [Mon, 16 Sep 2024 22:13:24 +0000 (15:13 -0700)]
Merge branch 'jc/ci-upload-artifact-and-linux32' into maint-2.46
CI started failing completely for linux32 jobs, as the step to
upload failed test directory uses GitHub actions that is deprecated
and is now disabled. Remove the step so at least we will know if
the tests are passing.
* jc/ci-upload-artifact-and-linux32:
ci: remove 'Upload failed tests' directories' step from linux32 jobs
Junio C Hamano [Mon, 16 Sep 2024 21:22:55 +0000 (14:22 -0700)]
Merge branch 'jk/ref-filter-trailer-fixes'
Bugfixes and leak plugging in "git for-each-ref --format=..." code
paths.
* jk/ref-filter-trailer-fixes:
ref-filter: fix leak with unterminated %(if) atoms
ref-filter: add ref_format_clear() function
ref-filter: fix leak when formatting %(push:remoteref)
ref-filter: fix leak with %(describe) arguments
ref-filter: fix leak of %(trailers) "argbuf"
ref-filter: store ref_trailer_buf data per-atom
ref-filter: drop useless cast in trailers_atom_parser()
ref-filter: strip signature when parsing tag trailers
ref-filter: avoid extra copies of payload/signature
t6300: drop newline from wrapped test title
Junio C Hamano [Mon, 16 Sep 2024 21:22:53 +0000 (14:22 -0700)]
Merge branch 'cp/unit-test-reftable-stack'
Another reftable test migrated to the unit-test framework.
* cp/unit-test-reftable-stack:
t-reftable-stack: add test for stack iterators
t-reftable-stack: add test for non-default compaction factor
t-reftable-stack: use reftable_ref_record_equal() to compare ref records
t-reftable-stack: use Git's tempfile API instead of mkstemp()
t: harmonize t-reftable-stack.c with coding guidelines
t: move reftable/stack_test.c to the unit testing framework
Junio C Hamano [Mon, 16 Sep 2024 21:06:06 +0000 (14:06 -0700)]
Merge branch 'cp/unit-test-reftable-stack' into ps/reftable-alloc-failures
* cp/unit-test-reftable-stack:
t-reftable-stack: add test for stack iterators
t-reftable-stack: add test for non-default compaction factor
t-reftable-stack: use reftable_ref_record_equal() to compare ref records
t-reftable-stack: use Git's tempfile API instead of mkstemp()
t: harmonize t-reftable-stack.c with coding guidelines
t: move reftable/stack_test.c to the unit testing framework
refs/reftable: wire up support for exclude patterns
Exclude patterns can be used by reference backends to skip over blocks
of references that are uninteresting to the caller. Reference backends
do not have to wire up support for them, and all callers are expected to
behave as if the backend didn't support them. In fact, the only backend
that supports exclude patterns right now is the "packed" backend.
Exclude patterns can be quite an important performance optimization in
repositories that have loads of references. The patterns are set up in
case "transfer.hideRefs" and friends are configured during a fetch, so
handling these patterns becomes important once there are lots of hidden
refs in a served repository.
Now that we have properly re-seekable reftable iterators we can also
wire up support for these patterns in the "reftable" backend. Doing so
is conceptually simple: once we hit a reference whose prefix matches the
current exclude pattern we re-seek the iterator to the first reference
that doesn't match the pattern anymore. This schema only works for
trivial patterns that do not have any globbing characters in them, but
this restriction also applies do the "packed" backend.
This makes t1419 work with the "reftable" backend with some slight
modifications. Of course it also speeds up listing of references with
hidden refs. The following benchmark prints one reference with 1 million
hidden references:
Benchmark 1: HEAD~
Time (mean ± σ): 93.3 ms ± 2.1 ms [User: 90.3 ms, System: 2.5 ms]
Range (min … max): 89.8 ms … 97.2 ms 33 runs
Benchmark 2: HEAD
Time (mean ± σ): 4.2 ms ± 0.6 ms [User: 2.2 ms, System: 1.8 ms]
Range (min … max): 3.1 ms … 8.1 ms 765 runs
Summary
HEAD ran
22.15 ± 3.19 times faster than HEAD~
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 67ce50ba26 (Merge branch 'ps/reftable-reusable-iterator', 2024-05-30)
we have refactored the interface of reftable iterators such that they
can be reused in theory. This patch series only landed the required
changes on the interface level, but didn't yet implement the actual
logic to make iterators reusable.
As it turns out almost all of the infrastructure already does support
re-seeking. The only exception is the table iterator, which does not
reset its `is_finished` bit. Do so and add a couple of tests that verify
that we can re-seek iterators.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have recently migrated all of the reftable unit tests that were part
of the reftable library into our own unit testing framework. As part of
that migration we have duplicated some of the functionality that was
part of the reftable test framework into each of the migrated test
suites. This was a sensible decision to not have all of the migrations
dependent on each other, but now that the migration is done it makes
sense to deduplicate the functionality again.
Introduce a new reftable test library that hosts some shared code and
adapt tests to use it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever one adds another test library compilation unit one has to wire
it up twice in the Makefile: once to append it to `UNIT_TEST_OBJS`, and
once to append it to the `UNIT_TEST_PROGS` target. Ideally, we'd just
reuse the `UNIT_TEST_OBJS` variable in the target so that we can avoid
the duplication. But it also contains all the objects for our test
programs, each of which contains a `cmd_main()`, and thus we cannot link
them all into the target executable.
Refactor the code such that `UNIT_TEST_OBJS` does not contain the unit
test program objects anymore, which we can instead manually append to
the `OBJECTS` variable. Like this, the former variable now only contains
objects for test libraries and can thus be reused.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/receive-pack: fix exclude patterns when announcing refs
In `write_head_info()` we announce references to the remote client. We
need to honor "transfer.hideRefs" here so that we do not announce any
references that the client shouldn't be able to learn about. This is
done via two separate mechanisms:
- We hand over exclude patterns to the reference backend. We can only
honor "plain" exclude patterns here that do not have prefixes with
special meaning such as "^" or "!". Filtering down the references is
handled by `hidden_refs_to_excludes()`.
- In `show_ref_cb()` we perform a second check against hidden refs.
For one this is done such that we can handle those special prefixes.
And second, handling exclude patterns in ref backends is optional,
so we also have to handle "normal" patterns.
The special-meaning "^" prefix alters whether a hidden ref applies to
the namespace-stripped reference name or the full name. So while we
would usually call `refs_for_each_namespaced_ref()` to only get those
references in the current namespace, we can't because we'd get the
already-rewritten reference names. Instead, we are forced to use
`refs_for_each_fullref_in()` and then manually strip away the namespace
prefix such that we have access to both names.
But this also means that we do not get namespace handling for exclude
patterns, which `refs_for_each_namespaced_ref()` brings for free. This
results in a bug because we potentially end up hiding away references
based on their namespaced name and not on the stripped name as we really
should be doing.
Fix this by manually rewriting the exclude patterns to their namespaced
variants.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: properly apply exclude patterns to namespaced refs
Reference namespaces allow commands like git-upload-pack(1) to serve
different sets of references to the client depending on which namespace
is enabled, which is for example useful in fork networks. Namespaced
refs are stored with a `refs/namespaces/$namespace` prefix, but all the
user will ultimately see is a stripped version where that prefix is
removed.
The way that this interacts with "transfer.hideRefs" is not immediately
obvious: the hidden refs can either apply to the stripped references, or
to the non-stripped ones that still have the namespace prefix. In fact,
the "transfer.hideRefs" machinery does the former and applies to the
stripped reference by default, but rules can have "^" prefixed to switch
this behaviour to instead match against the full reference name.
Namespaces are exclusively handled at the generic "refs" layer, the
respective backends have no clue that such a thing even exists. This
also has the consequence that they cannot handle hiding references as
soon as reference namespaces come into play because they neither know
whether a namespace is active, nor do they know how to strip references
if they are active.
Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
`refs_for_each_fullref_in_prefixes()` is broken though, as both support
that the user passes both namespaces and exclude patterns. In the case
where both are set we will exclude references with unstripped names,
even though we really wanted to exclude references based on their
stripped names.
This only surfaces when:
- A repository uses reference namespaces.
- "transfer.hideRefs" is active.
- The namespaced references are packed into the "packed-refs" file.
None of our tests exercise this scenario, and thus we haven't ever hit
it. While t5509 exercises both (1) and (2), it does not happen to hit
(3). It is trivial to demonstrate the bug though by explicitly packing
refs in the tests, and then we indeed surface the breakage.
Fix this bug by prefixing exclude patterns with the namespace in the
generic layer. The newly introduced function will be used outside of
"refs.c" in the next patch, so we add a declaration to "refs.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>