Zheng Yuting [Wed, 26 Mar 2025 07:52:46 +0000 (15:52 +0800)]
send-email: finer-grained SMTP error handling
Code captured errors but did not process them further.
This treated all failures the same without distinguishing SMTP status.
Add handle-smtp_error to extract SMTP status codes using a regex (as
defined in RFC 5321) and handle errors as follows:
- No error present:
- If a result is provided, return 1 to indicate success.
- Otherwise, return 0 to indicate failure.
- Error present with a captured three-digit status code:
- For 4yz (transient errors), return 1 and allow retries.
- For 5yz (permanent errors), return 0 to indicate failure.
- For any other recognized status code, return 1, treating it as
a transient error.
- Error present but no status code found:
- Return 1 as a transient error.
Signed-off-by: Zheng Yuting <05ZYT30@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Zheng Yuting [Wed, 26 Mar 2025 07:52:45 +0000 (15:52 +0800)]
send-email: capture errors in an eval {} block
Auth relied solely on return values without catching errors. This misjudges
non-credential errors as auth failure without error info.
Patch wraps the entire auth process in an eval {} block to catch
all exceptions, including non-credential errors. It adds a new $error var,
uses 'or do' to prevent flow break, and returns $result ? 1 : 0. And merges
if/else branches, integrates SASL and basic auth, with comments for
future status code handling.
Signed-off-by: Zheng Yuting <05ZYT30@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: move printing logic into test helper
The logic to print individual blocks in a table is hosted in the
reftable library. This is only the case due to historical reasons though
because users of the library had no interfaces to read blocks one by
one. Otherwise, printing individual blocks has no place in the reftable
library given that the format will not be generic in the first place.
We have now grown a public interface to iterate through blocks contained
in a table, and thus we can finally move the logic to print them into
the test helper.
Move over the logic and refactor it accordingly. Note that the iterator
also trivially allows us to access index sections, which we previously
didn't print at all. This omission wasn't intentional though, so start
dumping those sections as well so that we can assert that indices are
written as expected.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/constants: make block types part of the public interface
Now that reftable blocks can be read individually via the public
interface it becomes necessary for callers to be able to distinguish the
different types of blocks. Expose the relevant constants.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: introduce iterator for table blocks
Introduce a new iterator that allows the caller to iterate through all
blocks contained in a table. This gives users more fine-grained control
over how exactly those blocks are being read and exposes information to
callers that was previously inaccessible.
This iterator will be required by a future patch series that adds
consistency checks for the reftable backend. In addition to that though
we will also reimplement `reftable_table_print_blocks()` on top of this
new iterator in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: add `reftable_table` to the public interface
The `reftable_table` interface is an internal implementation detail that
callers have no access to. Having direct access to this structure is
important though for a subsequent patch series that will implement
consistency checks for the reftable backend.
Move the structure into "reftable-table.h" so that it part of the public
interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: expose a generic iterator over reftable records
Expose a generic iterator over reftable records and expose it via the
public interface. Together with an upcoming iterator for reftable blocks
contained in a table this will allow users to trivially iterate through
blocks and their respective records individually.
This functionality will be used to implement consistency checks for the
reftable backend, which requires more fine-grained control over how we
read data.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the block iterators so that initialization and seeking are
different from one another. This makes the iterator trivially reseekable
by storing the pointer to the block at initialization time, which we can
then reuse on every seek.
This refactoring prepares the code for exposing a `reftable_iterator`
interface for blocks in a subsequent commit. Callsites are adjusted
accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: store block pointer in the block iterator
The block iterator requires access to a bunch of data from the
underlying `reftable_block` that it is iterating over. This data is
stored by copying over relevant data into a separate set of variables.
This has multiple downsides:
- We require more storage space than necessary. This is more of a
theoretical issue as we shouldn't ever have many blocks.
- We have to perform more bookkeeping, and the variable names are
inconsistent across the two data structures. This can lead to some
confusion.
- The lifetime of the block iterator is tied to the block anyway, but
we hide that a bit by only storing pointers pointing into the block.
There isn't really any good reason why we rip out parts of the block
instead of storing a pointer to the block itself.
Refactor the code to do so. Despite being simpler, it also allows us to
decouple the lifetime of the block iterator from seeking in a subsequent
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: create public interface for reading blocks
While users of the reftable library wouldn't generally require access to
individual blocks in a reftable table, there are valid usecases where
one may require low-level access to them. One such upcoming usecase in
the Git codebase is to implement consistency checks for the reftable
library where we want to verify each block individually.
Create a public interface for reading blocks. The interface isn't yet
complete and lacks e.g. a way to read individual records from a block.
Such missing functionality will be backfilled in subsequent commits.
Note that this change also requires us to expose `reftable_buf`, which
is used by the `reftable_block_first_key()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-zlib: use `struct z_stream_s` instead of typedef
Throughout the Git codebase we're using the typedeffed version of
`z_stream`, which maps to `struct z_stream_s`. By using a typedef
instead of the struct it becomes somewhat harder to predeclare the
symbol so that headers depending on the struct can do so without having
to pull in "zlib-compat.h".
We don't yet have users that would really care about this: the only
users that declare `z_stream` as a pointer are in "reftable/block.h",
which is a header that is internal to the reftable library. But in the
next step we're going to expose the `struct reftable_block` publicly,
and that struct does contain a pointer to `z_stream`. And as the public
header shouldn't depend on "reftable/system.h", which is an internal
implementation detail, we won't have the typedef for `z_stream` readily
available.
Prepare for this change by using `struct z_stream_s` throughout our code
base. In case zlib-ng is used we use a define to map from `z_stream_s`
to `zng_stream_s`.
Drop the pre-declaration of `struct z_stream` while at it. This struct
does not exist in the first place, and the declaration wasn't needed
because "reftable/block.h" already includes "reftable/basics.h" which
transitively includes "reftable/system.h" and thus "git-zlib.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: rename `block_reader` to `reftable_block`
The `block_reader` structure is used to access parsed data of a reftable
block. The structure is currently treated as an internal implementation
detail and not exposed via our public interfaces. The functionality
provided by the structure is useful to external users of the reftable
library though, for example when implementing consistency checks that
need to scan through the blocks manually.
Rename the structure to `reftable_block` now that the name has been made
available in the preceding commit. This name is in line with the naming
schema used for other data structures like `reftable_table` in that it
describes the underlying entity that it provides access to.
The new data structure isn't yet exposed via the public interface, which
is left for a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `reftable_block` structure associates a byte slice with a block
source. As such it only holds the data of a reftable block without
actually encoding any of the details for how to access that data.
Rename the structure to instead be called `reftable_block_data`. Besides
clarifying that this really only holds data, it also allows us to rename
the `reftable_block_reader` to `reftable_block` in the next commit, as
this is the structure that actually encapsulates access to the reftable
blocks.
Rename the `struct reftable_block_reader::block` member accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: move reading block into block reader
The logic to read blocks from a reftable is scattered across both the
table and the block subsystems. Besides causing somewhat fuzzy
responsibilities, it also means that we have to awkwardly pass around
the ownership of blocks between the subsystems.
Refactor the code so that we stop passing the block when initializing a
reader, but instead by passing in the block source plus the offset at
which we're supposed to read a block. Like this, the ownership of the
block itself doesn't need to get handed over as the block reader is the
one owning the block right from the start.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: simplify how we track restart points
Restart points record the location of reftable records that do not use
prefix compression and are used to perform a binary search inside of a
block. These restart points are encoded at the end of a block, between
the record data and the footer of a table.
The block structure contains three different variables related to these
restart points:
- The block length contains the length of the reftable block up to the
restart points.
- The restart count contains the number of restart points contained in
the block.
- The restart bytes variable tracks where the restart point data
begins.
Tracking all three of these variables is unnecessary though as the data
can be derived from one another: the block length without restart points
is the exact same as the offset of the restart count data, which we
already track via the `restart_bytes` data.
Refactor the code so that we track the location of restart bytes not as
a pointer, but instead as an offset. This allows us to trivially get rid
of the `block_len` variable as described above. This avoids having the
confusing `block_len` variable and allows us to do less bookkeeping
overall.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/blocksource: consolidate code into a single file
The code that implements block sources is distributed across a couple of
files. Consolidate all of it into "reftable/blocksource.c" and its
accompanying header so that it is easier to locate and more self
contained.
While at it, rename some of the functions to have properly scoped names.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct reftable_reader` subsystem encapsulates a table that has
been read from the disk. As such, the current name of that structure is
somewhat hard to understand as it only talks about the fact that we read
something from disk, without really giving an indicator _what_ that is.
Furthermore, this naming schema doesn't really fit well into how the
other structures are named: `reftable_merged_table`, `reftable_stack`,
`reftable_block` and `reftable_record` are all named after what they
encapsulate.
Rename the subsystem to `reftable_table`, which directly gives a hint
that the data structure is about handling the individual tables part of
the stack.
While this change results in a lot of churn, it prepares for us exposing
the APIs to third-party callers now that the reftable library is a
standalone library that can be linked against by other projects.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
blame: print unblamable and ignored commits in porcelain mode
The 'git-blame(1)' command allows users to ignore specific revisions via
the '--ignore-rev <rev>' and '--ignore-revs-file <file>' flags. These
flags are often combined with the 'blame.markIgnoredLines' and
'blame.markUnblamableLines' config options. These config options prefix
ignored and unblamable lines with a '?' and '*', respectively.
However, this option was never extended to the porcelain mode of
'git-blame(1)'. Since the documentation does not indicate this
exclusion, it is a bug.
Fix this by printing 'ignored' and 'unblamable' respectively for the
options when using the porcelain modes.
Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Toon Claes <toon@iotcl.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use Perl due to two different reasons in t5703:
- To filter advertised capabilities.
- To set up a CGI script with HTTPD.
Refactor the first category to use `test_grep` instead. Refactoring the
second category would be a bit more involved, so instead we add the
PERL_TEST_HELPERS prerequisite to those individual tests now.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5316: refactor `max_chain()` to not depend on Perl
The `max_chain()` helper function is used to extract the maximum delta
chain of a packfile as printed by git-index-pack(1). The script uses
Perl to extract that data, but it can be trivially refactored to use
awk(1) instead.
Refactor the helper accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output generated by our trace2 mechanism contains several fields
that are dependent on the environment they're being run in, which makes
it somewhat harder to test it. As a countermeasure we scrub the output
and strip out any fields that contain such information.
The logic to do so is implemented in Perl, but it can be trivially
ported to instead use sed(1). Refactor the code accordingly so that we
can drop the PERL_TEST_HELPERS prerequisite.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0021: refactor `generate_random_characters()` to not depend on Perl
The `generate_random_characters()` helper function generates N
random characters in the range 'a-z' and writes them into a file. The
logic currently uses Perl, but it can be adapted rather easily by:
- Making `test-tool genrandom` generate an infinite stream.
- Using `tr -dc` to strip all characters which aren't in the range of
'a-z'.
- Using `test_copy_bytes()` to copy the first N bytes.
This allows us to drop the PERL_TEST_HELPERS prerequisite.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
Our Apache HTTPD setup exposes an "one_time_perl" endpoint to access
repositories. If used, we execute the "apply-one-time-perl.sh" CGI
script that checks whether we have a "one-time-perl" script. If so, that
script gets executed so that it can munge what would be served. Once
done, the script gets removed so that it doesn't execute a second time.
As the name says, this functionality expects the user to pass a Perl
script. This isn't really necessary though: we can just as easily
implement the same thing with arbitrary scripts.
Refactor the code so that we instead expect an arbitrary script to
exist and rename the functionality to "one-time-script". Adapt callers
to use shell utilities instead of Perl so that we can drop the
PERL_TEST_HELPERS prerequisite.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/lib-t6000: refactor `name_from_description()` to not depend on Perl
The `name_from_description()` test helper uses Perl to munge a given
description and convert it into a name. Refactor it to instead use a
combination of sed(1) and tr(1) so that we drop PERL_TEST_HELPERS
prerequisites in users of this library.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
The `sanitize_pgp()` test helper uses Perl to strip PGP signatures from
stdin. Refactor it to instead use sed(1) so that we drop the
PERL_TEST_HELPERS prerequisite in users of this library.
Note that we have to add PERL_TEST_HELPERS to a subset of tests in t6300
now that the test suite doesn't bail out early anymore in case the
prerequisite isn't set.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: refactor tests depending on Perl for textconv scripts
We have a couple of tests that depend on Perl for textconv scripts.
Refactor these tests to instead be implemented via shell utilities so
that we can drop a couple of PERL_TEST_HELPERS prerequisites.
Note that the conversion in t4030 is not a one-to-one equivalent to the
previous textconv script. Before this change we used to essentially do a
hexdump via Perl. The obvious conversion here would be to use `test-tool
hexdump` like we do for the other tests. But this would lead to a ripple
effect where we would have to adapt a bunch of other tests with a bunch
of seemingly unrelated changes, which would be somewhat awkward.
Instead, we're going with the minimum viable change: the test files we
write contain "\001" and "\000", and the test's expectation is that
those get translated into proper ASCII characters. So instead of doing a
full hexdump, we simply use tr(1) to translate these specific bytes.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: refactor tests depending on Perl transliteration operator
We have a bunch of tests that use Perl to perform character
transliteration via the "y/" or "tr/" operator. These usecases can be
trivially replaced with tr(1).
Refactor the tests accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Makefile for our tests has a couple of targets that depend on Perl.
Adapt those targets to only run conditionally in case Perl is available
on the system so that it becomes possible to run the test suite without
Perl.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Perl interpreter used to be a strict dependency for running our test
suite. This requirement is explicit in the Meson build system, where we
require Perl to be present unless tests have been disabled.
With the preceding commits we have loosened this restriction so that it
is now possible to run tests when Perl is unavailable. Loosen the above
requirement accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of our tests depend on the PERL prerequisite even though it
isn't needed. These tests fall into one of the following classes:
- The underlying logic used to be implemented in Perl but isn't
anymore. Here we can simply drop the dependency altogether.
- The test logic used to depend on Perl but doesn't anymore. Again, we
can simply drop the dependency.
- The test logic still relies on a Perl interpreter. These tests
should use the newly introduced PERL_TEST_HELPERS prerequisite.
Adapt test cases accordingly.
Note that in t1006 we have to introduce another new prerequisite
depending on whether or not the IPC::Open2 module is available. Funny
enough, when starting to use `test_lazy_prereq` to do so we also get a
conflict of variables with the "script" variable that contains the Perl
logic because `test_run_lazy_prereq_` also sets that variable. We thus
rename the variable in t1006 to "perl_script".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the early days of Git, Perl was used quite prominently throughout the
project. This has changed significantly as almost all of the executables
we ship nowadays have eventually been rewritten in C. Only a handful of
subsystems remain that require Perl:
- gitweb, a read-only web interface.
- A couple of scripts that allow importing repositories from GNU Arch,
CVS and Subversion.
- git-send-email(1), which can be used to send mails.
- git-request-pull(1), which is used to request somebody to pull from
a URL by sending an email.
- git-filter-branch(1), which uses Perl with the `--state-branch`
option. This command is typically recommended against nowadays in
favor of git-filter-repo(1).
- Our Perl bindings for Git.
- The netrc Git credential helper.
None of these subsystems can really be considered to be part of the
"core" of Git, and an installation without them is fully functional.
It is more likely than not that an end user wouldn't even notice that
any features are missing if those tools weren't installed. But while
Perl nowadays very much is an optional dependency of Git, there is a
significant limitation when Perl isn't available: developers cannot run
our test suite.
Preceding commits have started to lift this restriction by removing the
strict dependency on Perl in many central parts of the test library. But
there are still many tests that rely on small Perl helpers to do various
different things.
Introduce a new PERL_TEST_HELPERS prerequisite that guards all tests
that require Perl. This prerequisite is explicitly different than the
preexisting PERL prerequisite:
- PERL records whether or not features depending on the Perl
interpreter are built.
- PERL_TEST_HELPERS records whether or not a Perl interpreter is
available for our tests.
By having these two separate prerequisites we can thus distinguish
between tests that inherently depend on Perl because the underlying
feature does, and those tests that depend on Perl because the test
itself is using Perl.
Adapt all tests to set the PERL_TEST_HELPERS prerequisite as needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_readlink()` helper function reads a symbolic link and returns
the path it is pointing to. It is thus equivalent to the readlink(1)
utility, which isn't available on all supported platforms. As such, it
is implemented using Perl so that we can use it even on platforms where
the shell utility isn't available.
While using readlink(1) is not an option, what we can do is to implement
the logic ourselves in our test-tool. Do so, which allows a bunch of
tests to pass when Perl is not available.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_copy_bytes()` helper function copies up to N bytes from stdin
to stdout. This is implemented using Perl, but it can be trivially
adapted to instead use dd(1).
Refactor the helper accordingly, which allows a bunch of tests to pass
when Perl is not available.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: adapt character translation helpers to not use Perl
We have a couple of helper functions that translate characters, e.g.
from LF to NUL or NUL to 'Q' and vice versa. These helpers use Perl
scripts, but they can be trivially adapted to instead use tr(1).
Note that one specialty here is the handling of NUL characters in tr(1),
which historically wasn't implemented correctly on all platforms. But
quoting tr(1p):
It was considered that automatically stripping NUL characters from
the input was not correct functionality. However, the removal of -n
in a later proposal does not remove the requirement that tr
correctly process NUL characters in its input stream.
So when tr(1) is implemented following the POSIX standard then it is
expected to handle the transliteration of NUL just fine.
Refactor the helpers accordingly, which allows a bunch of tests to pass
when Perl is not available.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: refactor environment sanitization to not use Perl
Before executing tests we first sanitize the environment. Part of the
sanitization is to unset a couple of environment variables that we know
will change the behaviour of Git. This is done with a small Perl script,
which has the consequence that having a Perl interpreter available is a
strict requirement for running our unit tests.
The logic itself isn't particularly involved: we simply unset every
environment variable whose key starts with 'GIT_', but then explicitly
allow a subset of these.
Refactor the logic to instead use sed(1) so that it becomes possible to
execute our tests without Perl.
Based-on-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our chainlint script verifies that test files have proper '&&' chains.
This script is written in Perl and executed for every test file before
executing the test logic itself.
In subsequent commits we're about to refactor our test suite so that
Perl becomes an optional dependency, only. And while it is already
possible to disable this linter, developers that don't have Perl
available at all would always have to disable the linter manually, which
is rather cumbersome.
Disable the chain linter automatically in case PERL_PATH isn't set to
make this a bit less annoying. Bail out with an error in case the
developer has asked explicitly for the chain linter.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: use bitmaps to efficiently filter by object type
While it is now possible to filter objects by type, this mechanism is
for now mostly a convenience. Most importantly, we still have to iterate
through the whole packfile to find all objects of a specific type. This
can be prohibitively expensive depending on the size of the packfiles.
It isn't really possible to do better than this when only considering a
packfile itself, as the order of objects is not fixed. But when we have
a packfile with a corresponding bitmap, either because the packfile
itself has one or because the multi-pack index has a bitmap for it, then
we can use these bitmaps to improve the runtime.
While bitmaps are typically used to compute reachability of objects,
they also contain one bitmap per object type that encodes which object
has what type. So instead of reading through the whole packfile(s), we
can use the bitmaps and iterate through the type-specific bitmap.
Typically, only a subset of packfiles will have a bitmap. But this isn't
really much of a problem: we can use bitmaps when available, and then
use the non-bitmap walk for every packfile that isn't covered by one.
Overall, this leads to quite a significant speedup depending on how many
objects of a certain type exist. The following benchmarks have been
executed in the Chromium repository, which has a 50GB packfile with
almost 25 million objects. As expected, there isn't really much of a
change in performance without an object filter:
Benchmark 1: cat-file with no-filter (revision = HEAD~)
Time (mean ± σ): 89.675 s ± 4.527 s [User: 40.807 s, System: 10.782 s]
Range (min … max): 83.052 s … 96.084 s 10 runs
Benchmark 2: cat-file with no-filter (revision = HEAD)
Time (mean ± σ): 88.991 s ± 2.488 s [User: 42.278 s, System: 10.305 s]
Range (min … max): 82.843 s … 91.271 s 10 runs
Summary
cat-file with no-filter (revision = HEAD) ran
1.01 ± 0.06 times faster than cat-file with no-filter (revision = HEAD~)
We still have to scan through all objects as we yield all of them, so
using the bitmap in this case doesn't really buy us anything. What is
noticeable in this benchmark is that we're I/O-bound, not CPU-bound, as
can be seen from the user/system runtimes, which combined are way lower
than the overall benchmarked runtime.
But when we do use a filter we can see a significant improvement:
Benchmark 1: cat-file with filter=object:type=commit (revision = HEAD~)
Time (mean ± σ): 86.444 s ± 4.081 s [User: 36.830 s, System: 11.312 s]
Range (min … max): 80.305 s … 93.104 s 10 runs
Benchmark 2: cat-file with filter=object:type=commit (revision = HEAD)
Time (mean ± σ): 2.089 s ± 0.015 s [User: 1.872 s, System: 0.207 s]
Range (min … max): 2.073 s … 2.119 s 10 runs
Summary
cat-file with filter=object:type=commit (revision = HEAD) ran
41.38 ± 1.98 times faster than cat-file with filter=object:type=commit (revision = HEAD~)
This is because we don't have to scan through all packfiles anymore, but
can instead directly look up relevant objects.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: deduplicate logic to iterate over all objects
Pull out a common function that allows us to iterate over all objects in
a repository. Right now the logic is trivial and would only require two
function calls, making this refactoring a bit pointless. But in the next
commit we will iterate on this logic to make use of bitmaps, so this is
about to become a bit more complex.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-bitmap: add function to iterate over filtered bitmapped objects
Introduce a function that allows the caller to iterate over all
bitmapped objects that match a given filter. This mechanism will be used
in a subsequent commit to optimize object filters in git-cat-file(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-bitmap: allow passing payloads to `show_reachable_fn()`
The `show_reachable_fn` callback is used by a couple of functions to
present reachable objects to the caller. The function does not provide a
way for the caller to pass a payload though, which is functionality that
we'll require in a subsequent commit.
Change the callback type to accept a payload and adapt all callsites
accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: support "blob:none" objects filter
Implement support for the "blob:none" filter in git-cat-file(1), which
causes us to omit all blobs.
Note that this new filter requires us to read the object type via
`oid_object_info_extended()` in `batch_object_write()`. But as we try to
optimize away reading objects from the database the `data->info.typep`
pointer may not be set. We thus have to adapt the logic to conditionally
set the pointer in cases where the filter is given.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: wire up an option to filter objects
In batch mode, git-cat-file(1) enumerates all objects and prints them
by iterating through both loose and packed objects. This works without
considering their reachability at all, and consequently most options to
filter objects as they exist in e.g. git-rev-list(1) are not applicable.
In some situations it may still be useful though to filter objects based
on properties that are inherent to them. This includes the object size
as well as its type.
Such a filter already exists in git-rev-list(1) with the `--filter=`
command line option. While this option supports a couple of filters that
are not applicable to our usecase, some of them are quite a neat fit.
Wire up the filter as an option for git-cat-file(1). This allows us to
reuse the same syntax as in git-rev-list(1) so that we don't have to
reinvent the wheel. For now, we die when any of the filter options has
been passed by the user, but they will be wired up in subsequent
commits.
Further note that the filters that we are about to introduce don't
significantly speed up the runtime of git-cat-file(1). While we can skip
emitting a lot of objects in case they are uninteresting to us, the
majority of time is spent reading the packfile, which is bottlenecked by
I/O and not the processor. This will change though once we start to make
use of bitmaps, which will allow us to skip reading the whole packfile.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: introduce function to report object status
We have multiple callsites that report the status of an object, for
example when the objec tis missing or its name is ambiguous. We're about
to add a couple more such callsites to report on "excluded" objects.
Prepare for this by introducing a new function `report_object_status()`
that encapsulates the functionality.
Note that this function also flushes stdout, which is a requirement so
that request-response style batched modes can learn about the status
before proceeding to the next object. We already flush correctly at all
existing callsites, even though the flush in `batch_one_object()` only
comes after the switch statement. That flush is now redundant, and we
could in theory deduplicate it by moving it into all branches that don't
use `report_object_status()`. But that doesn't quite feel sensible:
- The duplicate flush should ultimately just be a no-op for us and
thus shouldn't impact performance significantly.
- By keeping the flush in `report_object_status()` we ensure that all
future callers get semantics correct.
So let's just be pragmatic and live with the duplicated flush.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/cat-file: rename variable that tracks usage
The usage strings for git-cat-file(1) that we pass to `parse_options()`
and `usage_msg_optf()` are stored in a variable called `usage`. This
variable shadows the declaration of `usage()`, which we'll want to use
in a subsequent commit.
Rename the variable to `builtin_catfile_usage`, which is in line with
how the variable is typically called in other builtins.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 06c92dafb8 (Makefile: allow specifying a SHA-1 for non-cryptographic
uses, 2024-09-26), support for unsafe SHA-1 is added. Add the unsafe
SHA-1 build info to `git version --build-info` and update corresponding
documentation.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the `--build-options` flag is used with git-version(1), additional
information about the built version of Git is printed. During build
time, different SHA implementations may be configured, but this
information is not included in the version info.
Add the SHA implementations Git is built with to the version info by
requiring each backend to define a SHA1_BACKEND or SHA256_BACKEND symbol
as appropriate and use the value in the printed build options.
Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 7 Apr 2025 21:23:17 +0000 (14:23 -0700)]
Merge branch 'cc/lop-remote'
Bugfix in newly introduced large-object-promisor remote support.
* cc/lop-remote:
promisor-remote: compare remote names case sensitively
promisor-remote: fix possible issue when no URL is advertised
promisor-remote: fix segfault when remote URL is missing
t5710: arrange to delete the client before cloning
Junio C Hamano [Mon, 7 Apr 2025 21:23:17 +0000 (14:23 -0700)]
Merge branch 'jc/name-rev-stdin'
Using "git name-rev --stdin" as an example, improve the framework to
prepare tests to pretend to be in the future where the breaking
changes have already happened.
* jc/name-rev-stdin:
name-rev: remove "--stdin" support
t6120: further modernize
t6120: avoid hiding "git" exit status
t: introduce WITH_BREAKING_CHANGES prerequisite
t: extend test_lazy_prereq
t: document test_lazy_prereq
Mark Levedahl [Tue, 1 Apr 2025 03:01:02 +0000 (23:01 -0400)]
gitk: limit PATH search to bare executable names
The path search overrides used by gitk on Windows are applied to any
executable whose name is not 'absolute', meaning that
[exec foo/bar ...]
will search each element of $PATH to find one with subdirectory foo
containing bar. But, per POSIX, and Tcl implementation on all platforms,
foo/bar is taken as $(pwd)/foo/bar, and is not searched on $PATH.
Fix this descrepency using the same approach applied to git-gui in
commit 3f71c97e. The key is that the executable name must have no path
component, indicated by [file split $exename] having array length 1.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Mark Levedahl [Tue, 1 Apr 2025 03:01:01 +0000 (23:01 -0400)]
gitk: _search_exe is no longer needed
The _search_exe variable allows specifying the suffix used for executables,
typically {} on unix, .exe on Windows. But, the override code is now
used only on Windows, so _search_exe is no longer needed. Eliminate it.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Mark Levedahl [Tue, 1 Apr 2025 03:01:00 +0000 (23:01 -0400)]
gitk: override $PATH search only on Windows
Commit 4cbe9e0e2 was written to address problems that result from Tcl's
documented behavior on Windows where the current working directory and a
number of Windows system directories are automatically prepended to
$PATH when searching for executables [1]. This basic Windows behavior
has resulted in more than one CVE against git for Windows:
CVE-2023-23618, CVE-2022-41953 are listed on the git for Windows github
website for the Tcl components of git (gitk, git-gui).
4cbe9e0e2 is intended to restrict the search to looking only in
directories given in $PATH and in the given order, which is exactly the
Tcl behavior documented to exist on non-Windows platforms [1]. Thus,
this change could have been written to affect only Windows, leaving
other platforms alone.
However, 4cbe9e0e2 implements the override for all platforms. This
includes specialized code for Cygwin, copied from git-gui prior to
commit 7145c654 on https://github.com/j6t/git-gui, so targets a
long retired Cygwin port of the Windows Tcl/Tk using Windows pathnames.
Since 2012, Cygwin uses a Unix/X11 port requiring Unix pathnames,
meaning 4cbe9e0e2 is incompatible. 4cbe9e0e2 also induces an infinite
recursion as _which now invokes the exec wrapper that invokes _which.
This is part of git v2.49.0, so gitk on Cygwin is broken in that
release.
Rather than fix the unnecessary override code for Cygwin, let's just
limit the override of exec/open to Windows, leaving all other platforms
using their native exec/open as they did prior to 4cbe9e0e2. This patch
wraps the override code in an "if {[is_Windows]} { ... }" block while
removing the non-Windows code added in 4cbe9e0e2.
[1] see https://www.tcl-lang.org/man/tcl8.6/TclCmd/exec.htm
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
brian m. carlson [Mon, 31 Mar 2025 21:53:58 +0000 (21:53 +0000)]
t5605: fix test for cloning from a different user
This test currently passes, but for the wrong reason. The
repo_is_hardlinked function expects a .git directory or a bare
repository and currently fails because it cannot find the objects
directory.
One solution is to use the --bare argument, but then --show-toplevel
won't work. We could change that, but there's no need to, so just add
the missing .git directory.
In addition, use the built-in negation functionality of test_grep to
avoid mishandling real errors (such as a missing file) and, as a final
fix, remove the extra newline.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 1 Apr 2025 10:05:13 +0000 (19:05 +0900)]
Merge branch 'ps/reftable-sans-compat-util' into ps/reftable-api-revamp
* ps/reftable-sans-compat-util:
Makefile: skip reftable library for Coccinelle
reftable: decouple from Git codebase by pulling in "compat/posix.h"
git-compat-util.h: split out POSIX-emulating bits
compat/mingw: split out POSIX-related bits
reftable/basics: introduce `REFTABLE_UNUSED` annotation
reftable/basics: stop using `SWAP()` macro
reftable/stack: stop using `sleep_millisec()`
reftable/system: introduce `reftable_rand()`
reftable/reader: stop using `ARRAY_SIZE()` macro
reftable/basics: provide wrappers for big endian conversion
reftable/basics: stop using `st_mult()` in array allocators
reftable: stop using `BUG()` in trivial cases
reftable/record: don't `BUG()` in `reftable_record_cmp()`
reftable/record: stop using `BUG()` in `reftable_record_init()`
reftable/record: stop using `COPY_ARRAY()`
reftable/blocksource: stop using `xmmap()`
reftable/stack: stop using `write_in_full()`
reftable/stack: stop using `read_in_full()`
Add a new builtin driver for generic INI files (e. g. the gitconfig
files), where:
- the funcname regular expression matches section names, i. e. any
string between brackets at the beginning of the line, with or without
indentation;
- word_regex matches any word with one or more non-whitespace
characters without checking if it is a valid variable name or value.
Also add tests for the new userdiff driver. These files define sections
and subsections, with and without indentation.
Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: D. Ben Knoble <ben.knoble@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Matt Hunter [Sun, 30 Mar 2025 11:24:06 +0000 (07:24 -0400)]
revision: fix --left/right-only use with unrelated histories
This is a similar fix as 023756f4eb (revision walker: --cherry-pick is a
limited operation), but for the --left-only and --right-only options.
When computing a symmetric difference between two unrelated histories,
no suitable merge base exists, and so no boundary commit is flagged as
UNINTERESTING. Previously, we relied on the presence of such boundary
to trigger limiting and thus consideration of either "revs->left_only"
or "revs->right_only".
A number of other entries in the option parser have started including
overrides for "revs->limited = 1". Do the same for these options.
Signed-off-by: Matt Hunter <m@lfurio.us> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Arnav Bhate [Sun, 30 Mar 2025 17:45:06 +0000 (23:15 +0530)]
pathspec: fix sign comparison warnings
There are multiple places, especially in loops, where a signed and an
unsigned data type are compared. Git uses a mix of signed and unsigned
types to store lengths of arrays. This sometimes leads to using a signed
index for an array whose length is stored in an unsigned variable or
vice versa. In some cases, where both signed and unsigned data types
have been used to store lengths of arrays in the same function, only
one variable was used to iterate over both types.
Replace signed data types with unsigned data types and vice versa
wherever necessary. Where both types of iterators are required, move
the declaration inside the for loop. In cases where this is not
possible, add appropriate cast.
Remove #define DISABLE_SIGN_COMPARE_WARNINGS.
Signed-off-by: Arnav Bhate <bhatearnav@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ci: use Visual Studio for win+meson job on GitHub Workflows
In 7304bd2bc39 (ci: wire up Visual Studio build with Meson, 2025-01-22)
we have wired up a new CI job that builds and tests Git with Meson on a
Windows machine. The expectation here was that this build uses the
Visual Studio toolchain to do so, and that is true on GitLab CI. But on
GitHub Workflows it is not the case because we've got GCC in our PATH,
and thus Meson favors that compiler toolchain over Visual Studio's.
Fix this by explicitly asking Meson to use the Visual Studio toolchain.
While this is only really required for GitHub Workflows, let's also pass
the flag in GitLab CI so that we don't implicitly assume the toolchain
that Meson is going to pick.
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Almost all of the tools we discover during the build process need to be
native programs. There are only a handful of exceptions, which typically
are programs whose paths we need to embed into the resulting executable
so that they can be found on the target system when Git executes. While
this distinction typically doesn't matter, it does start to matter when
considering cross-compilation where the build and target machines are
different.
Meson supports cross-compilation via so-called machine files. These
machine files allow the user to override parameters for the build
machine, but also for the target machine when cross-compiling. Part of
the machine file is a section that allows the user to override the
location where binaries are to be found in the target system. The
following machine file would for example override the path of the POSIX
shell:
[binaries]
sh = '/usr/xpg4/bin/sh'
It can be handed over to Meson via `meson setup --cross-file`.
We do not handle this correctly right now though because we don't know
to distinguish binaries for the build and target hosts at all. Address
this by explicitly passing the `native:` parameter to `find_program()`:
- When set to `true`, we get binaries discovered on the build host.
- When set to `false`, we get either the path specified in the
machine file. Or, if no machine file exists or it doesn't specify
the binary path, then we fall back to the binary discovered on the
build host.
As mentioned, only a handful of binaries are not native: only the system
shell, Python and Perl need to be treated specially here.
Reported-by: Peter Seiderer <ps.report@gmx.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the "netrc" credential helper and git-subtree(1) from "contrib/"
carry a couple of tests with them. These tests get wired up in Meson
unconditionally even in the case where `-Dtests=false`. As those tests
depend on the `test_enviroment` variable, which only gets defined in
case `-Dtests=true`, the result is an error:
In 19d8fe7da65 (Makefile: extract script to generate gitweb.js,
2024-12-06) we have extracted the logic to build "gitweb.js" into a
separate script. As part of that the rules that builds the script
has gained a new dependency on that script.
This refactoring is broken though because we use "$^" to determine
the set of JavaScript files that need to be concatenated, and this
implicit variable now also contains the build script itself. As a
result, the build script ends up ni the generated "gitweb.js" file,
which is wrong.
Fix the issue by filtering out non-JavaScript files.
Based-on-patch-by: Thorsten Glaser <tg@debian.org> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "curl" option controls whether or not a couple of features that
depend on curl shall be included. Most importantly, these features
include the HTTP remote helpers, which are rather quintessential for a
well-functioning Git installation. So while the dependency can in theory
be dropped, most users wouldn't consider the resulting installation to
be fully functional.
The "curl" option is defined as a feature, which means that it can be
"enabled", "disabled" or "auto", which has the effect that the feature
will be enabled if the dependency itself has been found. While most of
the other features have "auto" as default value, the "curl" option is
set to "enabled" by default due to it being so important. Consequently,
autoconfiguration of Git will fail by default if the library cannot be
found.
There is a bug though with how we handle the option in case the user
overrides the feature with `meson setup -Dcurl=auto`: while we will try
to find the library in that case, we won't ever use it because we later
on check for `get_option('curl').enabled()` when deciding whether or not
we want to build dependent sources. But `enabled()` only returns true if
the option has the value "enabled", for "auto" it will return false.
Fix the issue by instead checking for `curl.found()`, which is only true
if the library has been found. And as we only try to find the library
when `get_option('curl')` returns "true" or "auto" this is exactly what
we want.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Arnav Bhate [Sat, 29 Mar 2025 06:03:14 +0000 (11:33 +0530)]
rm: fix sign comparison warnings
There are multiple places in loops, where a signed and an
unsigned data type are compared. Git uses a mix of signed and unsigned
types to store lengths of arrays. This sometimes leads to using a signed
index for an array whose length is stored in an unsigned variable or
vice versa.
get_ours_cache_pos is a special case where i, though derived from a
signed variable is never negative. Move this part to the caller side
and make i an unsigned argument of the function. Rename i to
pos to make it descriptive, now that it is a function argument.
Replace signed data types with unsigned data types and vice versa
wherever necessary. Where both signed and unsigned data types have been
used, define a new variable in the scope of the for loop for use as the
iterator. Remove #define DISABLE_SIGN_COMPARE_WARNINGS.
Signed-off-by: Arnav Bhate <bhatearnav@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 29 Mar 2025 07:39:10 +0000 (16:39 +0900)]
Merge branch 'en/random-cleanups'
Miscellaneous code clean-ups.
* en/random-cleanups:
merge-ort: remove extraneous word in comment
merge-ort: fix accidental strset<->strintmap
t7615: be more explicit about diff algorithm used
t6423: fix a comment that accidentally reversed two commits
stash: remove merge-recursive.h include
Junio C Hamano [Sat, 29 Mar 2025 07:39:10 +0000 (16:39 +0900)]
Merge branch 'jk/use-wunreachable-code-for-devs'
Enable -Wunreachable-code for developer builds.
* jk/use-wunreachable-code-for-devs:
config.mak.dev: enable -Wunreachable-code
git-compat-util: add NOT_CONSTANT macro and use it in atfork_prepare()
run-command: use errno to check for sigfillset() error
Junio C Hamano [Sat, 29 Mar 2025 07:39:08 +0000 (16:39 +0900)]
Merge branch 'jk/fetch-ref-prefix-cleanup'
In protocol v2 where the refs advertisement is constrained, we try
to tell the server side not to limit the advertisement when there
is no specific need to, which has been the source of confusion and
recent bugs. Revamp the logic to simplify.
* jk/fetch-ref-prefix-cleanup:
fetch: use ref prefix list to skip ls-refs
fetch: avoid ls-refs only to ask for HEAD symref update
fetch: stop protecting additions to ref-prefix list
fetch: ask server to advertise HEAD for config-less fetch
refspec_ref_prefixes(): clean up refspec_item logic
t5516: beef up exact-oid ref prefixes test
t5516: drop NEEDSWORK about v2 reachability behavior
t5516: prefer "oid" to "sha1" in some test titles
t5702: fix typo in test name
First step of deprecating and removing merge-recursive.
* en/merge-ort-prepare-to-remove-recursive:
am: switch from merge_recursive_generic() to merge_ort_generic()
merge-ort: fix merge.directoryRenames=false
t3650: document bug when directory renames are turned off
merge-ort: support having merge verbosity be set to 0
merge-ort: allow rename detection to be disabled
merge-ort: add new merge_ort_generic() function
Junio C Hamano [Sat, 29 Mar 2025 07:39:06 +0000 (16:39 +0900)]
Merge branch 'cc/signed-fast-export-import'
"git fast-export | git fast-import" learns to deal with commit and
tag objects with embedded signatures a bit better.
* cc/signed-fast-export-import:
fast-export, fast-import: add support for signed-commits
fast-export: do not modify memory from get_commit_buffer
git-fast-export.adoc: clarify why 'verbatim' may not be a good idea
fast-export: rename --signed-tags='warn' to 'warn-verbatim'
fast-export: fix missing whitespace after switch
git-fast-import.adoc: add missing LF in the BNF
Philippe Blain [Fri, 28 Mar 2025 17:07:49 +0000 (17:07 +0000)]
p9210: fix 'scalar clone' when running from a detached HEAD
In p9210-scalar-clone.sh, we test using 'scalar clone' to clone
$GIT_PERF_LARGE_REPO (copied locally as 'to-clone'), which defaults to
the git.git checkout we are running the test from.
When --branch is not specified (as in this test), 'scalar clone' tries
to get the default branch of the remote repository by parsing the output
of 'git ls-remote --symref $URL HEAD', as implemented in
scalar.c:remote_default_branch. When the git.git checkout we are running
the test from is in detached HEAD, this fails and we fall back to using
the name of the currently checked out branch in the newly initialized
repository, which in this case is the value returned earlier in
cmd_clone by repo_default_branch_name.
We then invoke 'git checkout -t origin/$branch', with $branch being the
name we got from remote_default_branch. This invocation fails if
'$branch' does not exist as a branch in the current git.git checkout.
Fix this by creating a local branch in 'to-clone' in the setup test
"enable server-side partial clone", making sure to use '-B' in case a
branch named 'test-branch' already exists.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Fri, 28 Mar 2025 17:07:48 +0000 (17:07 +0000)]
p7821: fix test_perf invocation for prereqs
Since 5dccd9155f (t/perf: add iteration setup mechanism to perf-lib,
2022-04-04), perf tests need to declare their prerequisites with
'--prereq', after the test title. p7821 was forgotten in that commit,
such that running that test on a machine where the PCRE prereq is not
satisfied aborts the test with:
error: bug in the test script: test_wrapper_ needs 2 positional parameters
Fix this by correcting the two 'test_perf' invocations in that test
suite.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Fri, 28 Mar 2025 14:45:40 +0000 (14:45 +0000)]
merge-file doc: set conflict-marker-size attribute
When committing a conflict resolution for a merge containing 1f010d6bdf7 (doc: use .adoc extension for AsciiDoc files, 2025-01-20)
my pre-commit hook failed because "git diff --check" thought there was
a left over conflict marker in "merge-file.adoc". Fix this by setting
the "conflict-marker-size" attribute as we do for all the other
documentation files that contain example conflict markers.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 29 Mar 2025 01:10:25 +0000 (10:10 +0900)]
Merge branch 'tb/incremental-midx-part-2' into ps/cat-file-filter-batch
* tb/incremental-midx-part-2:
midx: implement writing incremental MIDX bitmaps
pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
pack-bitmap.c: keep track of each layer's type bitmaps
ewah: implement `struct ewah_or_iterator`
pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
pack-bitmap.c: compute disk-usage with incremental MIDXs
pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
pack-bitmap.c: open and store incremental bitmap layers
pack-revindex: prepare for incremental MIDX bitmaps
Documentation: describe incremental MIDX bitmaps
Documentation: remove a "future work" item from the MIDX docs