git.ipfire.org Git - thirdparty/git.git/log

Documentation: stop depending on Perl to generate command list

The "cmd-list.perl" script is used to extract the list of commands part
of a specific category and extracts the description of each command from
its respective manpage. The generated output is then included in git(1)
to list all Git commands.

The script is written in Perl. Refactor it to use shell scripting
exclusively so that we can get rid of the mandatory dependency on Perl
to build our documentation.

The converted script is slower compared to its Perl implementation. But
by being careful and not spawning external commands in `format_one ()`
we can mitigate the performance hit to a reasonable level:

    Benchmark 1: Perl
      Time (mean ± σ):      10.3 ms ±   0.2 ms    [User: 7.0 ms, System: 3.3 ms]
      Range (min … max):    10.0 ms …  11.1 ms    200 runs

    Benchmark 2: Shell
      Time (mean ± σ):      74.4 ms ±   0.4 ms    [User: 48.6 ms, System: 24.7 ms]
      Range (min … max):    73.1 ms …  75.5 ms    200 runs

    Summary
      Perl ran
        7.23 ± 0.13 times faster than Shell

While a sevenfold slowdown is significant, the benefit of not requiring
Perl for a fully-functioning Git installation outweighs waiting a couple
of milliseconds longer during the build process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: stop depending on Perl to massage user manual

The "fix-texi.perl" script is used to fix up the output of
`docbook2x-texi`:

- It changes the filename to be "git.info".

- It changes the directory category and entry.

The script is written in Perl, but it can be rather trivially converted
to a shell script. Do so to remove the dependency on Perl for building
the user manual.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

request-pull: stop depending on Perl

While git-request-pull(1) is written as a shell script, for it to
function we depend on Perl being available. The script gets installed
unconditionally though, regardless of whether or not Perl is even
available on the system. When it's not available, the `@PERL_PATH@`
variable may be substituted with a nonexistent executable path and thus
cause the script to fail.

Refactor the script so that it does not depend on Perl at all anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

filter-branch: stop depending on Perl

While git-filter-branch(1) is written as a shell script, the
`--state-branch` feature depends on Perl to save and extract the object
ID mappings. This can lead to subtle breakage though:

  - We execute `perl` directly without respecting the `PERL_PATH`
    configured by the distribution. As such, it may happen that we use
    the wrong version of Perl.

  - We install the script unchanged even if Perl isn't available at all
    on the system, so using `--state-branch` would lead to failure
    altogether in that case.

Fix this by dropping Perl and instead implementing the feature with
shell scripting exclusively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/test-wo-perl-prereq' into ps/fewer-perl

* ps/test-wo-perl-prereq:
  t5703: refactor test to not depend on Perl
  t5316: refactor `max_chain()` to not depend on Perl
  t0210: refactor trace2 scrubbing to not use Perl
  t0021: refactor `generate_random_characters()` to not depend on Perl
  t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
  t/lib-t6000: refactor `name_from_description()` to not depend on Perl
  t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
  t: refactor tests depending on Perl for textconv scripts
  t: refactor tests depending on Perl to print data
  t: refactor tests depending on Perl substitution operator
  t: refactor tests depending on Perl transliteration operator
  Makefile: stop requiring Perl when running tests
  meson: stop requiring Perl when tests are enabled
  t: adapt existing PERL prerequisites
  t: introduce PERL_TEST_HELPERS prerequisite
  t: adapt `test_readlink()` to not use Perl
  t: adapt `test_copy_bytes()` to not use Perl
  t: adapt character translation helpers to not use Perl
  t: refactor environment sanitization to not use Perl
  t: skip chain lint when PERL_PATH is unset

The fourth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'dk/vimdiff-doc-fix'

Doc update.

* dk/vimdiff-doc-fix:
vimdiff: clarify the sigil used for marking the buffer to save

Merge branch 'fr/vimdiff-layout-fixes'

Layout configuration in vimdiff backend didn't work as advertised,
which has been corrected.

* fr/vimdiff-layout-fixes:
mergetools: vimdiff: add tests for layout with REMOTE as the target
mergetools: vimdiff: fix layout where REMOTE is the target

Merge branch 'es/meson-build-skip-coccinelle'

Build fix.

* es/meson-build-skip-coccinelle:
meson: disable coccinelle configuration when building from a tarball

Merge branch 'ta/bulk-checkin-signed-compare-false-warning-fix'

Compiler warnings workaround.

* ta/bulk-checkin-signed-compare-false-warning-fix:
bulk-checkin: fix sign compare warnings

Merge branch 'rs/clear-commit-marks-simplify'

Code clean-up.

* rs/clear-commit-marks-simplify:
commit: move clear_commit_marks_many() loop body to clear_commit_marks()

Merge branch 'tb/incremental-midx-part-2'

Incrementally updating multi-pack index files.

* tb/incremental-midx-part-2:
  midx: implement writing incremental MIDX bitmaps
  pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
  pack-bitmap.c: keep track of each layer's type bitmaps
  ewah: implement `struct ewah_or_iterator`
  pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
  pack-bitmap.c: compute disk-usage with incremental MIDXs
  pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
  pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
  pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
  pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
  pack-bitmap.c: open and store incremental bitmap layers
  pack-revindex: prepare for incremental MIDX bitmaps
  Documentation: describe incremental MIDX bitmaps
  Documentation: remove a "future work" item from the MIDX docs

Merge branch 'ps/reftable-sans-compat-util'

Make the code in reftable library less reliant on the service
routines it used to borrow from Git proper, to make it easier to
use by external users of the library.

* ps/reftable-sans-compat-util:
  Makefile: skip reftable library for Coccinelle
  reftable: decouple from Git codebase by pulling in "compat/posix.h"
  git-compat-util.h: split out POSIX-emulating bits
  compat/mingw: split out POSIX-related bits
  reftable/basics: introduce `REFTABLE_UNUSED` annotation
  reftable/basics: stop using `SWAP()` macro
  reftable/stack: stop using `sleep_millisec()`
  reftable/system: introduce `reftable_rand()`
  reftable/reader: stop using `ARRAY_SIZE()` macro
  reftable/basics: provide wrappers for big endian conversion
  reftable/basics: stop using `st_mult()` in array allocators
  reftable: stop using `BUG()` in trivial cases
  reftable/record: don't `BUG()` in `reftable_record_cmp()`
  reftable/record: stop using `BUG()` in `reftable_record_init()`
  reftable/record: stop using `COPY_ARRAY()`
  reftable/blocksource: stop using `xmmap()`
  reftable/stack: stop using `write_in_full()`
  reftable/stack: stop using `read_in_full()`

Merge branch 'ps/ci-meson-check-build-docs'

CI update.

* ps/ci-meson-check-build-docs:
ci: perform build and smoke tests for Meson docs

Merge branch 'tb/http-curl-keepalive'

TCP keepalive behaviour on http transports can now be configured by
calling cURL library.

* tb/http-curl-keepalive:
  http.c: allow custom TCP keepalive behavior via config
  http.c: inline `set_curl_keepalive()`
  http.c: introduce `set_long_from_env()` for convenience
  http.c: remove unnecessary casts to long

Merge branch 'tb/refspec-fetch-cleanup'

Code clean-up.

* tb/refspec-fetch-cleanup:
  refspec: replace `refspec_item_init()` with fetch/push variants
  refspec: remove refspec_item_init_or_die()
  refspec: replace `refspec_init()` with fetch/push variants
  refspec: treat 'fetch' as a Boolean value

Merge branch 'ms/reftable-block-writer-errors'

Give more meaningful error return values from block writer layer of
the reftable ref-API backend.

* ms/reftable-block-writer-errors:
  reftable: adapt write_object_record() to propagate block_writer_add() errors
  reftable: adapt writer_add_record() to propagate block_writer_add() errors
  reftable: propagate specific error codes in block_writer_add()

Merge branch 'en/assert-wo-side-effects'

Ensure what we write in assert() does not have side effects,
and introduce ASSERT() macro to mark those that cannot be
mechanically checked for lack of side effects.

* en/assert-wo-side-effects:
  treewide: replace assert() with ASSERT() in special cases
  ci: add build checking for side-effects in assert() calls
  git-compat-util: introduce ASSERT() macro

t5703: refactor test to not depend on Perl

We use Perl due to two different reasons in t5703:

- To filter advertised capabilities.

- To set up a CGI script with HTTPD.

Refactor the first category to use `test_grep` instead. Refactoring the
second category would be a bit more involved, so instead we add the
PERL_TEST_HELPERS prerequisite to those individual tests now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5316: refactor `max_chain()` to not depend on Perl

The `max_chain()` helper function is used to extract the maximum delta
chain of a packfile as printed by git-index-pack(1). The script uses
Perl to extract that data, but it can be trivially refactored to use
awk(1) instead.

Refactor the helper accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0210: refactor trace2 scrubbing to not use Perl

The output generated by our trace2 mechanism contains several fields
that are dependent on the environment they're being run in, which makes
it somewhat harder to test it. As a countermeasure we scrub the output
and strip out any fields that contain such information.

The logic to do so is implemented in Perl, but it can be trivially
ported to instead use sed(1). Refactor the code accordingly so that we
can drop the PERL_TEST_HELPERS prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0021: refactor `generate_random_characters()` to not depend on Perl

The `generate_random_characters()` helper function generates N
random characters in the range 'a-z' and writes them into a file. The
logic currently uses Perl, but it can be adapted rather easily by:

  - Making `test-tool genrandom` generate an infinite stream.

  - Using `tr -dc` to strip all characters which aren't in the range of
    'a-z'.

  - Using `test_copy_bytes()` to copy the first N bytes.

This allows us to drop the PERL_TEST_HELPERS prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl

Our Apache HTTPD setup exposes an "one_time_perl" endpoint to access
repositories. If used, we execute the "apply-one-time-perl.sh" CGI
script that checks whether we have a "one-time-perl" script. If so, that
script gets executed so that it can munge what would be served. Once
done, the script gets removed so that it doesn't execute a second time.

As the name says, this functionality expects the user to pass a Perl
script. This isn't really necessary though: we can just as easily
implement the same thing with arbitrary scripts.

Refactor the code so that we instead expect an arbitrary script to
exist and rename the functionality to "one-time-script". Adapt callers
to use shell utilities instead of Perl so that we can drop the
PERL_TEST_HELPERS prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/lib-t6000: refactor `name_from_description()` to not depend on Perl

The `name_from_description()` test helper uses Perl to munge a given
description and convert it into a name. Refactor it to instead use a
combination of sed(1) and tr(1) so that we drop PERL_TEST_HELPERS
prerequisites in users of this library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl

The `sanitize_pgp()` test helper uses Perl to strip PGP signatures from
stdin. Refactor it to instead use sed(1) so that we drop the
PERL_TEST_HELPERS prerequisite in users of this library.

Note that we have to add PERL_TEST_HELPERS to a subset of tests in t6300
now that the test suite doesn't bail out early anymore in case the
prerequisite isn't set.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: refactor tests depending on Perl for textconv scripts

We have a couple of tests that depend on Perl for textconv scripts.
Refactor these tests to instead be implemented via shell utilities so
that we can drop a couple of PERL_TEST_HELPERS prerequisites.

Note that the conversion in t4030 is not a one-to-one equivalent to the
previous textconv script. Before this change we used to essentially do a
hexdump via Perl. The obvious conversion here would be to use `test-tool
hexdump` like we do for the other tests. But this would lead to a ripple
effect where we would have to adapt a bunch of other tests with a bunch
of seemingly unrelated changes, which would be somewhat awkward.

Instead, we're going with the minimum viable change: the test files we
write contain "\001" and "\000", and the test's expectation is that
those get translated into proper ASCII characters. So instead of doing a
full hexdump, we simply use tr(1) to translate these specific bytes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: refactor tests depending on Perl to print data

A bunch of tests rely on Perl to print data in various different ways.
These usages fall into the following categories:

  - Print data conditionally by matching patterns. These usecases can be
    converted to use awk(1) rather easily.

  - Print data repeatedly. These usecases can typically be converted to
    use a combination of `test-tool genzeros` and sed(1).

  - Print data in reverse. These usecases can be converted to use
    awk(1) or `sort -r`.

Refactor the tests accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: refactor tests depending on Perl substitution operator

We have a bunch of tests that use Perl to perform substitution via the
"s/" operator. These usecases can be trivially replaced with sed(1) and
tr(1).

Refactor the tests accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: refactor tests depending on Perl transliteration operator

We have a bunch of tests that use Perl to perform character
transliteration via the "y/" or "tr/" operator. These usecases can be
trivially replaced with tr(1).

Refactor the tests accordingly so that we can drop a couple of
PERL_TEST_HELPERS prerequisites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: stop requiring Perl when running tests

The Makefile for our tests has a couple of targets that depend on Perl.
Adapt those targets to only run conditionally in case Perl is available
on the system so that it becomes possible to run the test suite without
Perl.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

meson: stop requiring Perl when tests are enabled

The Perl interpreter used to be a strict dependency for running our test
suite. This requirement is explicit in the Meson build system, where we
require Perl to be present unless tests have been disabled.

With the preceding commits we have loosened this restriction so that it
is now possible to run tests when Perl is unavailable. Loosen the above
requirement accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: adapt existing PERL prerequisites

A couple of our tests depend on the PERL prerequisite even though it
isn't needed. These tests fall into one of the following classes:

  - The underlying logic used to be implemented in Perl but isn't
    anymore. Here we can simply drop the dependency altogether.

  - The test logic used to depend on Perl but doesn't anymore. Again, we
    can simply drop the dependency.

  - The test logic still relies on a Perl interpreter. These tests
    should use the newly introduced PERL_TEST_HELPERS prerequisite.

Adapt test cases accordingly.

Note that in t1006 we have to introduce another new prerequisite
depending on whether or not the IPC::Open2 module is available. Funny
enough, when starting to use `test_lazy_prereq` to do so we also get a
conflict of variables with the "script" variable that contains the Perl
logic because `test_run_lazy_prereq_` also sets that variable. We thus
rename the variable in t1006 to "perl_script".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: introduce PERL_TEST_HELPERS prerequisite

In the early days of Git, Perl was used quite prominently throughout the
project. This has changed significantly as almost all of the executables
we ship nowadays have eventually been rewritten in C. Only a handful of
subsystems remain that require Perl:

  - gitweb, a read-only web interface.

  - A couple of scripts that allow importing repositories from GNU Arch,
    CVS and Subversion.

  - git-send-email(1), which can be used to send mails.

  - git-request-pull(1), which is used to request somebody to pull from
    a URL by sending an email.

  - git-filter-branch(1), which uses Perl with the `--state-branch`
    option. This command is typically recommended against nowadays in
    favor of git-filter-repo(1).

  - Our Perl bindings for Git.

  - The netrc Git credential helper.

None of these subsystems can really be considered to be part of the
"core" of Git, and an installation without them is fully functional.
It is more likely than not that an end user wouldn't even notice that
any features are missing if those tools weren't installed. But while
Perl nowadays very much is an optional dependency of Git, there is a
significant limitation when Perl isn't available: developers cannot run
our test suite.

Preceding commits have started to lift this restriction by removing the
strict dependency on Perl in many central parts of the test library. But
there are still many tests that rely on small Perl helpers to do various
different things.

Introduce a new PERL_TEST_HELPERS prerequisite that guards all tests
that require Perl. This prerequisite is explicitly different than the
preexisting PERL prerequisite:

  - PERL records whether or not features depending on the Perl
    interpreter are built.

  - PERL_TEST_HELPERS records whether or not a Perl interpreter is
    available for our tests.

By having these two separate prerequisites we can thus distinguish
between tests that inherently depend on Perl because the underlying
feature does, and those tests that depend on Perl because the test
itself is using Perl.

Adapt all tests to set the PERL_TEST_HELPERS prerequisite as needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: adapt `test_readlink()` to not use Perl

The `test_readlink()` helper function reads a symbolic link and returns
the path it is pointing to. It is thus equivalent to the readlink(1)
utility, which isn't available on all supported platforms. As such, it
is implemented using Perl so that we can use it even on platforms where
the shell utility isn't available.

While using readlink(1) is not an option, what we can do is to implement
the logic ourselves in our test-tool. Do so, which allows a bunch of
tests to pass when Perl is not available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: adapt `test_copy_bytes()` to not use Perl

The `test_copy_bytes()` helper function copies up to N bytes from stdin
to stdout. This is implemented using Perl, but it can be trivially
adapted to instead use dd(1).

Refactor the helper accordingly, which allows a bunch of tests to pass
when Perl is not available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: adapt character translation helpers to not use Perl

We have a couple of helper functions that translate characters, e.g.
from LF to NUL or NUL to 'Q' and vice versa. These helpers use Perl
scripts, but they can be trivially adapted to instead use tr(1).

Note that one specialty here is the handling of NUL characters in tr(1),
which historically wasn't implemented correctly on all platforms. But
quoting tr(1p):

    It was considered that automatically stripping NUL characters from
    the input was not correct functionality.  However, the removal of -n
    in a later proposal does not remove the requirement that tr
    correctly process NUL characters in its input stream.

So when tr(1) is implemented following the POSIX standard then it is
expected to handle the transliteration of NUL just fine.

Refactor the helpers accordingly, which allows a bunch of tests to pass
when Perl is not available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: refactor environment sanitization to not use Perl

Before executing tests we first sanitize the environment. Part of the
sanitization is to unset a couple of environment variables that we know
will change the behaviour of Git. This is done with a small Perl script,
which has the consequence that having a Perl interpreter available is a
strict requirement for running our unit tests.

The logic itself isn't particularly involved: we simply unset every
environment variable whose key starts with 'GIT_', but then explicitly
allow a subset of these.

Refactor the logic to instead use sed(1) so that it becomes possible to
execute our tests without Perl.

Based-on-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: skip chain lint when PERL_PATH is unset

Our chainlint script verifies that test files have proper '&&' chains.
This script is written in Perl and executed for every test file before
executing the test logic itself.

In subsequent commits we're about to refactor our test suite so that
Perl becomes an optional dependency, only. And while it is already
possible to disable this linter, developers that don't have Perl
available at all would always have to disable the linter manually, which
is rather cumbersome.

Disable the chain linter automatically in case PERL_PATH isn't set to
make this a bit less annoying. Bail out with an error in case the
developer has asked explicitly for the chain linter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The third batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'js/imap-send-peer-cert-verify'

* js/imap-send-peer-cert-verify:
imap-send: explicitly verify the peer certificate

Merge branch 'js/mingw-admins-are-special'

"Dubious ownership" checks on Windows has been tightened up.

* js/mingw-admins-are-special:
test-tool path-utils: support debugging "dubious ownership" issues
mingw: special-case administrators even more

Merge branch 'tb/bitamp-typofix'

Typofix.

* tb/bitamp-typofix:
pseudo-merge.h: fix a typo

Merge branch 'dm/completion-remote-names-fix'

The bash command line completion script (in contrib/) has been
updated to cope with remote repository nicknames with slashes in
them.

* dm/completion-remote-names-fix:
completion: fix bugs with slashes in remote names
completion: add helper to count path components

Merge branch 'pw/doc-pack-refs-markup-fix'

Doc markup fix.

* pw/doc-pack-refs-markup-fix:
pack-refs doc: fix indentation for --exclude

Merge branch 'pw/build-breaking-changes-doc'

A documentation page was left out from formatting and installation,
which has been corrected.

* pw/build-breaking-changes-doc:
docs: add BreakingChanges to TECH_DOCS target

Merge branch 'jh/hash-init-fixes'

An earlier code refactoring of the hash machinery missed a few
required calls to init_fn.

* jh/hash-init-fixes:
index-pack, unpack-objects: restore missing ->init_fn

Merge branch 'tb/combine-cruft-below-size'

"git repack" learned "--combine-cruft-below-size" option that
controls how cruft-packs are combined.

* tb/combine-cruft-below-size:
  repack: begin combining cruft packs with `--combine-cruft-below-size`
  repack: avoid combining cruft packs with `--max-cruft-size`
  t/t7704-repack-cruft.sh: consolidate `write_blob()`
  t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
  t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests

Merge branch 'ja/doc-branch-markup'

Doc mark-up updates.

* ja/doc-branch-markup:
doc: apply new format to git-branch man page
completion: take into account the formatting backticks for options

Merge branch 'cc/lop-remote'

Bugfix in newly introduced large-object-promisor remote support.

* cc/lop-remote:
  promisor-remote: compare remote names case sensitively
  promisor-remote: fix possible issue when no URL is advertised
  promisor-remote: fix segfault when remote URL is missing
  t5710: arrange to delete the client before cloning

Merge branch 'jc/name-rev-stdin'

Using "git name-rev --stdin" as an example, improve the framework to
prepare tests to pretend to be in the future where the breaking
changes have already happened.

* jc/name-rev-stdin:
  name-rev: remove "--stdin" support
  t6120: further modernize
  t6120: avoid hiding "git" exit status
  t: introduce WITH_BREAKING_CHANGES prerequisite
  t: extend test_lazy_prereq
  t: document test_lazy_prereq

Merge branch 'kn/ci-meson-check-build-docs-fix'

GitHub Actions CI switched on a CI/CD variable that does not exist
when choosing what packages to install etc., which has been
corrected.

* kn/ci-meson-check-build-docs-fix:
ci/github: add missing 'CI_JOB_IMAGE' env variable

Merge branch 'aj/doc-restore-p-update'

Stale description in "git restore -p" documentation has been
updated.

* aj/doc-restore-p-update:
doc: restore: remove note on --patch w/ pathspecs

The second batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'hj/doc-rev-list-ancestry-fix'

Doc update.

* hj/doc-rev-list-ancestry-fix:
doc: add missing commit C to the graph for --ancestry-path=H D..M

Merge branch 'es/meson-building-docs-requires-perl'

Build update.

* es/meson-building-docs-requires-perl:
meson: fix perl detection when docs are enabled, but perl bindings aren't

Merge branch 'en/random-cleanups'

Miscellaneous code clean-ups.

* en/random-cleanups:
  merge-ort: remove extraneous word in comment
  merge-ort: fix accidental strset<->strintmap
  t7615: be more explicit about diff algorithm used
  t6423: fix a comment that accidentally reversed two commits
  stash: remove merge-recursive.h include

Merge branch 'rs/xdiff-context-length-fix'

The xdiff code on 32-bit platform misbehaved when an insanely large
context size is given, which has been corrected.

* rs/xdiff-context-length-fix:
xdiff: avoid arithmetic overflow in xdl_get_hunk()

Merge branch 'jk/use-wunreachable-code-for-devs'

Enable -Wunreachable-code for developer builds.

* jk/use-wunreachable-code-for-devs:
  config.mak.dev: enable -Wunreachable-code
  git-compat-util: add NOT_CONSTANT macro and use it in atfork_prepare()
  run-command: use errno to check for sigfillset() error

Merge branch 'en/diff-rename-follow-fix'

A corner-case bug in "git log --follow -B" has been fixed.

* en/diff-rename-follow-fix:
diffcore-rename: fix BUG when break detection and --follow used together

Merge branch 'tb/multi-cruft-pack-refresh-fix'

Certain "cruft" objects would have never been refreshed when there
are multiple cruft packs in the repository, which has been
corrected.

* tb/multi-cruft-pack-refresh-fix:
builtin/pack-objects.c: freshen objects from existing cruft packs

Merge branch 'am/dir-dedup-decl-of-repository'

Code cleanup.

* am/dir-dedup-decl-of-repository:
dir.h: remove duplicate forward declaration of struct repository

Merge branch 'ps/meson-with-breaking-changes'

Update meson based build procedure for breaking changes support.

* ps/meson-with-breaking-changes:
  meson: don't install git-pack-redundant(1) docs with breaking changes
  meson: don't compile git-pack-redundant(1) with breaking changes
  meson: define WITH_BREAKING_CHANGES when enabling breaking changes

Merge branch 'jk/fetch-ref-prefix-cleanup'

In protocol v2 where the refs advertisement is constrained, we try
to tell the server side not to limit the advertisement when there
is no specific need to, which has been the source of confusion and
recent bugs.  Revamp the logic to simplify.

* jk/fetch-ref-prefix-cleanup:
  fetch: use ref prefix list to skip ls-refs
  fetch: avoid ls-refs only to ask for HEAD symref update
  fetch: stop protecting additions to ref-prefix list
  fetch: ask server to advertise HEAD for config-less fetch
  refspec_ref_prefixes(): clean up refspec_item logic
  t5516: beef up exact-oid ref prefixes test
  t5516: drop NEEDSWORK about v2 reachability behavior
  t5516: prefer "oid" to "sha1" in some test titles
  t5702: fix typo in test name

Merge branch 'ab/decorate-code-cleanup'

Code clean-up.

* ab/decorate-code-cleanup:
decorate: fix sign comparison warnings

Merge branch 'en/merge-ort-prepare-to-remove-recursive'

First step of deprecating and removing merge-recursive.

* en/merge-ort-prepare-to-remove-recursive:
  am: switch from merge_recursive_generic() to merge_ort_generic()
  merge-ort: fix merge.directoryRenames=false
  t3650: document bug when directory renames are turned off
  merge-ort: support having merge verbosity be set to 0
  merge-ort: allow rename detection to be disabled
  merge-ort: add new merge_ort_generic() function

Merge branch 'ps/refname-avail-check-optim'

The code paths to check whether a refname X is available (by seeing
if another ref X/Y exists, etc.) have been optimized.

* ps/refname-avail-check-optim:
  refs: reuse iterators when determining refname availability
  refs/iterator: implement seeking for files iterators
  refs/iterator: implement seeking for packed-ref iterators
  refs/iterator: implement seeking for ref-cache iterators
  refs/iterator: implement seeking for reftable iterators
  refs/iterator: implement seeking for merged iterators
  refs/iterator: provide infrastructure to re-seek iterators
  refs/iterator: separate lifecycle from iteration
  refs: stop re-verifying common prefixes for availability
  refs/files: batch refname availability checks for initial transactions
  refs/files: batch refname availability checks for normal transactions
  refs/reftable: batch refname availability checks
  refs: introduce function to batch refname availability checks
  builtin/update-ref: skip ambiguity checks when parsing object IDs
  object-name: allow skipping ambiguity checks in `get_oid()` family
  object-name: introduce `repo_get_oid_with_flags()`

Merge branch 'cc/signed-fast-export-import'

"git fast-export | git fast-import" learns to deal with commit and
tag objects with embedded signatures a bit better.

* cc/signed-fast-export-import:
  fast-export, fast-import: add support for signed-commits
  fast-export: do not modify memory from get_commit_buffer
  git-fast-export.adoc: clarify why 'verbatim' may not be a good idea
  fast-export: rename --signed-tags='warn' to 'warn-verbatim'
  fast-export: fix missing whitespace after switch
  git-fast-import.adoc: add missing LF in the BNF

Start 2.50 cycle (batch #1)

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ja/doc-block-delimiter-markup-fix'

Doc markup updates.

* ja/doc-block-delimiter-markup-fix:
doc: add a blank line around block delimiters

Merge branch 'en/merge-process-renames-crash-fix'

The merge-recursive and merge-ort machinery crashed in corner cases
when certain renames are involved.

* en/merge-process-renames-crash-fix:
merge-ort: fix slightly overzealous assertion for rename-to-self
t6423: add a testcase causing a failed assertion in process_renames

Merge branch 'ua/some-builtins-wo-the-repository'

A handful of built-in command implementations have been rewritten
to use the repository instance supplied by git.c:run_builtin(), its
caller.

* ua/some-builtins-wo-the-repository:
  builtin/checkout-index: stop using `the_repository`
  builtin/for-each-ref: stop using `the_repository`
  builtin/ls-files: stop using `the_repository`
  builtin/pack-refs: stop using `the_repository`
  builtin/send-pack: stop using `the_repository`
  builtin/verify-commit: stop using `the_repository`
  builtin/verify-tag: stop using `the_repository`
  config: teach repo_config to allow `repo` to be NULL

Merge branch 'tb/refs-exclude-fixes'

The refname exclusion logic in the packed-ref backend has been
broken for some time, which confused upload-pack to advertise
different set of refs.  This has been corrected.

* tb/refs-exclude-fixes:
  refs.c: stop matching non-directory prefixes in exclude patterns
  refs.c: remove empty '--exclude' patterns

Merge branch 'sj/ref-consistency-checks-more'

"git fsck" becomes more careful when checking the refs.

* sj/ref-consistency-checks-more:
  builtin/fsck: add `git refs verify` child process
  packed-backend: check whether the "packed-refs" is sorted
  packed-backend: add "packed-refs" entry consistency check
  packed-backend: check whether the refname contains NUL characters
  packed-backend: add "packed-refs" header consistency check
  packed-backend: check if header starts with "# pack-refs with: "
  packed-backend: check whether the "packed-refs" is regular file
  builtin/refs: get worktrees without reading head information
  t0602: use subshell to ensure working directory unchanged

Merge branch 'jt/diff-pairs'

A post-processing filter for "diff --raw" output has been
introduced.

* jt/diff-pairs:
  builtin/diff-pairs: allow explicit diff queue flush
  builtin: introduce diff-pairs command
  diff: add option to skip resolving diff statuses
  diff: return diff_filepair from diff queue helpers

mergetools: vimdiff: add tests for layout with REMOTE as the target

Add some tests to make sure that now "REMOTE" can be used as a target
(ie. can be used together with the "@" marker) inside
"mergetool.vimdiff.layout"

Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mergetools: vimdiff: fix layout where REMOTE is the target

"mergetool.vimdiff.layout" is used to define the vim layout (ie. how
windows, tabs and buffers are physically organized) when resolving
conflicts.

For example, if we set it to this:

    "(LOCAL,BASE,REMOTE)/MERGED"

...vim will open and show this layout:

    ------------------------------------------
    |             |           |              |
    |   LOCAL     |   BASE    |   REMOTE     |
    |             |           |              |
    ------------------------------------------
    |                                        |
    |                MERGED                  |
    |                                        |
    ------------------------------------------

By default, whatever ends up been written to the "MERGED" window will
become the file which conflict we are resolving.

However, it is possible to use the "@" symbol to specify a different
one.  For example, if we use this slightly different version of the
previously used string:

    "(LOCAL,BASE,@REMOTE)/MERGED"

...then the user should proceed to edit the contents of the top right
window (instead of the bottom window) as *that* is what will become the
conflicts free file once vim is closed.

Before this commit, the "@" marker worked for all targets *except* for
"REMOTE". In other words, these worked as expected:

    "(@LOCAL,BASE,REMOTE)/MERGED"
    "(LOCAL,@BASE,REMOTE)/MERGED"
    "(LOCAL,BASE,REMOTE)/@MERGED"

...but this didn't:

    "(LOCAL,BASE,@REMOTE)/MERGED"

This commit fixes that.

Reported-by: kawarimidoll <kawarimidoll+git@gmail.com>
Suggested-by: D. Ben Knoble <ben.knoble@gmail.com>
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

meson: disable coccinelle configuration when building from a tarball

Wiring up coccinelle in the build, depends on running git commands to
get the list of files to operate on. Reasonable, for a feature mainly
used by people developing on git. If building git itself from a tarball
distribution of git's own source code, one likely does not need to run
coccinelle.

But running those git commands failed, and caused the build to error
out, if `spatch` was installed -- because the build assumed that its
presence indicated a desire to use it on this source tree. Instead, we
can expand the conditional to check for both `spatch` and the `.git`
file or directory.

Meson's `opt.require()` method allows us to add a prerequisite for the
feature option. If the prerequisite fails, then the option either:

- converts autodetection to disabled

- emits an informative error if the feature was set to enabled:
  ```
  ERROR: Feature coccinelle cannot be enabled: coccinelle can only be run from a git checkout
  ```

Signed-off-by: Eli Schwartz <eschwartz@gentoo.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

vimdiff: clarify the sigil used for marking the buffer to save

The original documentation from 7b5cf8be18 (vimdiff: add tool
documentation, 2022-03-30) mistakenly described the marker as an
asterisk, which is the character "*". The code and examples have always
looked for an arobase ("@").

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Acked-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bulk-checkin: fix sign compare warnings

In file bulk-checkin.c, three warnings are emitted by
"-Wsign-compare", two of which are caused by trivial loop iterator
type mismatches. For the third case, the type of `rsize` from

ssize_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf);

can be changed to size_t as both options of the ternary expression are
unsigned and the signedness of the variable isn't really needed
anywhere.

To prevent `read_result != rsize` making a clash, it is to be noted
that `read_result` is checked not to hold negative values. Therefore
casting the variable to size_t is a safe operation and enough to
remove the sign-compare warning.

Fix issues accordingly, and remove `DISABLE_SIGN_COMPARE_WARNINGS` to
enable "-Wsign-compare" for the file.

Signed-off-by: Tuomas Ahola <taahol@utu.fi>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

imap-send: explicitly verify the peer certificate

It is a bug to obtain the peer certificate without verifying it.

Having said that, from my reading of
https://www.openssl.org/docs/man1.1.1/man3/SSL_set_verify.html, it would
appear that Git is saved by the fact that it calls
`SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER, NULL)` already early on.

In other words, that `SSL_VERIFY_PEER` combined with the `NULL`
parameter (i.e. no overridden callback) would _already_ verify the peer
certificate. The fact that we later call `SSL_get_peer_certificate()`
is mistaken by CodeQL to mean that that peer certificate still needs to
be verified, but that had already happened at that point.

Nevertheless, it is better to verify the peer certificate explicitly
than to rely on some side effect that is really hard to reason about
(and that took me more than one business day to analyze fully). It also
makes it easier for static analyzers to validate the correctness of the
code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-tool path-utils: support debugging "dubious ownership" issues

This adds a new sub-sub-command for `test-tool`, simply passing through
the command-line arguments to the `is_path_owned_by_current_user()`
function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: special-case administrators even more

The check for dubious ownership has one particular quirk on Windows: if
running as an administrator, files owned by the Administrators _group_
are considered owned by the user.

The rationale for that is: When running in elevated mode, Git creates
files that aren't owned by the individual user but by the Administrators
group.

There is yet another quirk, though: The check I introduced to determine
whether the current user is an administrator uses the
`CheckTokenMembership()` function with the current process token. And
that check only succeeds when running in elevated mode!

Let's be a bit more lenient here and look harder whether the current
user is an administrator. We do this by looking for a so-called "linked
token". That token exists when administrators run in non-elevated mode,
and can be used to create a new process in elevated mode. And feeding
_that_ token to the `CheckTokenMembership()` function succeeds!

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

completion: fix bugs with slashes in remote names

Previously, some calls to for-each-ref passed fixed numbers of path
components to strip from refs, assuming that remote names had no slashes
in them. This made completions like:

git push github/dseomn :com<Tab>

Result in:

git push github/dseomn :dseomn/completion-remote-slash

With this patch, it instead results in:

git push github/dseomn :completion-remote-slash

Signed-off-by: David Mandelberg <david@mandelberg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

completion: add helper to count path components

A follow-up commit will use this with for-each-ref to strip the right
number of path components from refnames.

Signed-off-by: David Mandelberg <david@mandelberg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: move clear_commit_marks_many() loop body to clear_commit_marks()

clear_commit_marks_many() clears multiple commits one by one. Move the
code for handling a single commit to clear_commit_marks() and call it
instead of the other way around, to simplify the code.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: implement writing incremental MIDX bitmaps

Now that the pack-bitmap machinery has learned how to read and interact
with an incremental MIDX bitmap, teach the pack-bitmap-write.c machinery
(and relevant callers from within the MIDX machinery) to write such
bitmaps.

The details for doing so are mostly straightforward. The main changes
are as follows:

  - find_object_pos() now makes use of an extra MIDX parameter which is
    used to locate the bit positions of objects which are from previous
    layers (and thus do not exist in the current layer's pack_order
    field).

    (Note also that the pack_order field is moved into struct
    write_midx_context to further simplify the callers for
    write_midx_bitmap()).

  - bitmap_writer_build_type_index() first determines how many objects
    precede the current bitmap layer and offsets the bits it sets in
    each respective type-level bitmap by that amount so they can be OR'd
    together.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators

Now that we have initialized arrays for each bitmap layer's type bitmaps
in the previous commit, adjust existing callers to use them in
preparation for multi-layered bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: keep track of each layer's type bitmaps

Prepare for reading the type-level bitmaps from previous bitmap layers
by maintaining an array for each type, where each element in that type's
array corresponds to one layer's bitmap for that type.

These fields will be used in a later commit to instantiate the 'struct
ewah_or_iterator' for each type.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ewah: implement `struct ewah_or_iterator`

While individual bitmap layers store different commit, type-level, and
pseudo-merge bitmaps, only the top-most layer is used to compute
reachability traversals.

Many functions which implement the aforementioned traversal rely on
enumerating the results according to the type-level bitmaps, and so
would benefit from a conceptual type-level bitmap that spans multiple
layers.

Implement `struct ewah_or_iterator` which is capable of enumerating
multiple EWAH bitmaps at once, and OR-ing the results together. When
initialized with, for example, all of the commit type bitmaps from each
layer, callers can pretend as if they are enumerating a large type-level
bitmap which contains the commits from *all* bitmap layers.

There are a couple of alternative approaches which were considered:

  - Decompress each EWAH bitmap and OR them together, enumerating a
    single (non-EWAH) bitmap. This would work, but has the disadvantage
    of decompressing a potentially large bitmap, which may not be
    necessary if the caller does not wish to read all of it.

  - Recursively call bitmap internal functions, reusing the "result" and
    "haves" bitmap from the top-most layer. This approach resembles the
    original implementation of this feature, but is inefficient in that
    it both (a) requires significant refactoring to implement, and (b)
    enumerates large sections of later bitmaps which are all zeros (as
    they pertain to objects in earlier layers).

    (b) is not so bad in and of itself, but can cause significant
    slow-downs when combined with expensive loop bodies.

This approach (enumerating an OR'd together version of all of the
type-level bitmaps from each layer) produces a significantly more
straightforward implementation with significantly less refactoring
required in order to make it work.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs

Prepare for using pseudo-merges with incremental MIDX bitmaps by
attempting to apply pseudo-merges from each layer when encountering a
given commit during a walk.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: compute disk-usage with incremental MIDXs

In a similar fashion as previous commits, use nth_midxed_pack() instead
of accessing the MIDX's ->packs array directly to support incremental
MIDXs.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs

Implement support for the special `--test-bitmap` mode of `git rev-list`
when using incremental MIDXs.

The bitmap_test_data structure is extended to contain a "base" pointer
that mirrors the structure of the bitmap chain that it is being used to
test.

When we find a commit to test, we first chase down the ->base pointer to
find the appropriate bitmap_test_data for the bitmap layer that the
given commit is contained within, and then perform the test on that
bitmap.

In order to implement this, light modifications are made to
bitmap_for_commit() to reimplement it in terms of a new function,
find_bitmap_for_commit(), which fills out a pointer which indicates the
bitmap layer which contains the given commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs

In a similar fashion as previous commits in the first phase of
incremental MIDXs, enumerate not just the packs in the current
incremental MIDX layer, but previous ones as well.

Likewise, in reuse_partial_packfile_from_bitmap(), when reusing only a
single pack from a MIDX, use the oldest layer's preferred pack as it is
likely to contain the largest number of reusable sections.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs

Since we may ask for a pack_id that is in an earlier MIDX layer relative
to the one corresponding to our bitmap, use nth_midxed_pack() instead of
accessing the ->packs array directly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs

The pack-bitmap machinery uses `bitmap_for_commit()` to locate the
EWAH-compressed bitmap corresponding to some given commit object.

Teach this function about incremental MIDX bitmaps by teaching it to
recur on earlier bitmap layers when it fails to find a given commit in
the current layer.

The changes to do so are as follows:

  - Avoid initializing hash_pos at its declaration, since
    bitmap_for_commit() is now a recursive function and may receive a
    NULL bitmap_index pointer as its first argument.

  - In cases where we would previously return NULL (to indicate that a
    lookup failed and the given bitmap_index does not contain an entry
    corresponding to the given commit), recursively call the function on
    the previous bitmap layer.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap.c: open and store incremental bitmap layers

Prepare the pack-bitmap machinery to work with incremental MIDXs by
adding a new "base" field to keep track of the bitmap index associated
with the previous MIDX layer.

The changes in this commit are mostly boilerplate to open the correct
bitmap(s), add them to the chain of bitmap layers along the "base"
pointer, ensure that the correct packs and their reverse indexes are
loaded across MIDX layers, etc.

While we're at it, keep track of a base_nr field to indicate how many
bitmap layers (including the current bitmap) exist. This will be used in
a future commit to allocate an array of 'struct ewah_bitmap' pointers to
collect all of the respective type bitmaps among all layers to
initialize a multi-EWAH iterator.

Subsequent commits will teach the functions within the pack-bitmap
machinery how to interact with these new fields.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-revindex: prepare for incremental MIDX bitmaps

Prepare the reverse index machinery to handle object lookups in an
incremental MIDX bitmap. These changes are broken out across a few
functions:

  - load_midx_revindex() learns to use the appropriate MIDX filename
    depending on whether the given 'struct multi_pack_index *' is
    incremental or not.

  - pack_pos_to_midx() and midx_to_pack_pos() now both take in a global
    object position in the MIDX pseudo-pack order, and find the
    earliest containing MIDX (similar to midx.c::midx_for_object().

  - midx_pack_order_cmp() adjusts its call to pack_pos_to_midx() by the
    number of objects in the base (since 'vb - midx->revindx_data' is
    relative to the containing MIDX, and pack_pos_to_midx() expects a
    global position).

    Likewise, this function adjusts its output by adding
    m->num_objects_in_base to return a global position out through the
    `*pos` pointer.

Together, these changes are sufficient to use the multi-pack index's
reverse index format for incremental multi-pack reachability bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: describe incremental MIDX bitmaps

Prepare to implement support for reachability bitmaps for the new
incremental multi-pack index (MIDX) feature over the following commits.

This commit begins by first describing the relevant format and usage
details for incremental MIDX bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: remove a "future work" item from the MIDX docs

One of the items listed as "future work" in the MIDX's technical
documentation is to extend the format to allow MIDXs to be written
incrementally across multiple layers.

This was suggested all the way back in ceab693d1f (multi-pack-index: add
design document, 2018-07-12), and implemented in b9497848df (Merge
branch 'tb/incremental-midx-part-1', 2024-08-19). Let's remove it
accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: begin combining cruft packs with `--combine-cruft-below-size`

The previous commit changed the behavior of repack's '--max-cruft-size'
to specify a cruft pack-specific override for '--max-pack-size'.

Introduce a new flag, '--combine-cruft-below-size' which is a
replacement for the old behavior of '--max-cruft-size'. This new flag
does explicitly what it says: it combines together cruft packs which are
smaller than a given threshold, and leaves alone ones which are
larger.

This accomplishes the original intent of '--max-cruft-size', which was
to avoid repacking cruft packs larger than the given threshold.

The new behavior is slightly different. Instead of building up small
packs together until the threshold is met, '--combine-cruft-below-size'
packs up *all* cruft packs smaller than the threshold. This means that
we may make a pack much larger than the given threshold (e.g., if you
aggregate 5 packs which are each 99 MiB in size with a threshold of 100
MiB).

But that's OK: the point isn't to restrict the size of the cruft packs
we generate, it's to avoid working with ones that have already grown too
large. If repositories still want to limit the size of the generated
cruft pack(s), they may use '--max-cruft-size'.

There's some minor test fallout as a result of the slight differences in
behavior between the old meaning of '--max-cruft-size' and the behavior
of '--combine-cruft-below-size'. In the test which is now called
"--combine-cruft-below-size combines packs", we need to use the new flag
over the old one to exercise that test's intended behavior. The
remainder of the changes there are to improve the clarity of the
comments.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>