The 8th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ap/use-test-seq-f-more'

Test clean-up.

* ap/use-test-seq-f-more:
t: use test_seq -f and pipes in a few more places

Merge branch 'db/doc-fetch-jobs-auto'

Doc update.

* db/doc-fetch-jobs-auto:
doc: fetch: document `--jobs=0` behavior

Merge branch 'mf/format-patch-honor-from-for-cover-letter'

"git format-patch --from=<me>" did not honor the command line
option when writing out the cover letter, which has been corrected.

* mf/format-patch-honor-from-for-cover-letter:
format-patch: fix From header in cover letter

Merge branch 'jh/alias-i18n'

Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing

Merge branch 'ps/tests-wo-iconv-fixes'

Some tests assumed "iconv" is available without honoring ICONV
prerequisite, which has been corrected.

* ps/tests-wo-iconv-fixes:
  t6006: don't use iconv(1) without ICONV prereq
  t5550: add ICONV prereq to tests that use "$HTTPD_URL/error"
  t4205: improve handling of ICONV prerequisite
  t40xx: don't use iconv(1) without ICONV prereq
  t: don't set ICONV prereq when iconv(1) is missing

Merge branch 'ps/ci-gitlab-msvc-updates'

CI update.

* ps/ci-gitlab-msvc-updates:
  gitlab-ci: handle failed tests on MSVC+Meson job
  gitlab-ci: use "run-test-slice-meson.sh"
  ci: make test slicing consistent across Meson/Make
  github: fix Meson tests not executing at all
  meson: fix MERGE_TOOL_DIR with "--no-bin-wrappers"
  ci: don't skip smallest test slice in GitLab
  ci: handle failures of test-slice helper

Merge branch 'jc/whitespace-incomplete-line'

It does not make much sense to apply the "incomplete-line"
whitespace rule to symbolic links, whose contents almost always
lack the final newline. "git apply" and "git diff" are now taught
to skip this rule for changes to symbolic links.

* jc/whitespace-incomplete-line:
whitespace: symbolic links usually lack LF at the end

Merge branch 'jc/checkout-switch-restore'

"git switch <name>", in an attempt to create a local branch <name>
after a remote tracking branch of the same name, gave an advice
message to disambiguate using "git checkout", which has been
updated to use "git switch".

* jc/checkout-switch-restore:
checkout: tell "parse_remote_branch" which command is calling it
checkout: pass program-readable token to unified "main"

Merge branch 'jk/ref-filter-lrstrip-optim'

Code clean-up.

* jk/ref-filter-lrstrip-optim:
  ref-filter: clarify lstrip/rstrip component counting
  ref-filter: avoid strrchr() in rstrip_ref_components()
  ref-filter: simplify rstrip_ref_components() memory handling
  ref-filter: simplify lstrip_ref_components() memory handling
  ref-filter: factor out refname component counting

Merge branch 'ps/history-ergonomics-updates'

UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input

Merge branch 'ps/for-each-ref-in-fixes'

A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips

Merge branch 'lo/repo-info-keys'

"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
repo: add new flag --keys to git-repo-info
repo: rename the output format "keyvalue" to "lines"

t4052: test for diffstat width when prefix contains ANSI escape codes

Add test checking the calculation of the diffstat display width when the
`line_prefix`, which is text that goes before the diffstat, contains
ANSI escape codes.

This situation happens, for example, when `git log --stat --graph` is
executed:
* `--stat` will create a diffstat for each commit
* `--graph` will stuff `line_prefix` with the graph portion of the log,
which contains ANSI escape codes to color the text

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: handle ANSI escape codes in prefix when calculating diffstat width

The diffstat width is calculated by taking the terminal width and
incorrectly subtracting the `strlen()` of `line_prefix`, instead of the
actual display width of `line_prefix`, which may contain ANSI escape
codes (e.g., ANSI-colored strings in `log --graph --stat`).

Utilize the display width instead, obtained via `utf8_strnwidth()` with
the flag `skip_ansi`.
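
The difference between the two measurements can be illustrated with a
minimal Python sketch. It is not Git's implementation: `display_width()`
and `diffstat_width()` are hypothetical stand-ins, the regular expression
only recognizes SGR color sequences, and every visible character is
assumed to occupy one column.

```python
import re

# Matches SGR color sequences such as "\x1b[31m". This is a simplification:
# other kinds of escape sequences are not handled.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def display_width(text: str) -> int:
    """Columns occupied by text once color escapes are skipped (ASCII only)."""
    return len(ANSI_SGR.sub("", text))


def diffstat_width(term_width: int, line_prefix: str) -> int:
    """Columns left for the diffstat after the prefix has been drawn."""
    return term_width - display_width(line_prefix)


# A red graph bar followed by a space: 10 characters, 2 visible columns.
prefix = "\x1b[31m|\x1b[m "
print(len(prefix), display_width(prefix), diffstat_width(80, prefix))
```

Subtracting `len(prefix)` instead would leave the diffstat eight columns
narrower than the terminal allows in this example.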

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: remove duplicate --stdin-packs definition

cd846bacc7 (pack-objects: introduce '--stdin-packs=follow', 2025-06-23)
added a new definition of the option --stdin-packs that accepts an
argument. It kept the old definition, which still shows up in the short
help, but is shadowed by the new one. Remove it.

Hinted-at-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: remove unnecessary variable shadow

Avoid redeclaring `entry` inside the conditional block, removing
unnecessary variable shadowing and improving code clarity without
changing behavior.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Acked-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git, help: fix memory leaks in alias listing

The list_aliases() function sets the util pointer of each list item to
a heap-allocated copy of the alias command value.  Two callers failed
to free these util pointers:

- list_cmds() in git.c collects a string list with STRING_LIST_INIT_DUP
   and clears it with string_list_clear(&list, 0), which frees the
   duplicated strings (strdup_strings=1) but not the util pointers.
   Pass free_util=1 to free them.

- list_cmds_by_config() in help.c calls string_list_sort_u(list, 0) to
   deduplicate the list before processing completion.commands overrides.
   When duplicate entries are removed, the util pointer of each discarded
   item is leaked because free_util=0.  Pass free_util=1 to free them.

Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

alias: treat empty subsection [alias ""] as plain [alias]

When git-config stores a key of the form alias..name, it records
it under an empty subsection ([alias ""]). The new subsection-aware
alias lookup would see a non-NULL but zero-length subsection and
fall into the subsection code path, where it required a "command"
key and thus silently ignored the entry.

Normalize an empty subsection to NULL before any further processing
so that entries stored this way continue to work as plain
case-insensitive aliases, matching the pre-subsection behaviour.

Users who relied on alias..name to create an alias literally named
".name" may want to migrate to subsection syntax, which looks less confusing:

[alias ".name"]
command = <value>

Add tests covering both the empty-subsection compatibility case and
the leading-dot alias via the new syntax.
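
The normalization can be sketched in Python as follows. This is an
illustration only, not the code in Git: `split_alias_key()` is a
hypothetical helper that works on a flattened key string and assumes the
variable name is whatever follows the last dot.

```python
from typing import Optional, Tuple


def split_alias_key(key: str) -> Tuple[Optional[str], str]:
    """Split "alias.<name>" or "alias.<subsection>.<name>" into its parts."""
    section, _, rest = key.partition(".")
    if section.lower() != "alias":
        raise ValueError(f"not an alias key: {key!r}")
    if "." not in rest:
        # Plain [alias] entry: no subsection at all.
        return None, rest
    subsection, _, name = rest.rpartition(".")
    # An empty subsection ([alias ""]) is normalized to "no subsection".
    return (subsection or None), name
```

With this sketch, "alias..name" yields `(None, "name")`, the same shape as
a plain "alias.name" entry, while "alias..name.command" yields
`(".name", "command")` for the subsection form.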

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix list continuation in alias subsection example

The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation marks (+)
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: add status.compareBranches config for multiple branch comparisons

Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.

Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination

Any other value is ignored and a warning is shown.

When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.

The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons

This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.

Example configuration:
[status]
compareBranches = @{upstream} @{push}
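
How such a value might be interpreted can be sketched in Python. This is
a rough model under stated assumptions, not the parser used by Git:
`parse_compare_branches()` is a hypothetical name, tokens are matched
exactly, and the wording of the warning is made up.

```python
import sys

SUPPORTED = ("@{upstream}", "@{push}")


def parse_compare_branches(value=None):
    """Return the comparisons requested by status.compareBranches."""
    if value is None:
        # Unset behaves like "@{upstream}", the backward-compatible default.
        return ["@{upstream}"]
    selected = []
    for token in value.split():
        if token in SUPPORTED:
            selected.append(token)
        else:
            # Unsupported values are ignored, with a warning.
            print(f"warning: ignoring unsupported value '{token}'",
                  file=sys.stderr)
    return selected
```

Calling `parse_compare_branches("@{upstream} @{push}")` returns both
comparisons in the order given.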

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refactor format_branch_comparison in preparation

Refactor format_branch_comparison function in preparation for showing
comparison with push remote tracking branch.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment: move "branch.autoSetupMerge" into `struct repo_config_values`

The config value `branch.autoSetupMerge` is parsed in
`git_default_branch_config()` and stored in the global variable
`git_branch_track`. This global variable can be overwritten
by another repository when multiple Git repos run in the same process.

Move this value into `struct repo_config_values` in the_repository to
retain current behaviours and move towards libifying Git.
Since the variable is no longer a global variable, it has been renamed to
`branch_track` in the struct `repo_config_values`.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment: stop using core.sparseCheckout globally

The config value `core.sparseCheckout` is parsed in
`git_default_core_config()` and stored globally in
`core_apply_sparse_checkout`. This could cause it to be overwritten
by another repository when different Git repositories run in the same
process.

Move the parsed value into `struct repo_config_values` in the_repository
to retain current behaviours and move towards libifying Git.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The 7th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ac/string-list-sort-u-and-tests'

Code clean-up using a new helper function introduced lately.

* ac/string-list-sort-u-and-tests:
sparse-checkout: use string_list_sort_u

Merge branch 'mc/tr2-process-ancestry-cleanup'

Add process ancestry data to trace2 on macOS to match what we
already do on Linux and Windows.  Also adjust the way the Windows
implementation reports this information to match the other two.

* mc/tr2-process-ancestry-cleanup:
  t0213: add trace2 cmd_ancestry tests
  test-tool: extend trace2 helper with 400ancestry
  trace2: emit cmd_ancestry data for Windows
  trace2: refactor Windows process ancestry trace2 event
  build: include procinfo.c impl for macOS
  trace2: add macOS process ancestry tracing

Merge branch 'ps/pack-concat-wo-backfill'

"git pack-objects --stdin-packs" with "--exclude-promisor-objects"
fetched objects that are promised, which was not wanted. This has
been fixed.

* ps/pack-concat-wo-backfill:
builtin/pack-objects: don't fetch objects when merging packs

Merge branch 'dk/complete-stash-import-export'

Command line completion (in contrib/) update.

* dk/complete-stash-import-export:
completion: add stash import, export

Merge branch 'jc/doc-cg-needswork'

A CodingGuidelines update.

* jc/doc-cg-needswork:
CodingGuidelines: document NEEDSWORK comments

Merge branch 'ds/revision-maximal-only'

"git rev-list" and friends learn "--maximal-only" to show only the
commits that are not reachable by other commits.

* ds/revision-maximal-only:
revision: add --maximal-only option
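
The selection described above can be modelled with a small Python sketch.
It assumes the history is given as a mapping from each commit to its
parents; `maximal_only()` is a hypothetical name, and the revision walk in
Git itself is likely organized very differently.

```python
def maximal_only(tips, parents):
    """Return the commits in tips that no other commit in tips can reach."""

    def ancestors(commit):
        # Collect every commit reachable from commit, excluding itself.
        seen, stack = set(), list(parents.get(commit, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(parents.get(current, ()))
        return seen

    reachable = set()
    for tip in tips:
        reachable |= ancestors(tip)
    return [tip for tip in tips if tip not in reachable]


# A is the root; B and C are children of A; D is a child of B.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"]}
print(maximal_only(["A", "B", "C", "D"], history))
```

Here A and B are dropped because D reaches both, leaving C and D.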

Merge branch 'cc/lop-filter-auto'

"auto filter" logic for large-object promisor remote.

* cc/lop-filter-auto:
  fetch-pack: wire up and enable auto filter logic
  promisor-remote: change promisor_remote_reply()'s signature
  promisor-remote: keep advertised filters in memory
  list-objects-filter-options: support 'auto' mode for --filter
  doc: fetch: document `--filter=<filter-spec>` option
  fetch: make filter_options local to cmd_fetch()
  clone: make filter_options local to cmd_clone()
  promisor-remote: allow a client to store fields
  promisor-remote: refactor initialising field lists

Merge branch 'pw/commit-msg-sample-hook'

Update the sample commit-msg hook to complain when a log message
contains, in the middle, material that mailinfo would consider the
end of the log message.

* pw/commit-msg-sample-hook:
templates: detect commit messages containing diffs
templates: add .gitattributes entry for sample hooks

Merge branch 'kh/doc-am-format-sendmail'

Doc update.

* kh/doc-am-format-sendmail:
doc: add caveat about round-tripping format-patch

Documentation/git-repo: capitalize format descriptions

The descriptions for the git-repo output formats are in lowercase.
Capitalize these descriptions, making them consistent with the rest of
the documentation.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-repo: replace 'NUL' with '_NUL_'

Replace all occurrences of "NUL" by "_NUL_" in git-repo.adoc, following the
convention used by other documentation files.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1901: adjust nul format output instead of expected value

The test 'keyvalue and nul format', as its description says, tests both
`keyvalue` and `nul` formats. These formats are similar, differing only in
their field separator (= in the former, LF in the latter) and their
record separator (LF in the former, NUL in the latter). This way, both
formats can be tested using the same expected output and only replacing
the separators in one of the output formats.

However, it is not desirable to have a NUL character in the files
compared by test_cmp because, if that assertion fails, diff will consider
them binary files and won't display the differences properly.

Adjust the output of `git repo structure --format=nul` in t1901 so that it
matches the --format=keyvalue output. Compare this output against the same
value expected from --format=keyvalue, without using files with NUL
characters in test_cmp.
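
The separator swap can be sketched in Python. The test itself is a shell
script, so this only illustrates the idea: `nul_to_keyvalue()` and the key
names below are hypothetical, and values are assumed to contain neither LF
nor NUL.

```python
def nul_to_keyvalue(data: bytes) -> bytes:
    """Rewrite "key LF value NUL" records as "key=value LF" records."""
    # Order matters: turn the LF field separators into "=" first, then
    # turn the NUL record terminators into LF.
    return data.replace(b"\n", b"=").replace(b"\0", b"\n")


nul_output = b"example.first\n3\0example.second\n10\0"
print(nul_to_keyvalue(nul_output).decode())
```

The converted output contains no NUL bytes, so a failing comparison can
still be shown as a readable text diff.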

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1900: rename t1900-repo to t1900-repo-info

Since the commit bbb2b93348 (builtin/repo: introduce structure subcommand,
2025-10-21), t1901 specifically tests git-repo-structure. Rename
t1900-repo to t1900-repo-info to clarify that it focuses solely on
the git-repo-info subcommand.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: rename struct field to repo_info_field

Change the name of the struct field to repo_info_field, making it
explicit that it is an internal data type of git-repo-info.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: replace get_value_fn_for_key by get_repo_info_field

Remove the function `get_value_fn_for_key`, which returns a function that
retrieves a value for a certain repo info key. Introduce `get_repo_info_field`
instead, which returns a struct field.

This refactor makes the structure of the function print_fields more consistent
with that of print_all_fields, improving its readability.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: rename repo_info_fields to repo_info_field

Rename repo_info_fields as repo_info_field, following the CodingGuidelines rule
for naming arrays in singular. Rename all the references to that array
accordingly.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

CodingGuidelines: instruct to name arrays in singular

Arrays should be named in the singular form, ensuring that when
accessing an element within an array (e.g. dog[0]) it's clear that
we're referring to an element instead of a collection.

Add a new rule to CodingGuidelines asking for arrays to be named in
singular instead of plural.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add GIT_REFERENCE_BACKEND to specify reference backend

Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support a URI input for the reference backend with a location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repository's $GIT_DIR. The inverse,
i.e. removal of the ref store, doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of the ref store is only used
when migrating between ref formats, and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: allow reference location in refstorage config

The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking a URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Not set up before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: receive and use the reference storage payload

An upcoming commit will add support for providing a URI via the
'extensions.refStorage' config. The URI will contain the reference
backend and a corresponding payload. The payload can then be used to
provide an alternate location for the reference backend.

To prepare for this, modify the existing backends to accept such an
argument when initializing via the 'init()' function. Both the files
and reftable backends will parse the information to be filesystem paths
to store references. Given that no callers pass any payload yet this is
essentially a no-op change for now.

To enable this, provide a 'refs_compute_filesystem_location()' function
which will parse the current 'gitdir' and the 'payload' to provide the
final reference directory and common reference directory (if working in
a linked worktree).

The documentation and tests will be added alongside the extension of the
config variable.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: move out stub modification to generic layer

When creating the reftable reference backend on disk, we create stubs to
ensure that the directory can be recognized as a Git repository. This is
done by calling `refs_create_refdir_stubs()`. Move this to the generic
layer as this is needed for all backends except the files
backend. In an upcoming commit where we introduce alternate reference
backend locations, we'll have to also create stubs in the $GIT_DIR
irrespective of the backend being used. This commit builds the base to
add that logic.

Similarly, move the logic for deletion of stubs to the generic layer.
The files backend recursively calls the remove function of the
'packed-backend'; skip calling the generic function there, since that
would try to delete stubs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: extract out `refs_create_refdir_stubs()`

For Git to recognize a directory as a Git directory, it requires the
directory to contain:

  1. 'HEAD' file
  2. 'objects/' directory
  3. 'refs/' directory

Here, #1 and #3 are part of the reference storage mechanism,
specifically the files backend. Since then, newer backends such as the
reftable backend have moved to using their own path ('reftable/') for
storing references. But to ensure Git still recognizes the directory as
a Git directory, we create stubs.

There are two locations where we create stubs:

- In 'refs/reftable-backend.c' when creating the reftable backend.
- In 'clone.c' before spawning transport helpers.

In a following commit, we'll add another instance. So instead of
repeating the code, let's extract out this code to
`refs_create_refdir_stubs()` and use it.
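
The role of the stubs can be illustrated with a simplified Python model.
It is not how Git performs the check: `looks_like_git_dir()` only tests
for the three entries listed above, `create_refdir_stubs()` is a
hypothetical counterpart of the C helper, and the placeholder written to
'HEAD' is an assumption made for the example.

```python
import os
import tempfile


def looks_like_git_dir(path: str) -> bool:
    """Simplified check: a HEAD file plus objects/ and refs/ directories."""
    return (
        os.path.isfile(os.path.join(path, "HEAD"))
        and os.path.isdir(os.path.join(path, "objects"))
        and os.path.isdir(os.path.join(path, "refs"))
    )


def create_refdir_stubs(path: str) -> None:
    """Create placeholder HEAD and refs/ entries (contents are made up)."""
    os.makedirs(os.path.join(path, "refs"), exist_ok=True)
    with open(os.path.join(path, "HEAD"), "w") as head:
        head.write("ref: refs/heads/.invalid\n")


with tempfile.TemporaryDirectory() as gitdir:
    os.mkdir(os.path.join(gitdir, "objects"))
    before = looks_like_git_dir(gitdir)   # no HEAD or refs/ yet
    create_refdir_stubs(gitdir)
    after = looks_like_git_dir(gitdir)    # stubs are now in place
    print(before, after)
```

A backend that keeps its data elsewhere only needs the stubs so that a
check of this kind keeps passing.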

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

setup: don't modify repo in `create_reference_database()`

The `create_reference_database()` function is used to create the
reference database during initialization of a repository. The function
calls `repo_set_ref_storage_format()` to set the repository's reference
format. This is an unexpected side-effect of the function. More so
because the function is only called in two locations:

  1. During git-init(1) where the value is propagated from the `struct
     repository_format repo_fmt` value.

  2. During git-clone(1) where the value is propagated from the
     `the_repository` value.

The former is valid, however the flow already calls
`repo_set_ref_storage_format()`, so this effort is simply duplicated.
The latter sets the existing value in `the_repository` back to itself.
While this is okay for now, introduction of more fields in
`repo_set_ref_storage_format()` would cause issues, especially
dynamically allocated strings, where we would free/allocate the same
string back into `the_repository`.

To avoid all this confusion, clean up the function to no longer take in
and set the repo's reference storage format.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: fix wrong evaluation order in URL trailing-slash trimming

If i == -1, evaluating url[i] is undefined behavior, so the bounds
check on i must come before the access.
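
The corrected order of the two checks can be mirrored in a short Python
sketch. The C code being fixed is not shown in this message, so the loop
below is an assumption about its shape, and `trim_trailing_slashes()` is a
hypothetical name. Python would not exhibit undefined behavior here (an
index of -1 refers to the last character), so the sketch only shows which
check comes first.

```python
def trim_trailing_slashes(url: str) -> str:
    """Drop every trailing "/" from url."""
    i = len(url) - 1
    # The bounds check is evaluated first; thanks to short-circuiting,
    # url[i] is never read once i has dropped below zero.
    while i >= 0 and url[i] == "/":
        i -= 1
    return url[: i + 1]


print(trim_trailing_slashes("https://example.com/repo.git//"))
```

An input made only of slashes drives `i` down to -1, which is the case
where the order of the two conditions matters.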

Signed-off-by: cuiweixie <cuiweixie@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: enable reachability bitmaps during MIDX compaction

Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.

Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.

As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.

In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: implement MIDX compaction

When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.

While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:

- Preserving the MIDX chain is impossible, since there is no way to
   write a MIDX layer that contains objects or packs found in an earlier
   MIDX layer already part of the chain. So callers would have to write
   an entirely new (non-incremental) MIDX containing only the compacted
   layers, discarding all other objects/packs from the MIDX.

- There is (currently) no way to write a MIDX layer outside of the MIDX
   chain to work around the above, such that the MIDX chain could be
   reassembled substituting the compacted layers with the MIDX that was
   written.

- The `--stdin-packs` command-line option does not allow us to specify
   the order of packs as they appear in the MIDX. Therefore, even if
   there were workarounds for the previous two challenges, any bitmaps
   belonging to layers which come after the compacted layer(s) would no
   longer be valid.

This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.

Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:

- Instead of calling `fill_packs_from_midx()`, we call a new function
   `fill_packs_from_midx_range()`, which walks backwards along the
   portion of the MIDX chain which we are compacting, and adds packs one
   layer at a time.

   In order to preserve the pseudo-pack order, the concatenated pack
   order is preserved, with the exception of preferred packs which are
   always added first.

- After adding entries from the set of packs in the compaction range,
   `compute_sorted_entries()` must adjust the `pack_int_id`'s for all
   objects added in each fanout layer to match their original
   `pack_int_id`'s (as opposed to the index at which each pack appears
   in `ctx.info`).

   Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
   here, as it unconditionally recurs through the `->base_midx`. Factor
   out a `_1()` variant that operates on a single layer, reimplement
   the existing function in terms of it, and use the new variant from
   `midx_fanout_add_compact()`.

   Since we are sorting the list of objects ourselves, the order we add
   them in does not matter.

- When writing out the new 'multi-pack-index-chain' file, discard any
   layers in the compaction range, replacing them with the newly written
   layer, instead of keeping them and placing the new layer at the end
   of the chain.

This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.
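
The chain rewrite in the last step can be sketched in Python, treating
the chain as a list of layer identifiers ordered from base to tip. The
function name, the inclusive index convention, and the layer names are
assumptions made for the example.

```python
from typing import List


def compact_chain(chain: List[str], first: int, last: int,
                  new_layer: str) -> List[str]:
    """Replace chain[first..last] (inclusive) with a single new layer."""
    if not 0 <= first <= last < len(chain):
        raise ValueError("compaction range is outside the chain")
    # Layers below and above the range keep their relative positions.
    return chain[:first] + [new_layer] + chain[last + 1:]


chain = ["layer-a", "layer-b", "layer-c", "layer-d"]
print(compact_chain(chain, 1, 2, "layer-bc"))
```

The new layer takes the place of the compacted range instead of being
appended, so "layer-d" remains the tip of the chain.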

The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved regardless
of whether or not they are sorted in lexicographic order in the original
MIDX chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/helper/test-read-midx.c: plug memory leak when selecting layer

Though our 'read-midx' test tool is capable of printing information
about a single MIDX layer identified by its checksum, no caller in our
test suite exercises this path.

Unfortunately, there is a memory leak lurking in this (currently) unused
path that would otherwise be exposed by the following commit.

This occurs when providing a MIDX layer checksum other than the tip. As
we walk over the MIDX chain trying to find the matching layer, we drop
our reference to the top-most MIDX layer. Thus, our call to
'close_midx()' later on leaks memory between the top-most MIDX layer and
the MIDX layer immediately following the specified one.

Plug this leak by holding a reference to the tip of the MIDX chain, and
ensure that we call `close_midx()` before terminating the test tool.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: factor fanout layering from `compute_sorted_entries()`

When computing the set of objects to appear in a MIDX, we use
compute_sorted_entries(), which handles objects from various existing
sources one fanout layer at a time.

The process for computing this set is slightly different during MIDX
compaction, so factor out the existing functionality into its own
routine to prevent `compute_sorted_entries()` from becoming too
difficult to read.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: enumerate `pack_int_id` values directly

Our `midx-write.c::fill_packs_from_midx()` function currently enumerates
the range [0, m->num_packs), and then shifts its index variable up by
`m->num_packs_in_base` to produce a valid `pack_int_id`.

Instead, directly enumerate the range:

[m->num_packs_in_base, m->num_packs_in_base + m->num_packs)

, which are the original pack_int_ids themselves as opposed to the
indexes of those packs relative to the MIDX layer they are contained
within.
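
That the two enumerations produce the same values can be checked with a
short Python sketch. The function names are hypothetical; the parameters
stand in for `m->num_packs_in_base` and `m->num_packs`.

```python
from typing import List


def pack_int_ids_shifted(num_packs_in_base: int, num_packs: int) -> List[int]:
    """Old style: walk [0, num_packs) and shift each index up."""
    return [i + num_packs_in_base for i in range(num_packs)]


def pack_int_ids_direct(num_packs_in_base: int, num_packs: int) -> List[int]:
    """New style: walk the pack_int_id range itself."""
    return list(range(num_packs_in_base, num_packs_in_base + num_packs))


print(pack_int_ids_shifted(3, 3), pack_int_ids_direct(3, 3))
```

Both calls yield the IDs 3, 4 and 5 for a layer of three packs stacked on
a base of three.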

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: extract `fill_pack_from_midx()`

When filling packs from an existing MIDX, `fill_packs_from_midx()`
handles preparing a MIDX'd pack, and reading out its pack name from the
existing MIDX.

MIDX compaction will want to perform an identical operation, though the
caller will look quite different than `fill_packs_from_midx()`. To
reduce any future code duplication, extract `fill_pack_from_midx()`
from `fill_packs_from_midx()` to prepare to call our new helper function
in a future change.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: introduce `midx_pack_perm()` helper

The `ctx->pack_perm` array can be considered as a permutation between
the original `pack_int_id` of some given pack to its position in the
`ctx->info` array containing all packs.

Today we can always index into this array with any known `pack_int_id`,
since there is never a `pack_int_id` which is greater than or equal to
the value `ctx->nr`.

That is not necessarily the case with MIDX compaction. For example,
suppose we have a MIDX chain with three layers, each containing three
packs. The base of the MIDX chain will have packs with IDs 0, 1, and 2,
the next layer 3, 4, and 5, and so on. If we are compacting the topmost
two layers, we'll have input `pack_int_id` values between [3, 8], but
`ctx->nr` will only be 6.

In that example, if we want to know where the pack whose original
`pack_int_id` value was, say, 7 ended up, we would compute
`ctx->pack_perm[7]`, leading to an uninitialized read, since there are
only 6 entries allocated in that array.

To address this, there are a couple of options:

  - We could allocate enough entries in `ctx->pack_perm` to accommodate
    the largest `orig_pack_int_id` value.

  - Or, we could internally shift the input values by the number of packs
    in the base layer of the lower end of the MIDX compaction range.

This patch prepares us to take the latter approach, since it does not
allocate more memory than strictly necessary. (In our above example, the
base of the lower end of the compaction range is the first MIDX layer
(having three packs), so we would end up indexing `ctx->pack_perm[7-3]`,
which is a valid read.)

Note that this patch does not actually implement that approach yet, but
merely performs a behavior-preserving refactoring which will make the
change easier to carry out in the future.
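
As a rough sketch of what the shifted lookup could eventually look like
(this is not the helper the patch adds): `struct write_ctx`, its
`num_packs_in_base` field, and `pack_perm_lookup()` are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, pared-down write context; not Git's real struct. */
struct write_ctx {
	uint32_t *pack_perm;        /* one entry per pack being written */
	uint32_t nr;                /* number of entries in pack_perm */
	uint32_t num_packs_in_base; /* packs below the compaction range */
};

/*
 * Look up where the pack with the given original pack_int_id landed,
 * shifting the ID so that it is relative to the compaction range.
 */
uint32_t pack_perm_lookup(const struct write_ctx *ctx,
			  uint32_t orig_pack_int_id)
{
	uint32_t idx;

	assert(orig_pack_int_id >= ctx->num_packs_in_base);
	idx = orig_pack_int_id - ctx->num_packs_in_base;
	assert(idx < ctx->nr);
	return ctx->pack_perm[idx];
}
```

With six entries and a three-pack base, looking up ID 7 reads entry 4,
matching the `ctx->pack_perm[7-3]` example above.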

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: do not require packs to be sorted in lexicographic order

The MIDX file format currently requires that pack files be identified by
the lexicographic ordering of their names (that is, a pack having a
checksum beginning with "abc" would have a numeric pack_int_id which is
smaller than the same value for a pack beginning with "bcd").

As a result, it is impossible to combine adjacent MIDX layers together
without permuting bits from bitmaps that are in more recent layer(s).

To see why, consider the following example:

          | packs       | preferred pack
  --------+-------------+---------------
  MIDX #0 | { X, Y, Z } | Y
  MIDX #1 | { A, B, C } | B
  MIDX #2 | { D, E, F } | D

, where MIDX #2's base MIDX is MIDX #1, and so on. Suppose that we want
to combine MIDX layers #0 and #1, to create a new layer #0' containing
the packs from both layers. With the original three MIDX layers, objects
are laid out in the bitmap in the order they appear in their source
pack, and the packs themselves are arranged according to the pseudo-pack
order. In this case, that ordering is Y, X, Z, B, A, C.

But recall that the pseudo-pack ordering is defined by the order that
packs appear in the MIDX, with the exception of the preferred pack,
which sorts ahead of all other packs regardless of its position within
the MIDX. In the above example, that means that pack 'Y' could be placed
anywhere (so long as it is designated as preferred), however, all other
packs must be placed in the location listed above.

Because that ordering isn't sorted lexicographically, it is impossible
to compact MIDX layers in the above configuration without permuting the
object-to-bit-position mapping. Changing this mapping would affect all
bitmaps belonging to newer layers, rendering the bitmaps associated with
MIDX #2 unreadable.

One of the goals of MIDX compaction is that we are able to shrink the
length of the MIDX chain *without* invalidating bitmaps that belong to
newer layers, and the lexicographic ordering constraint is at odds with
this goal.

However, packs do not *need* to be lexicographically ordered within the
MIDX. As far as I can gather, the only reason they are sorted lexically
is to make it possible to perform a binary search over the pack names in
a MIDX, necessary to make `midx_contains_pack()`'s performance
logarithmic in the number of packs rather than linear.

Relax this constraint by allowing MIDX writes to proceed with packs that
are not arranged in lexicographic order. `midx_contains_pack()` will
lazily instantiate a `pack_names_sorted` array on the MIDX, which will
be used to implement the binary search over pack names.
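
The lazy sorted-names idea can be sketched as follows. This is a
simplified illustration, not the actual implementation: `struct
simple_midx` and `simple_midx_contains_pack()` are hypothetical, and
only the `pack_names_sorted` name comes from the message above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified MIDX: pack names in on-disk (unsorted) order. */
struct simple_midx {
	const char **pack_names;
	size_t num_packs;
	const char **pack_names_sorted; /* built on first lookup */
};

static int cmp_names(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Return 1 if `name` is one of the MIDX's packs, 0 otherwise. */
int simple_midx_contains_pack(struct simple_midx *m, const char *name)
{
	if (!m->num_packs)
		return 0;
	if (!m->pack_names_sorted) {
		/* Sort a copy so that the on-disk order stays untouched. */
		m->pack_names_sorted =
			malloc(m->num_packs * sizeof(*m->pack_names_sorted));
		assert(m->pack_names_sorted);
		memcpy(m->pack_names_sorted, m->pack_names,
		       m->num_packs * sizeof(*m->pack_names_sorted));
		qsort(m->pack_names_sorted, m->num_packs,
		      sizeof(*m->pack_names_sorted), cmp_names);
	}
	return bsearch(&name, m->pack_names_sorted, m->num_packs,
		       sizeof(*m->pack_names_sorted), cmp_names) != NULL;
}
```

The sort cost is paid once, on the first lookup; later lookups stay
logarithmic in the number of packs.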

This change produces MIDXs which may not be correctly read with external
tools or older versions of Git. Though older versions of Git know how to
gracefully degrade and ignore any MIDX(s) they consider corrupt,
external tools may not be as robust. To avoid unintentionally breaking
any such tools, guard this change behind a version bump in the MIDX's
on-disk format.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: introduce `struct write_midx_opts`

In the MIDX writing code, there are four functions which perform some
sort of MIDX write operation. They are:

- write_midx_file()
- write_midx_file_only()
- expire_midx_packs()
- midx_repack()

All of these functions are thin wrappers over `write_midx_internal()`,
which implements the bulk of these routines. As a result, the
`write_midx_internal()` function takes six arguments.

Future commits in this series will want to add additional arguments, and
in general this function's signature will be the union of parameters
among *all* possible ways to write a MIDX.

Instead of adding yet more arguments to this function to support MIDX
compaction, introduce a `struct write_midx_opts`, which has the same
struct members as `write_midx_internal()`'s arguments.

Adding additional fields to the `write_midx_opts` struct is preferable
to adding additional arguments to `write_midx_internal()`. This is
because the callers below all zero-initialize the struct, so each time
we add a new piece of information, we do not have to pass the zero value
for it in all other call-sites that do not care about it.
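
To illustrate why zero-initialized option structs scale better than
positional arguments, here is a small sketch. All names in it
(`struct write_opts`, its fields, `write_internal()`, `write_plain()`)
are made up for the example and do not mirror the real members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option struct; field names are illustrative only. */
struct write_opts {
	const char *preferred_pack_name;
	const char *refs_snapshot;
	unsigned flags;
	int compact; /* imagine this field was added later */
};

/* Stand-in for the shared routine: reports which mode was requested. */
int write_internal(const struct write_opts *opts)
{
	assert(opts);
	return opts->compact ? 2 : 1;
}

/* An existing caller: it never mentions `compact`, so it stays zero. */
int write_plain(const char *preferred, unsigned flags)
{
	struct write_opts opts = {
		.preferred_pack_name = preferred,
		.flags = flags,
	};
	return write_internal(&opts);
}
```

Adding the `compact` field required no change to `write_plain()`, since
members left out of the initializer are set to zero.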

For now, no functional changes are included in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: don't use `pack_perm` when assigning `bitmap_pos`

In midx_pack_order(), we compute for each bitmapped pack the first bit
to correspond to an object in that pack, along with how many bits were
assigned to object(s) in that pack.

Initially, each bitmap_nr value is set to zero, and each bitmap_pos
value is set to the sentinel BITMAP_POS_UNKNOWN. This is done to ensure
that there are no packs who have an unknown bit position but a somehow
non-zero number of objects (cf. `write_midx_bitmapped_packs()` in
midx-write.c).

Once the pack order is fully determined, midx_pack_order() sets the
bitmap_pos field for any bitmapped packs to zero if they are still
listed as BITMAP_POS_UNKNOWN.

However, we enumerate the bitmapped packs in order of `ctx->pack_perm`.
This is fine for existing cases, since the only time the
`ctx->pack_perm` array holds a value outside of the addressable range of
`ctx->info` is when there are expired packs, which only occurs via 'git
multi-pack-index expire', which does not support writing MIDX bitmaps.
As a result, the range of ctx->pack_perm covers all values in [0,
`ctx->nr`), so enumerating in this order isn't an issue.

A future change necessary for compaction will complicate this further by
introducing a wrapper around the `ctx->pack_perm` array, which turns the
given `pack_int_id` into one that is relative to the lower end of the
compaction range. As a result, indexing into `ctx->pack_perm` through
this helper, say, with "0" will produce a crash when the lower end of
the compaction range has >0 pack(s) in its base layer, since the
subtraction will wrap around the 32-bit unsigned range, resulting in an
uninitialized read.
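
The wrap-around is easy to demonstrate in isolation. The sketch below
uses hypothetical helper names and assumes the IDs are `uint32_t`, as
the "32-bit unsigned range" wording above suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a pack_int_id so that it is relative to the compaction range. */
uint32_t relative_pack_id(uint32_t pack_int_id, uint32_t num_packs_in_base)
{
	/* Unsigned arithmetic: wraps around if the ID is below the base. */
	return pack_int_id - num_packs_in_base;
}

/* Same shift, but refuse IDs that lie below the compaction range. */
uint32_t relative_pack_id_checked(uint32_t pack_int_id,
				  uint32_t num_packs_in_base)
{
	assert(pack_int_id >= num_packs_in_base);
	return pack_int_id - num_packs_in_base;
}
```

With three packs in the base, passing "0" to the unchecked variant gives
4294967293 rather than a small index, which is why indexing with the
result reads far outside the array.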

But the process is completely unnecessary in the first place: we are
enumerating all values of `ctx->info`, and there is no reason to process
them in a different order than they appear in memory. Index `ctx->info`
directly to reflect that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/t5319-multi-pack-index.sh: fix copy-and-paste error in t5319.39

Commit d4bf1d88b90 (multi-pack-index: verify missing pack, 2018-09-13)
adds a new test to the MIDX test script to test how we handle missing
packs.

While the commit itself describes the test as "verify missing pack[s]",
the test itself is actually called "verify packnames out of order",
despite that not being what it tests.

Likely this was a copy-and-paste of the test immediately above it of the
same name. Correct this by renaming the test to match the commit
message.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h'

Since c39fffc1c90 (tests: start asserting that *.txt SYNOPSIS matches -h
output, 2022-10-13), the manual page for 'git multi-pack-index' has a
SYNOPSIS section which differs from 'git multi-pack-index -h'.

Correct this while also documenting additional options accepted by the
'write' sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-multi-pack-index(1): remove non-existent incompatibility

Since fcb2205b774 (midx: implement support for writing incremental MIDX
chains, 2024-08-06), the command-line options '--incremental' and
'--bitmap' were declared to be incompatible with one another when
running 'git multi-pack-index write'.

However, since 27afc272c49 (midx: implement writing incremental MIDX
bitmaps, 2025-03-20), that incompatibility no longer exists, despite the
documentation saying so. Correct this by removing the stale reference to
their incompatibility.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/multi-pack-index.c: make '--progress' a common option

All multi-pack-index sub-commands (write, verify, repack, and expire)
support a '--progress' command-line option, despite not listing it as
one of the common options in `common_opts`.

As a result each sub-command declares its own `OPT_BIT()` for a
"--progress" command-line option. Centralize this within the
`common_opts` to avoid re-declaring it in each sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: introduce `midx_get_checksum_hex()`

When trying to print out, say, the hexadecimal representation of a
MIDX's hash, our code will do something like:

    hash_to_hex_algop(midx_get_checksum_hash(m),
                      m->source->odb->repo->hash_algo);

, which is both cumbersome and repetitive. In fact, all but a handful of
callers to `midx_get_checksum_hash()` do exactly the above. Reduce the
repetition by introducing `midx_get_checksum_hex()`, which returns a
pointer into a static buffer containing the above result.

For the handful of callers that do need to compare the raw bytes and
don't want to deal with an encoded copy (e.g., because they are passing
it to hasheq() or similar), they may still rely on
`midx_get_checksum_hash()` which returns the raw bytes.
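
For readers unfamiliar with the "pointer into a static buffer" pattern,
here is a self-contained sketch. `raw_hash_to_hex()` and `MAX_RAW_HASH`
are hypothetical names, and the single buffer is a simplification of
this example, not a description of how Git's own helper manages its
buffers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RAW_HASH 32 /* enough for SHA-256; an assumption of this sketch */

/*
 * Hex-encode `len` raw bytes into a static buffer and return it.
 * The result is overwritten by the next call, so callers that need
 * to keep it must copy it.
 */
const char *raw_hash_to_hex(const unsigned char *hash, size_t len)
{
	static char buf[2 * MAX_RAW_HASH + 1];
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(len <= MAX_RAW_HASH);
	for (i = 0; i < len; i++) {
		buf[2 * i] = digits[hash[i] >> 4];
		buf[2 * i + 1] = digits[hash[i] & 0xf];
	}
	buf[2 * len] = '\0';
	return buf;
}
```

The convenience comes with a lifetime caveat: callers comparing raw
bytes should keep using the raw accessor, and callers holding on to the
hex string across calls need their own copy.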

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: rename `get_midx_checksum()` to `midx_get_checksum_hash()`

Since 541204aabea (Documentation: document naming schema for structs and
their functions, 2024-07-30), we have adopted a naming convention for
functions that would prefer a name like, say, `midx_get_checksum()` over
`get_midx_checksum()`.

Adopt this convention throughout the midx.h API. Since this function
returns a raw (that is, non-hex encoded) hash, let's suffix the function
with "_hash()" to make this clear. As a side effect, this prepares us
for the subsequent change which will introduce a "_hex()" variant that
encodes the checksum itself.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: mark `get_midx_checksum()` arguments as const

To make clear that the function `get_midx_checksum()` does not do
anything to modify its argument, mark the MIDX pointer as const.

The following commit will rename this function altogether to make clear
that it returns the raw bytes of the checksum, not a hex-encoded copy of
it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

build: regenerate config-list.h when Documentation changes

The Meson-based build doesn't know when to rebuild config-list.h, so the
header is sometimes stale.

For example, an old build directory might have config-list.h from before
4173df5187 (submodule: introduce extensions.submodulePathConfig,
2026-01-12), which added submodule.<name>.gitdir to the list. Without
it, t9902-completion.sh fails. Regenerating the config-list.h artifact
from sources fixes the artifact and the test.

Since Meson does not have (or want) builtin support for globbing like
Make, teach generate-configlist.sh to also generate a list of
Documentation files its output depends on, and incorporate that into the
Meson build. We honor the undocumented GCC/Clang contract of outputting
empty targets for all the dependencies (like they do with -MP). That is,
generate lines like

    build/config-list.h: $SOURCE_DIR/Documentation/config.adoc
    $SOURCE_DIR/Documentation/config.adoc:

We assume that if a user adds a new file under
Documentation/config then they will also edit one of the existing files
to include that new file, and that will trigger a rebuild. Also mark the
generator script as a dependency.

While we're at it, teach the Makefile to use the same "the script knows
its dependencies" logic.

For Meson, combining the following commands helps debug dependencies:

    ninja -C <builddir> -t deps config-list.h
    ninja -C <builddir> -t browse config-list.h

The former lists all the dependencies discovered from our output ".d"
file (the config documentation) and the latter shows the dependency on
the script itself, among other useful edges in the dependency graph.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jh/alias-i18n' into jh/alias-i18n-fixes

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing

builtin/maintenance: use "geometric" strategy by default

The git-gc(1) command was introduced in the early days of Git in
30f610b7b0 (Create 'git gc' to perform common maintenance operations.,
2006-12-27) as the main repository maintenance utility. And while the
tool has of course evolved since then to cover new parts, the basic
strategy it uses has never really changed much.

It is safe to say that since 2006 the Git ecosystem has changed quite a
bit. Repositories tend to be much larger nowadays than they were
almost 20 years ago, and large parts of the industry went crazy for
monorepos (for various wildly different definitions of "monorepo"). So
the maintenance strategy we used back then may not be the best fit
nowadays anymore.

Arguably, most of the maintenance tasks that git-gc(1) does are still
perfectly fine today: repacking references, expiring various data
structures and things like that tend to not cause huge problems. But the big
exception is the way we repack objects.

git-gc(1) by default uses a split strategy: it performs incremental
repacks by default, and then whenever we have too many packs we perform
a large all-into-one repack. This all-into-one repack is what is causing
problems nowadays, as it is an operation that is quite expensive. While
it is wasteful in small- and medium-sized repositories, in large repos
it may even be prohibitively expensive.

We eventually introduced git-maintenance(1), which was slated as a
replacement for git-gc(1). In contrast to git-gc(1), it is much more
flexible as it is structured around configurable tasks and strategies.
So while its default "gc" strategy still uses git-gc(1) under the hood,
it allows us to iterate.

A second strategy it knows about is the "incremental" strategy, which we
configure when registering a repository for scheduled maintenance. This
strategy isn't really a full replacement for git-gc(1) though, as it
doesn't know to expire unused data structures. In Git 2.52 we have thus
introduced a new "geometric" strategy that is a proper replacement for
the old git-gc(1).

In contrast to the incremental/all-into-one split used by git-gc(1), the
new "geometric" strategy maintains a geometric progression of packfiles,
which significantly reduces the number of all-into-one repacks that we
have to perform in large repositories. It is thus a much better fit for
large repositories than git-gc(1).
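
To make "geometric progression of packfiles" concrete, the sketch below
checks whether a list of per-pack object counts already forms one. It
assumes the property documented for `git repack --geometric=<factor>`
(each successive pack holds at least `factor` times as many objects as
the previous one); `is_geometric()` is a hypothetical helper and does
not reflect how Git decides which packs to combine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * `counts` holds the object count of each pack, sorted ascending.
 * Return 1 if every pack has at least `factor` times as many objects
 * as the next-smaller pack, 0 otherwise.
 */
int is_geometric(const unsigned long *counts, size_t n, unsigned long factor)
{
	size_t i;

	assert(factor > 0);
	for (i = 1; i < n; i++) {
		/* Same test as counts[i] < factor * counts[i - 1], without overflow. */
		if (counts[i - 1] > counts[i] / factor)
			return 0;
	}
	return 1;
}
```

When the progression holds, nothing needs to be rewritten; only packs
that break it are candidates for being combined, which is what keeps
all-into-one repacks rare.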

Note that the "geometric" strategy isn't perfect though: while we
perform far fewer all-into-one repacks compared to git-gc(1), we still
have to perform them eventually. But for the largest repositories out
there this may not be an option either, as client machines might not be
powerful enough to perform such a repack in the first place. These cases
would thus still be covered by the "incremental" strategy.

Switch the default strategy away from "gc" to "geometric", but retain
the "incremental" strategy configured when registering background
maintenance with `git maintenance register`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t7900: prepare for switch of the default strategy

The t7900 test suite is exercising git-maintenance(1) and is thus of
course heavily reliant on the exact maintenance strategy. This reliance
comes in two flavors:

  - One test explicitly wants to verify that git-gc(1) is run as part of
    `git maintenance run`. This test is adapted by explicitly picking the
    "gc" strategy.

  - The other tests assume a specific shape of the object database,
    which is dependent on whether or not we run auto-maintenance before
    we come to the actual subject under test. These tests are adapted by
    disabling auto-maintenance.

With these changes t7900 passes with both "gc" and "geometric" default
strategies.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t6500: explicitly use "gc" strategy

The test in t6500 explicitly wants to exercise git-gc(1) and is thus
highly specific to the actual on-disk state of the repository and
specifically of the object database. An upcoming change modifies the
default maintenance strategy to be the "geometric" strategy though,
which breaks a couple of assumptions.

One fix would arguably be to disable auto-maintenance altogether, as we
do want to explicitly verify git-gc(1) anyway. But as the whole test
suite is about git-gc(1) in the first place it feels more sensible to
configure the default maintenance strategy to be "gc".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5510: explicitly use "gc" strategy

One of the tests in t5510 wants to verify that auto-gc does not lock up
when fetching into a repository. Adapt it to explicitly pick the "gc"
strategy for auto-maintenance.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5400: explicitly use "gc" strategy

In t5400 we verify that git-receive-pack(1) runs automated repository
maintenance in the remote repository. The check is performed indirectly
by observing an effect that git-gc(1) would have, namely to prune a
temporary object from the object database. In a subsequent commit we're
about to switch to the "geometric" strategy by default though, and here
we stop observing that effect.

Adapt the test to explicitly use the "gc" strategy to prepare for that
upcoming change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t34xx: don't expire reflogs where it matters

We have a couple of tests in the t34xx range that rely on reflogs. This
never really used to be a problem, but in a subsequent commit we will
change the default maintenance strategy from "gc" to "geometric", and
this will cause us to drop all reflogs in these tests.

This may seem surprising and like a bug at first, but it's actually not.
The main difference between these two strategies is that the "gc"
strategy will skip all maintenance in case the object database is in a
well-optimized state. The "geometric" strategy has separate subtasks
though, and the conditions for each of these tasks are evaluated on a
case by case basis. This means that even if the object database is in
good shape, we may still decide to expire reflogs.

So why is that a problem? The issue is that Git's test suite hardcodes
the committer and author dates to a date in 2005. Interestingly though,
these hardcoded dates not only impact the commits, but also the reflog
entries. The consequence is that all newly written reflog entries are
immediately considered stale as our reflog expiration threshold is in
the range of weeks, only. It follows that executing `git reflog expire`
will thus immediately purge all reflog entries.

This hasn't been a problem in our test suite by pure chance, as the
repository shapes simply didn't cause us to perform actual garbage
collection. But with the upcoming "geometric" strategy we _will_ start
to execute `git reflog expire`, thus surfacing this issue.

Prepare for this by explicitly disabling reflog expiration in tests
impacted by this upcoming change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: disable maintenance where we verify object database structure

We have a couple of tests that explicitly verify the structure of the
object database. Naturally, this structure is dependent on whether or
not we run repository maintenance: if it decides to optimize the object
database the expected structure is likely to not materialize.

Explicitly disable auto-maintenance in such tests so that we are not
dependent on decisions made by our maintenance.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: fix races caused by background maintenance

Many Git commands spawn git-maintenance(1) to optimize the repository in
the background. By default, performing the maintenance is for the most
part asynchronous: we fork the executable and then continue with the
rest of our business logic.

This is working as expected for our users, but this behaviour is
somewhat problematic for our test suite as this is inherently racy. We
have many tests that verify the on-disk state of repositories, and those
tests may easily race with our background maintenance. In a similar
fashion, we may end up with processes that "leak" out of a current test
case.

Until now this has tended not to be much of a problem. Our maintenance uses
git-gc(1) by default, which knows to bail out in case there aren't
either too many packfiles or too many loose objects. So even if other
data structures would need to be optimized, we won't do so unless the
object database also needs optimizations.

This is about to change though, as a subsequent commit will switch to
the "geometric" maintenance strategy as a default. The consequence is
that we will run required optimizations even if the object database is
well-optimized. And this uncovers races between our test suite and
background maintenance all over the place.

Disabling maintenance outright in our test suite is not really an
option, as it would result in significant divergence from the "real
world" and reduce our test coverage. But we've got an alternative up our
sleeves: we can ensure that garbage collection runs synchronously by
overriding the "maintenance.autoDetach" configuration.

Of course that also diverges from the real world, as we now stop testing
that background maintenance interacts in a benign way with normal Git
commands. But on the other hand this ensures that the maintenance itself
does not for example lead to data loss in a more reproducible way.

Another concern is that this would make execution of the test suite much
slower. But a quick benchmark on my machine demonstrates that this does
not seem to be the case:

    Benchmark 1: meson test (revision = HEAD~)
      Time (mean ± σ):     131.182 s ±  1.293 s    [User: 853.737 s, System: 1160.479 s]
      Range (min … max):   130.001 s … 132.563 s    3 runs

    Benchmark 2: meson test (revision = HEAD)
      Time (mean ± σ):     129.554 s ±  0.507 s    [User: 849.040 s, System: 1152.664 s]
      Range (min … max):   129.000 s … 129.994 s    3 runs

    Summary
      meson test (revision = HEAD) ran
        1.01 ± 0.01 times faster than meson test (revision = HEAD~)

Funnily enough, it even seems as if this speeds up test execution ever so
slightly, but that may just as well be noise.

Introduce a new `GIT_TEST_MAINT_AUTO_DETACH` environment variable that
allows us to override the auto-detach behaviour and set that variable in
our tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diffcore-break: avoid segfault with freed entries

After we have freed the file pair, we should set the queue reference to
NULL. When computing a diff in a partial clone, there is a chance that
we could trigger a prefetch of missing objects while there are freed
entries in the global diff queue due to break-rewrites detection. The
segfault only occurs if an entry has been freed by break-rewrites and
there is an entry to be prefetched.

There is a new test in t4067 that triggers the segmentation fault that
results in this case. The test explicitly fetches the blobs necessary to
trigger break-rewrites detection, while leaving other blobs to be
prefetched.

The fix is to set the queue pointer to NULL after it is freed; the
prefetch will skip NULL entries.
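
The free-then-clear pattern can be sketched in isolation as follows. The
`struct pair` and `struct queue` types and both functions are
hypothetical simplifications; Git's real diff queue and prefetch logic
carry much more state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal queue; Git's real diff queue carries much more. */
struct pair {
	int needs_prefetch;
};

struct queue {
	struct pair **items;
	size_t nr;
};

/* Free entry `i` and clear the slot so later passes can tell it is gone. */
void queue_drop(struct queue *q, size_t i)
{
	assert(i < q->nr);
	free(q->items[i]);
	q->items[i] = NULL;
}

/* Count the entries that need prefetching, skipping dropped slots. */
size_t queue_count_prefetch(const struct queue *q)
{
	size_t i, n = 0;

	for (i = 0; i < q->nr; i++) {
		if (!q->items[i])
			continue; /* freed earlier; do not dereference */
		if (q->items[i]->needs_prefetch)
			n++;
	}
	return n;
}
```

Without the `NULL` assignment in `queue_drop()`, the later pass would
dereference a dangling pointer.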

Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: diff-options.adoc: show format.noprefix for format-patch

git-format-patch(1) uses `format.noprefix` and ignores `diff.noprefix`.

The configuration variable `format.noprefix` was added as an “escape
hatch”, and “it’s unlikely that anybody really wants
format.noprefix=true in the first place.”[1] Based on that there doesn’t
seem to be a need to widely advertise this configuration variable.

But in any case: the documentation for this option should not claim
that it overrides a config that is always ignored.

† 1: 8d5213de (format-patch: add format.noprefix option, 2023-03-09)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

format-patch: make format.noprefix a boolean

The config `format.noprefix` was added in 8d5213de (format-patch: add
format.noprefix option, 2023-03-09) to support no-prefix on paths.
That was immediately after making git-format-patch(1) not respect
`diff.noprefix`.[1]

The intent was to mirror `diff.noprefix`. But this config was
unintentionally[2] implemented by enabling no-prefix if any kind of
value is set.

† 1: c169af8f (format-patch: do not respect diff.noprefix, 2023-03-09)
† 2: https://lore.kernel.org/all/20260211073553.GA1867915@coredump.intra.peff.net/

Let’s indeed mirror `diff.noprefix` by treating it as a boolean.

This is a breaking change. And as far as breaking changes go it is
pretty benign:

• The documentation claims that this config is equivalent to
  `diff.noprefix`; this is just a bug fix if the documentation is
  what defines the application interface
• Only users with non-boolean values will run into problems when we
  try to parse it as a boolean. But what would (1) make them suspect
  they could do that in the first place, and (2) have motivated them to
  do it?
• Users who have set this to `false` and expect that to mean *enable
  format.noprefix* (current behavior) will now have the opposite
  experience. Which is not a reasonable setup.

Let’s only offer a breaking change fig leaf by advising about the
previous behavior before dying.
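
The difference between "enabled if any value is set" and "parsed as a
boolean" can be sketched like this. Both functions are hypothetical, and
the accepted spellings are a simplified assumption rather than a copy of
Git's boolean parser.

```c
#include <assert.h>
#include <ctype.h>

/* Case-insensitive string equality. */
static int ieq(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return *a == *b;
}

/* Old behavior: any value at all, even "false", turns the option on. */
int noprefix_if_set(const char *value)
{
	assert(value);
	return 1;
}

/* New behavior: 1 for true, 0 for false, -1 if not a boolean. */
int noprefix_as_bool(const char *value)
{
	assert(value);
	if (ieq(value, "true") || ieq(value, "yes") ||
	    ieq(value, "on") || ieq(value, "1"))
		return 1;
	if (ieq(value, "false") || ieq(value, "no") ||
	    ieq(value, "off") || ieq(value, "0"))
		return 0;
	return -1;
}
```

A value of "false" is where the two disagree, and a value such as
"maybe" is the case in which the new code would advise and die.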

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/object-info-bits-cleanup' into ps/odb-sources

* ps/object-info-bits-cleanup:
  odb: convert `odb_has_object()` flags into an enum
  odb: convert object info flags into an enum
  odb: drop gaps in object info flag values
  builtin/fsck: fix flags passed to `odb_has_object()`
  builtin/backfill: fix flags passed to `odb_has_object()`

Merge branch 'ps/odb-for-each-object' into ps/odb-sources

* ps/odb-for-each-object:
  odb: drop unused `for_each_{loose,packed}_object()` functions
  reachable: convert to use `odb_for_each_object()`
  builtin/pack-objects: use `packfile_store_for_each_object()`
  odb: introduce mtime fields for object info requests
  treewide: drop uses of `for_each_{loose,packed}_object()`
  treewide: enumerate promisor objects via `odb_for_each_object()`
  builtin/fsck: refactor to use `odb_for_each_object()`
  odb: introduce `odb_for_each_object()`
  packfile: introduce function to iterate through objects
  packfile: extract function to iterate through objects of a store
  object-file: introduce function to iterate through objects
  object-file: extract function to read object info from path
  odb: fix flags parameter to be unsigned
  odb: rename `FOR_EACH_OBJECT_*` flags

config: use an enum for type

The --type=<X> option for 'git config' has previously been defined using
macros, but using a typed enum is better for tracking the possible
values.

Move the definition up to make sure it is defined before a macro uses
some of its terms.

Update the initializer for config_display_options to explicitly set
'type' to TYPE_NONE even though this is implied by a zero value.

This assists in knowing that the switch statement added in the previous
change has a complete set of cases for a properly-valued enum.
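
A small sketch of the enum-plus-switch idea. Only `TYPE_NONE` is named
in the message above; the other enumerators, `struct display_opts`, and
`config_type_name()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum config_type {
	TYPE_NONE = 0, /* named in the message; the rest are assumed */
	TYPE_BOOL,
	TYPE_INT,
	TYPE_PATH,
};

/* Hypothetical options struct with a typed field instead of an int. */
struct display_opts {
	enum config_type type;
};

const char *config_type_name(enum config_type type)
{
	/* No default case: every enumerator is listed explicitly. */
	switch (type) {
	case TYPE_NONE: return "none";
	case TYPE_BOOL: return "bool";
	case TYPE_INT:  return "int";
	case TYPE_PATH: return "path";
	}
	assert(!"unhandled config type");
	return NULL;
}
```

Because the switch has no default case, compilers such as GCC and Clang
can warn (via -Wswitch) when an enumerator is added but not handled,
which is hard to get with plain macros.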

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: restructure format_config()

The recent changes have replaced the bodies of most if/else-if cases
with simple helper method calls. This makes it easy to adapt the
structure into a clearer switch statement, leaving a simple if/else in
the default case.

Make things a little simpler to read by reducing the nesting depth via a
new goto statement when we want to skip values.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format colors quietly

Move the logic for formatting color config value into a helper method
and use quiet parsing when needed.

This removes error messages when parsing a list of config values that do
not match color formats.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: add color_parse_quietly()

When parsing colors, a failed parse emits an error message because the
parser returns the result of error(). To allow for quiet parsing, create
color_parse_quietly(). This is named in contrast to a ..._gently()
version because the original does not die(), so both options are
technically 'gentle'.

To accomplish this, convert the implementation of color_parse_mem() into
a static color_parse_mem_1() helper that adds a 'quiet' parameter. The
color_parse_quietly() method can then use this. Since it is a near
equivalent to color_parse(), move that method down in the file so they
can be nearby while also appearing after color_parse_mem_1().
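
The pattern can be sketched with a toy parser. This is not the code in
color.c: the toy_* functions, their signatures and the three accepted
color names are assumptions made for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the shared helper; 'quiet' suppresses the message. */
static int toy_color_parse_1(const char *value, char *dst, size_t len,
			     int quiet)
{
	static const char *names[] = { "red", "green", "blue" };
	static const char *codes[] = { "\033[31m", "\033[32m", "\033[34m" };
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!strcmp(value, names[i])) {
			snprintf(dst, len, "%s", codes[i]);
			return 0;
		}
	}
	if (!quiet)
		fprintf(stderr, "error: invalid color value: %s\n", value);
	return -1;
}

/* Reports a bad value on stderr, then returns -1. */
int toy_color_parse(const char *value, char *dst, size_t len)
{
	return toy_color_parse_1(value, dst, len, 0);
}

/* Same result, but never prints anything. */
int toy_color_parse_quietly(const char *value, char *dst, size_t len)
{
	return toy_color_parse_1(value, dst, len, 1);
}
```

Neither wrapper terminates the process, which is why the distinction is
"quiet" versus noisy rather than "gentle" versus fatal.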

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format expiry dates quietly

Move the logic for formatting expiry date config values into a helper
method and use quiet parsing when needed.

Note that git_config_expiry_date() will show an error on a bad parse and
not die() like most other git_config...() parsers. Thus, we use
'quietly' here instead of 'gently'.

There is an unfortunate asymmetry in these two parsing methods, but we
need to treat a positive response from parse_expiry_date() as an error
or we will get incorrect values.

This updates the behavior of 'git config list --type=expiry-date' to be
quiet when attempting parsing on non-date values.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format paths gently

Move the logic for formatting path config values into a helper method
and use gentle parsing when needed.

We need to be careful about how to handle the ':(optional)' macro, which
as tested in t1311-config-optional.sh must allow for ignoring a missing
path when multiple other values exist, but cause 'git config get' to
fail if it is the only possible value and thus no result is output.

In the case of our list, we need to omit those values silently. This
necessitates the use of the 'gently' parameter here.
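
A rough sketch of the listing side of this rule follows. It assumes the
':(optional)' marker is a literal prefix on the value and that "missing"
means the path does not exist on disk; sketch_resolve_path() is a
hypothetical name, and the sketch ignores '~' expansion entirely.

```c
#include <string.h>
#include <unistd.h>

/* Returns the path to print, or NULL when the value should be omitted. */
static const char *sketch_resolve_path(const char *value)
{
	static const char prefix[] = ":(optional)";
	size_t plen = sizeof(prefix) - 1;

	if (strncmp(value, prefix, plen))
		return value;	/* ordinary path: always reported */
	value += plen;
	if (access(value, F_OK) < 0)
		return NULL;	/* optional and missing: omit silently */
	return value;
}
```

A caller listing many values would drop the NULL results without any
message, while a caller asked for exactly one value could turn "nothing
left to print" into a failure.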

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format bools or strings in helper

Move the logic for formatting bool-or-string config values into a
helper. This parsing has always been gentle, so this is not unlocking
new behavior. This extraction is only to match the formatting of the
other cases that do need a behavior change.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format bools or ints gently

Move the logic for formatting bool-or-int config values into a helper
method and use gentle parsing when needed.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format bools gently

Move the logic for formatting bool config values into a helper method
and use gentle parsing when needed.

This makes 'git config list --type=bool' not fail when coming across a
non-boolean value. Such unparseable values are filtered out quietly.
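
The filtering behavior can be illustrated with a small sketch. The
struct and function names are hypothetical, and the parser here accepts
only a handful of lowercase spellings, which is narrower than what git
itself treats as a boolean.

```c
#include <stddef.h>
#include <string.h>

struct kv {
	const char *key;
	const char *value;
};

/* Returns 1 or 0 for a recognized boolean, -1 otherwise; never exits. */
static int sketch_parse_bool_gently(const char *value)
{
	if (!strcmp(value, "true") || !strcmp(value, "yes") ||
	    !strcmp(value, "on"))
		return 1;
	if (!strcmp(value, "false") || !strcmp(value, "no") ||
	    !strcmp(value, "off"))
		return 0;
	return -1;
}

/* Copies parseable entries, canonicalized, to 'out'; returns count kept. */
static size_t sketch_list_bools(const struct kv *in, size_t n,
				struct kv *out)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++) {
		int b = sketch_parse_bool_gently(in[i].value);

		if (b < 0)
			continue;	/* not a boolean: drop silently */
		out[kept].key = in[i].key;
		out[kept].value = b ? "true" : "false";
		kept++;
	}
	return kept;
}
```

Given a mix of boolean and non-boolean entries, only the boolean ones
survive, each rewritten as "true" or "false".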

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: format int64s gently

Move the logic for formatting int64 config values into a helper method
and use gentle parsing when needed.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: make 'git config list --type=<X>' work

Previously, the --type=<X> argument to 'git config list' was ignored and
did nothing. Now, we add the use of format_config() to the
show_all_config() function so that we attempt to parse each key-value
pair. This is our first use of the 'gently' parameter with a nonzero
value.

When listing multiple values, our initial settings for the output format
are different. Add a new init helper to specify that keys should be
shown and to set the default delimiters, as they were unset in some
cases.

Our intention is that if there is an error in parsing, then the row is
not output. This is necessary to avoid the caller needing to build their
own validator to understand the difference between valid, canonicalized
types and other raw string values. The raw values will always be
available to the user if they do not specify the --type=<X> option.

The current behavior is more complicated, including error messages on
bad parsing or potentially complete failure of the command. We add
tests at this point that demonstrate the current behavior so we can
witness the fix in future changes that parse these values quietly and
gently.

This is a change in behavior! We are starting to respect an option that
was previously ignored, leading to potential user confusion. This is
probably still the right choice, since the --type argument did not
change behavior at all previously, so users can get the behavior they
expect by removing the --type argument or adding the --no-type argument.

t1300-config.sh is updated with the current behavior of this formatting
logic to justify the upcoming refactoring of format_config() that will
incrementally fix some of these cases to be more user-friendly.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: add 'gently' parameter to format_config()

This parameter is set to 0 for all current callers and is UNUSED.
However, future changes will start using it, including a critical
change that requires gentle parsing (not using die()) in order to
attempt parsing every value in a list.
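
For readers unfamiliar with the convention, a 'gently' flag typically
selects between returning an error and terminating the process. The
sketch below uses a hypothetical sketch_parse_int() to show the idea; it
is not taken from format_config().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 0 on success. On a bad parse: -1 if 'gently', else exits. */
static int sketch_parse_int(const char *value, long *out, int gently)
{
	char *end;

	errno = 0;
	*out = strtol(value, &end, 10);
	if (!errno && end != value && !*end)
		return 0;
	if (gently)
		return -1;	/* let the caller decide what to do */
	fprintf(stderr, "fatal: bad numeric config value '%s'\n", value);
	exit(128);
}
```

A caller iterating over a list passes a nonzero 'gently' so that one bad
value does not abort the whole listing.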

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: move show_all_config()

In anticipation of using format_config() in this method, move
show_all_config() lower in the file without changes.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_fullref_in()`

Replace calls to `refs_for_each_fullref_in()` with the newly introduced
`refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_namespaced_ref()`

Replace calls to `refs_for_each_namespaced_ref()` with the newly
introduced `refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_glob_ref()`

Replace calls to `refs_for_each_glob_ref()` with the newly introduced
`refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_glob_ref_in()`

Replace calls to `refs_for_each_glob_ref_in()` with the newly introduced
`refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_rawref_in()`

Replace calls to `refs_for_each_rawref_in()` with the newly introduced
`refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: replace `refs_for_each_rawref()`

Replace calls to `refs_for_each_rawref()` with the newly introduced
`refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>