git.ipfire.org Git - thirdparty/git.git/log

The 10th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'hy/diff-lazy-fetch-with-break-fix'

A prefetch call can be triggered to access a stale diff_queue entry
after diffcore-break breaks a filepair into two and freed the
original entry that is no longer used, leading to a segfault, which
has been corrected.

* hy/diff-lazy-fetch-with-break-fix:
diffcore-break: avoid segfault with freed entries

Merge branch 'aa/add-p-no-auto-advance'

"git add -p" learned a new mode that allows the user to revisit a
file that was already dealt with.

* aa/add-p-no-auto-advance:
  add-patch: allow interfile navigation when selecting hunks
  add-patch: allow all-or-none application of patches
  add-patch: modify patch_update_file() signature
  interactive -p: add new `--auto-advance` flag

Merge branch 'lg/t2004-test-path-is-helpers'

Test code clean-up.

* lg/t2004-test-path-is-helpers:
t2004: use test_path_is_file instead of test -f

Merge branch 'ps/simplify-normalize-path-copy-len'

Code clean-up.

* ps/simplify-normalize-path-copy-len:
path: factor out skip_slashes() in normalize_path_copy_len()

Merge branch 'sc/pack-redundant-leakfix'

Leakfix.

* sc/pack-redundant-leakfix:
pack-redundant: fix memory leak when open_pack_index() fails

Merge branch 'cs/subtree-split-fixes'

An earlier attempt to optimize "git subtree" discarded too much
relevant histories, which has been corrected.

* cs/subtree-split-fixes:
  contrib/subtree: process out-of-prefix subtrees
  contrib/subtree: test history depth
  contrib/subtree: capture additional test-cases

for-each-repo: simplify passing of parameters

This change simplifies the code somewhat from its original
implementation.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for-each-repo: work correctly in a worktree

When run in a worktree, the GIT_DIR directory is set in a different way
than in a typical repository. Show this by updating t0068 to include a
worktree and add a test that runs from that worktree. This requires
moving the repo.key config into a global config instead of the base test
repository's local config (demonstrating that it worked with
non-worktree Git repositories).

We need to be careful to unset the local Git environment variables and
let the child process rediscover them, while also reinstating those
variables in the parent process afterwards. Update run_command_on_repo()
to use the new sanitize_repo_env() helper method to erase these
environment variables.

During review of this bug fix, there were several incorrect patches
demonstrating different bad behaviors. Most of these are covered by
tests, when it is not too expensive to set it up. One case that would be
expensive to set up is the GIT_NO_REPLACE_OBJECTS environment variable,
but we trust that using sanitize_repo_env() will be sufficient to
capture these uncovered cases by using the common code for resetting
environment variables.

Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

run-command: extract sanitize_repo_env helper

The current prepare_other_repo_env() does two distinct things:

1. Strip certain known environment variables that should be set by a
child process based on a different repository.

2. Set the GIT_DIR variable to avoid repository discovery.

The second item is valuable for child processes that operate on
submodules, where the repo discovery could be mistaken for the parent
repository.

In the next change, we will see an important case where only the first
item is required as the GIT_DIR discovery should happen naturally from
the '-C' parameter in the child process.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for-each-repo: test outside of repo context

The 'git for-each-repo' tool is frequently run outside of a repo context
in the real world. For example, it powers background maintenance.
Despite this typical case, we have not been testing it without a local
repository.

Update t0068 to stop creating a test repo and to use global config
everywhere. This has some subtle changes to test across the file.

This was noticed because an earlier attempt to remove the_repository
from builtin/for-each-repo.c did not catch a segmentation fault since
the passed 'repo' is NULL. This use of the_repository will need to stay
until we have a better way to handle config queries outside of a repo
context. Similar use still exists in builtin/config.c for the same
reason.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix list continuation in alias.adoc

Add missing list continuation marks ('+') after code blocks and shell examples
so paragraphs render correctly as part of the preceding list item.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

xdiff: re-diff shifted change groups when using histogram algorithm

After a diff algorithm has been run, the compaction phase
(xdl_change_compact()) shifts and merges change groups to produce a
cleaner output. However, this shifting could create a new matched group
where both sides now have matching lines. This results in a
wrong-looking diff output which contains redundant lines that are the
same on both files.

Fix this by detecting this situation, and re-diff the texts on each side
to find similar lines, using the fall-back Myer's diff. Only do this for
histogram diff as it's the only algorithm where this is relevant. Below
contains an example, and more details.

For an example, consider two files below:

    file1:
        A

        A
        A
        A

        A
        A
        A

    file2:
        A

        A
        x
        A

        A
        A
        A

When using Myer's diff, the algorithm finds that only the "x" has been
changed, and produces a final diff result (these are line diffs, but
using word-diff syntax for ease of presentation):

        A A[-A-]{+x+}A AAA

When using histogram diff, the algorithm first discovers the LCS "A
AAA", which it uses as anchor, then produces an intermediate diff:

        {+A Ax+}A AAA[- AAA-].

This is a longer diff than Myer's, but it's still self-consistent.
However, the compaction phase attempts to shift the first file's diff
group upwards (note that this shift crosses the anchor that histogram
had used), leading to the final results for histogram diff:

        [-A AA-]{+A Ax+}A AAA

This is a technically correct patch but looks clearly redundant to a
human as the first 3 lines should not be in the diff.

The fix would detect that a shift has caused matching to a new group,
and re-diff the "A AA" and "A Ax" parts, which results in "A A"
correctly re-marked as unchanged. This creates the now correct histogram
diff:

        A A[-A-]{+x+}A AAA

This issue is not applicable to Myer's diff algorithm as it already
generates a minimal diff, which means a shift cannot result in a smaller
diff output (the default Myer's diff in xdiff is not guaranteed to be
minimal for performance reasons, but it typically does a good enough
job).

It's also not applicable to patience diff, because it uses only unique
lines as anchor for its splits, and falls back to Myer's diff within
each split. Shifting requires both ends having the same lines, and
therefore cannot cross the unique line boundaries established by the
patience algorithm. In contrast histogram diff uses non-unique lines as
anchors, and therefore shifting can cross over them.

This issue is rare in a normal repository. Below is a table of
repositories (`git log --no-merges -p --histogram -1000`), showing how
many times a re-diff was done and how many times it resulted in finding
matching lines (therefore addressing this issue) with the fix. In
general it is fewer than 1% of diff's that exhibit this offending
behavior:

| Repo (1k commits)  | Re-diff | Found matching lines |
|--------------------|---------|----------------------|
| llvm-project       |  45     | 11                   |
| vim                | 110     |  9                   |
| git                |  18     |  2                   |
| WebKit             | 168     |  1                   |
| ripgrep            |  22     |  1                   |
| cpython            |  32     |  0                   |
| vscode             |  13     |  0                   |

Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: gitprotocol-pack: normalize italic formatting

Uniform italic style usage for command and process names.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: gitprotocol-pack: improve paragraphs structure

Logically separate the introductory sentence from the first transport
description to improve readability and structural clarity.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: gitprotocol-pack: fix pronoun-antecedent agreement

Fix "pronoun-antecedent agreement" errors.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The 9th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jt/object-file-use-container-of'

Code clean-up.

* jt/object-file-use-container-of:
object-file.c: avoid container_of() of a NULL container
object-file: use `container_of()` to convert from base types

Merge branch 'ps/receive-pack-shallow-optim'

The code to accept shallow "git push" has been optimized.

* ps/receive-pack-shallow-optim:
  commit: use commit graph in `lookup_commit_reference_gently()`
  commit: make `repo_parse_commit_no_graph()` more robust
  commit: avoid parsing non-commits in `lookup_commit_reference_gently()`

Merge branch 'kh/doc-patch-id-4'

Doc update.

* kh/doc-patch-id-4:
  doc: patch-id: see also git-cherry(1)
  doc: patch-id: add script example
  doc: patch-id: emphasize multi-patch processing

Merge branch 'ps/meson-gitk-git-gui'

Plumb gitk/git-gui build and install procedure in meson based
builds.

* ps/meson-gitk-git-gui:
meson: wire up gitk and git-gui

Merge branch 'pw/meson-doc-mergetool'

Update build precedure for mergetool documentation in meson-based builds.

* pw/meson-doc-mergetool:
meson: fix building mergetool docs

Merge branch 'kh/doc-am-xref'

Doc update.

* kh/doc-am-xref:
  doc: am: fill out hook discussion
  doc: am: add missing config am.messageId
  doc: am: say that --message-id adds a trailer
  doc: am: normalize git(1) command links

Merge branch 'ps/object-info-bits-cleanup'

A couple of bugs in use of flag bits around odb API has been
corrected, and the flag bits reordered.

* ps/object-info-bits-cleanup:
  odb: convert `odb_has_object()` flags into an enum
  odb: convert object info flags into an enum
  odb: drop gaps in object info flag values
  builtin/fsck: fix flags passed to `odb_has_object()`
  builtin/backfill: fix flags passed to `odb_has_object()`

Merge branch 'ag/http-netrc-tests'

Additional tests were introduced to see the interaction with netrc
auth with auth failure on the http transport.

* ag/http-netrc-tests:
t5550: add netrc tests for http 401/403

Merge branch 'ty/symlinks-use-unsigned-for-bitset'

Code clean-up.

* ty/symlinks-use-unsigned-for-bitset:
symlinks: use unsigned int for flags

Merge branch 'ps/validate-prefix-in-subtree-split'

"git subtree split --prefix=P <commit>" now checks the prefix P
against the tree of the (potentially quite different from the
current working tree) given commit.

* ps/validate-prefix-in-subtree-split:
subtree: validate --prefix against commit in split

Merge branch 'uk/signature-is-good-after-key-expires'

A signature on a commit that was GPG signed long time ago ought to
be still valid after the key that was used to sign it has expired,
but we showed them in alarming red.

* uk/signature-is-good-after-key-expires:
gpg-interface: signatures by expired keys are fine

Merge branch 'ps/odb-for-each-object'

Revamp object enumeration API around odb.

* ps/odb-for-each-object:
  odb: drop unused `for_each_{loose,packed}_object()` functions
  reachable: convert to use `odb_for_each_object()`
  builtin/pack-objects: use `packfile_store_for_each_object()`
  odb: introduce mtime fields for object info requests
  treewide: drop uses of `for_each_{loose,packed}_object()`
  treewide: enumerate promisor objects via `odb_for_each_object()`
  builtin/fsck: refactor to use `odb_for_each_object()`
  odb: introduce `odb_for_each_object()`
  packfile: introduce function to iterate through objects
  packfile: extract function to iterate through objects of a store
  object-file: introduce function to iterate through objects
  object-file: extract function to read object info from path
  odb: fix flags parameter to be unsigned
  odb: rename `FOR_EACH_OBJECT_*` flags

Merge branch 'ar/run-command-hook-take-2' into ar/config-hooks

* ar/run-command-hook-take-2:
builtin/receive-pack: avoid spinning no-op sideband async threads

builtin/receive-pack: avoid spinning no-op sideband async threads

Exit early if the hooks do not exist, to avoid spinning up/down
sideband async threads which no-op.

It is important to call the hook_exists() API provided by hook.[ch]
because it covers both config-defined hooks and the "traditional"
hooks from the hookdir. find_hook() only covers the hookdir hooks.

The regression happened because the no-op async threads add some
additional overhead which can be measured with the receive-refs test
of the benchmarks suite [1].

Reproduced using:
cd benchmarks/receive-refs && \
./run --revisions /path/to/git \
fc148b146ad41be71a7852c4867f0773cbfe1ff9~,fc148b146ad41be71a7852c4867f0773cbfe1ff9 \
--parameter-list refformat reftable --parameter-list refcount 10000

1: https://gitlab.com/gitlab-org/data-access/git/benchmarks

Fixes: fc148b146ad4 ("receive-pack: convert update hooks to new API")
Reported-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
[jc: avoid duplicated hardcoded hook names]
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: find tree with most entries

The size of a tree object usually corresponds with the number of entries
it has. While iterating through objects in the repository for
git-repo-structure, identify the tree with the most entries and display
it in the output.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: find commit with most parents

Complex merge events may produce an octopus merge where the resulting
merge commit has more than two parents. While iterating through objects
in the repository for git-repo-structure, identify the commit with the
most parents and display it in the output.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: add OID annotations to table output

The "structure" output for git-repo(1) does not show the corresponding
OIDs for the largest objects in its "table" output. Update the output to
include a list of OID annotations with an index to the corresponding row
in the table.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: collect largest inflated objects

The "structure" output for git-repo(1) shows the total inflated and disk
sizes of reachable objects in the repository, but doesn't show the size
of the largest individual objects. Since an individual object may be a
large contributor to the overall repository size, it is useful for users
to know the maximum size of individual objects.

While interating across objects, record the size and OID of the largest
objects encountered for each object type to provide as output. Note that
the default "table" output format only displays size information and not
the corresponding OID. In a subsequent commit, the table format is
updated to add table annotations that mention the OID.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: add helper for printing keyvalue output

The machine-parsable formats for the git-repo(1) "structure" subcommand
print output in keyvalue pairs. Introduce the helper function
`print_keyvalue()` to remove some code duplication and improve
readability.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/repo: update stats for each object

When walking reachable objects in the repository, `count_objects()`
processes a set of objects and updates the `struct object_stats`. In
preparation for more granular statistics being collected, update the
`struct object_stats` for each individual object instead.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: fix "that that" typo in lib-unicode-nfc-nfd.sh

In the comments of lib-unicode-nfc-nfd.sh, "that that" was used
unintentionally. Remove the redundant "that" to improve clarity.

Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3310: replace test -f/-d with test_path_is_file/test_path_is_dir

Replace old-style path assertions with modern helpers that
provide clearer diagnostic messages on failure. When test -f
fails, the output gives no indication of what went wrong.

These instances were found using:

git grep "test -[efd]" t/

as suggested in the microproject ideas.

Signed-off-by: Francesco Paparatto <francescopaparatto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: unset GITLAB_FEATURES envvar to not bust xargs(1) limits

We have started to see the following assert happen in our GitLab CI
pipelines for jobs that use Windows with Meson:

  assertion "bc_ctl.arg_max >= LINE_MAX" failed: file "xargs.c", line 512, function: main

The assert in question verifies that we have enough room available to
pass at least `LINE_MAX` many bytes via the command line. The xargs(1)
binary in those jobs comes from Git for Windows, which in turn sources
the binaries from MSYS2, and has the following limits in place:

  $ & "C:/Program Files/Git/usr/bin/bash.exe" -l -c 'xargs --show-limits </dev/null'
  Your environment variables take up 17373 bytes
  POSIX upper limit on argument length (this system): 12579
  POSIX smallest allowable upper limit on argument length (all systems): 4096
  Maximum length of command we could actually use: 18446744073709546822
  Size of command buffer we are actually using: 12579
  Maximum parallelism (--max-procs must be no greater): 2147483647

What's interesting to see is the limit of 16 exabits for the maximum
command line length. This value might seem a bit high, and it is indeed
the result of an underflow: our environment is larger than the POSIX
upper limit on argument length, and the value is computed by subtracting
the former from the latter. So what we get is the result of `2^64 -
(17373 - 12579)`.

This makes it clear that the problem here is the size of our environment
variables. A listing sorted by length yields the following result:

  $ Get-ChildItem "Env:" |
      Sort-Object { $_.Value.Length } -Descending |
      Select-Object Name, @{Name="Length"; Expression={$_.Value.Length}}
  Name                                          Length
  ----                                          ------
  GITLAB_FEATURES                                 6386
  Path                                             706
  PSModulePath                                     229

The GITLAB_FEATURES environment variable makes up for roughly a third of
the complete environment. This variable is a comma-separated list of
features available for the GitLab instance, and seemingly it has been
growing over time as GitLab added more and more features.

Fix the issue by unsetting the environment variable in "ci/lib.sh". This
ensures that the environment variables are now smaller than the upper
limit on argument length again, and that in turn fixes the assert in
xargs(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: diff-options.adoc: make *.noprefix split translatable

We cannot split single words like what we did in the previous
commit. That is because the doc translations are processed in
bigger chunks.

Instead write the two paragraphs with the only variations being this
configuration variable.

Reported-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

send-email: add client certificate options

For SMTP servers that do "mutual certificate verification", the mail
client is required to present its own TLS certificate as well. This
patch adds --smtp-ssl-client-cert and --smtp-ssl-client-key for such
servers.

The problem of which private key for the certificate is chosen arises
when there are private keys in both the certificate and private key
file. According to the documentation of IO::Socket::SSL(link supplied),
the behaviour(the private key chosen) depends on the format of the
certificate. In a nutshell,

- PKCS12: the key in the cert always takes the precedence
- PEM: if the key file is not given, it will "try" to read one
from the cert PEM file

Many users may find this discrepancy unintuitive.

In terms of client certificate, git-send-email is implemented in a way
that what's possible with perl's SSL library is exposed to the user as
much as possible. In this instance, the user may choose to use a PEM
file that contains both certificate and private key should be
at their discretion despite the implications.

Link: https://metacpan.org/pod/IO::Socket::SSL#SSL_cert_file-%7C-SSL_cert-%7C-SSL_key_file-%7C-SSL_key
Link: https://lore.kernel.org/all/319bf98c-52df-4bf9-b157-e4bc2bf087d6@dev.snart.me/
Signed-off-by: David Timber <dxdt@dev.snart.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: fix crash with --find-object outside repository

When "git diff --find-object=<oid>" is run outside a git repository,
the option parsing callback eagerly resolves the OID via
repo_get_oid(), which reaches get_main_ref_store() and hits a BUG()
assertion because no repository has been set up.

Check startup_info->have_repository before attempting to resolve the
OID, and return a user-friendly error instead.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsmonitor-watchman: fix variable reference and remove redundant code

The is_work_tree_watched() function in fsmonitor-watchman.sample has
two bugs:

1. Wrong variable in error check: After calling watchman_clock(), the
   result is stored in $o, but the code checks $output->{error} instead
   of $o->{error}. This means errors from the clock command are silently
   ignored.

2. Double output violates protocol: When the retry path triggers (the
   directory wasn't initially watched), output_result() is called with
   the "/" flag, then launch_watchman() is called recursively which
   calls output_result() again. This outputs two clock tokens to stdout,
   but git's fsmonitor v2 protocol expects exactly one response.

Fix #1 by checking $o->{error} after watchman_clock().

Fix #2 by removing the recursive launch_watchman() call. The "/"
"everything is dirty" flag already tells git to do a full scan, and
git will call the hook again on the next invocation with a valid clock
token.

With the recursive call removed, the $retry guard is no longer needed
since it only existed to prevent infinite recursion. Remove it.

Apply the same fixes to the test helper scripts in t/t7519/.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

send-email: validate charset name in 8bit encoding prompt

When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,

        Which 8bit encoding should I declare [UTF-8]? foo

After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,

        Are you sure you want to use <foo> [y/N]? y

This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.

Add a semantic validation of the charset name using the
Encode::find_encoding() which is a bundled module of perl. If the encoding
is not recognized, warn the user and ask for confirmation before proceeding.
After this validation the lenght based validation becomes redundant and also
breaks flow, so change the regex of valid input to any non blank string.

Make the encoding warning logic specific to the 8bit prompt, also add a
unique confirmation prompt which  reduces the load on ask(), and improves
maintainability.

Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.

The intended flow is,

        Declare which 8bit encoding to use [default: UTF-8]? foobar
        'foobar' does not appear to be a valid charset name. Use it anyway [y/N]?

[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b

Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parseopt: check for duplicate long names and numerical options

We already check for duplicate short names. Check for and report
duplicate long names and numerical options as well.

Perform the slightly expensive string duplicate check only when showing
the usage to keep the cost of normal invocations low. t0012-help.sh
covers it.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitk: support link color in the Preferences dialog

As a dark-theme user, I use the Preferences dialog to set colors
for gitk. The only color I cannot change via that dialog is the
link foreground color, which leads to using the default link color
on a dark background that makes it hard to read.

Make the link foreground color also configurable in the Gitk
Preferences dialog's Color tab, so users won't need to dig into
the code/manual to check if it is configurable and can simply set
the color there.

Signed-off-by: Wang Zichong <wangzichong@deepin.org>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>

The 8th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ap/use-test-seq-f-more'

Test clean-up.

* ap/use-test-seq-f-more:
t: use test_seq -f and pipes in a few more places

Merge branch 'db/doc-fetch-jobs-auto'

Doc update.

* db/doc-fetch-jobs-auto:
doc: fetch: document `--jobs=0` behavior

Merge branch 'mf/format-patch-honor-from-for-cover-letter'

"git format-patch --from=<me>" did not honor the command line
option when writing out the cover letter, which has been corrected.

* mf/format-patch-honor-from-for-cover-letter:
format-patch: fix From header in cover letter

Merge branch 'jh/alias-i18n'

Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing

Merge branch 'ps/tests-wo-iconv-fixes'

Some tests assumed "iconv" is available without honoring ICONV
prerequisite, which has been corrected.

* ps/tests-wo-iconv-fixes:
  t6006: don't use iconv(1) without ICONV prereq
  t5550: add ICONV prereq to tests that use "$HTTPD_URL/error"
  t4205: improve handling of ICONV prerequisite
  t40xx: don't use iconv(1) without ICONV prereq
  t: don't set ICONV prereq when iconv(1) is missing

Merge branch 'ps/ci-gitlab-msvc-updates'

CI update.

* ps/ci-gitlab-msvc-updates:
  gitlab-ci: handle failed tests on MSVC+Meson job
  gitlab-ci: use "run-test-slice-meson.sh"
  ci: make test slicing consistent across Meson/Make
  github: fix Meson tests not executing at all
  meson: fix MERGE_TOOL_DIR with "--no-bin-wrappers"
  ci: don't skip smallest test slice in GitLab
  ci: handle failures of test-slice helper

Merge branch 'jc/whitespace-incomplete-line'

It does not make much sense to apply the "incomplete-line"
whitespace rule to symbolic links, whose contents almost always
lack the final newline. "git apply" and "git diff" are now taught
to exclude them for a change to symbolic links.

* jc/whitespace-incomplete-line:
whitespace: symbolic links usually lack LF at the end

Merge branch 'jc/checkout-switch-restore'

"git switch <name>", in an attempt to create a local branch <name>
after a remote tracking branch of the same name gave an advise
message to disambiguate using "git checkout", which has been
updated to use "git switch".

* jc/checkout-switch-restore:
checkout: tell "parse_remote_branch" which command is calling it
checkout: pass program-readable token to unified "main"

Merge branch 'jk/ref-filter-lrstrip-optim'

Code clean-up.

* jk/ref-filter-lrstrip-optim:
  ref-filter: clarify lstrip/rstrip component counting
  ref-filter: avoid strrchr() in rstrip_ref_components()
  ref-filter: simplify rstrip_ref_components() memory handling
  ref-filter: simplify lstrip_ref_components() memory handling
  ref-filter: factor out refname component counting

Merge branch 'ps/history-ergonomics-updates'

UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input

Merge branch 'ps/for-each-ref-in-fixes'

A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips

Merge branch 'lo/repo-info-keys'

"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
repo: add new flag --keys to git-repo-info
repo: rename the output format "keyvalue" to "lines"

t4052: test for diffstat width when prefix contains ANSI escape codes

Add test checking the calculation of the diffstat display width when the
`line_prefix`, which is text that goes before the diffstat, contains
ANSI escape codes.

This situation happens, for example, when `git log --stat --graph` is
executed:
* `--stat` will create a diffstat for each commit
* `--graph` will stuff `line_prefix` with the graph portion of the log,
which contains ANSI escape codes to color the text

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: handle ANSI escape codes in prefix when calculating diffstat width

The diffstat width is calculated by taking the terminal width and
incorrectly subtracting the `strlen()` of `line_prefix`, instead of the
actual display width of `line_prefix`, which may contain ANSI escape
codes (e.g., ANSI-colored strings in `log --graph --stat`).

Utilize the display width instead, obtained via `utf8_strnwidth()` with
the flag `skip_ansi`.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: remove duplicate --stdin-packs definition

cd846bacc7 (pack-objects: introduce '--stdin-packs=follow', 2025-06-23)
added a new definition of the option --stdin-packs that accepts an
argument. It kept the old definition, which still shows up in the short
help, but is shadowed by the new one. Remove it.

Hinted-at-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: remove unnecessary variable shadow

Avoid redeclaring `entry` inside the conditional block, removing
unnecessary variable shadowing and improving code clarity without
changing behavior.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Acked-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git, help: fix memory leaks in alias listing

The list_aliases() function sets the util pointer of each list item to
a heap-allocated copy of the alias command value.  Two callers failed
to free these util pointers:

- list_cmds() in git.c collects a string list with STRING_LIST_INIT_DUP
   and clears it with string_list_clear(&list, 0), which frees the
   duplicated strings (strdup_strings=1) but not the util pointers.
   Pass free_util=1 to free them.

- list_cmds_by_config() in help.c calls string_list_sort_u(list, 0) to
   deduplicate the list before processing completion.commands overrides.
   When duplicate entries are removed, the util pointer of each discarded
   item is leaked because free_util=0.  Pass free_util=1 to free them.

Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

alias: treat empty subsection [alias ""] as plain [alias]

When git-config stores a key of the form alias..name, it records
it under an empty subsection ([alias ""]). The new subsection-aware
alias lookup would see a non-NULL but zero-length subsection and
fall into the subsection code path, where it required a "command"
key and thus silently ignored the entry.

Normalize an empty subsection to NULL before any further processing
so that entries stored this way continue to work as plain
case-insensitive aliases, matching the pre-subsection behaviour.

Users who relied on alias..name to create an alias literally named
".name" may want to migrate to subsection syntax, which looks less confusing:

[alias ".name"]
command = <value>

Add tests covering both the empty-subsection compatibility case and
the leading-dot alias via the new syntax.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix list continuation in alias subsection example

The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation marks (+
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: add status.compareBranches config for multiple branch comparisons

Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.

Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination

Any other value is ignored and a warning is shown.

When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.

The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons

This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.

Example configuration:
[status]
compareBranches = @{upstream} @{push}

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refactor format_branch_comparison in preparation

Refactor format_branch_comparison function in preparation for showing
comparison with push remote tracking branch.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment: move "branch.autoSetupMerge" into `struct repo_config_values`

The config value `branch.autoSetupMerge` is parsed in
`git_default_branch_config()` and stored in the global variable
`git_branch_track`. This global variable can be overwritten
by another repository when multiple Git repos run in the the same process.

Move this value into `struct repo_config_values` in the_repository to
retain current behaviours and move towards libifying Git.
Since the variable is no longer a global variable, it has been renamed to
`branch_track` in the struct `repo_config_values`.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment: stop using core.sparseCheckout globally

The config value `core.sparseCheckout` is parsed in
`git_default_core_config()` and stored globally in
`core_apply_sparse_checkout`. This could cause it to be overwritten
by another repository when different Git repositories run in the same
process.

Move the parsed value into `struct repo_config_values` in the_repository
to retain current behaviours and move towards libifying Git.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The 7th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ac/string-list-sort-u-and-tests'

Code clean-up using a new helper function introduced lately.

* ac/string-list-sort-u-and-tests:
sparse-checkout: use string_list_sort_u

Merge branch 'mc/tr2-process-ancestry-cleanup'

Add process ancestry data to trace2 on macOS to match what we
already do on Linux and Windows.  Also adjust the way Windows
implementation reports this information to match the other two.

* mc/tr2-process-ancestry-cleanup:
  t0213: add trace2 cmd_ancestry tests
  test-tool: extend trace2 helper with 400ancestry
  trace2: emit cmd_ancestry data for Windows
  trace2: refactor Windows process ancestry trace2 event
  build: include procinfo.c impl for macOS
  trace2: add macOS process ancestry tracing

Merge branch 'ps/pack-concat-wo-backfill'

"git pack-objects --stdin-packs" with "--exclude-promisor-objects"
fetched objects that are promised, which was not wanted. This has
been fixed.

* ps/pack-concat-wo-backfill:
builtin/pack-objects: don't fetch objects when merging packs

Merge branch 'dk/complete-stash-import-export'

Command line completion (in contrib/) update.

* dk/complete-stash-import-export:
completion: add stash import, export

Merge branch 'jc/doc-cg-needswork'

A CodingGuidelines update.

* jc/doc-cg-needswork:
CodingGuidelines: document NEEDSWORK comments

Merge branch 'ds/revision-maximal-only'

"git rev-list" and friends learn "--maximal-only" to show only the
commits that are not reachable by other commits.

* ds/revision-maximal-only:
revision: add --maximal-only option

Merge branch 'cc/lop-filter-auto'

"auto filter" logic for large-object promisor remote.

* cc/lop-filter-auto:
  fetch-pack: wire up and enable auto filter logic
  promisor-remote: change promisor_remote_reply()'s signature
  promisor-remote: keep advertised filters in memory
  list-objects-filter-options: support 'auto' mode for --filter
  doc: fetch: document `--filter=<filter-spec>` option
  fetch: make filter_options local to cmd_fetch()
  clone: make filter_options local to cmd_clone()
  promisor-remote: allow a client to store fields
  promisor-remote: refactor initialising field lists

Merge branch 'pw/commit-msg-sample-hook'

Update sample commit-msg hook to complain when a log message has
material mailinfo considers the end of log message in the middle.

* pw/commit-msg-sample-hook:
templates: detect commit messages containing diffs
templates: add .gitattributes entry for sample hooks

Merge branch 'kh/doc-am-format-sendmail'

Doc update.

* kh/doc-am-format-sendmail:
doc: add caveat about round-tripping format-patch

Documentation/git-repo: capitalize format descriptions

The descriptions for the git-repo output formats are in lowercase.
Capitalize these descriptions, making them consistent with the rest of
the documentation.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-repo: replace 'NUL' with '_NUL_'

Replace all occurrences of "NUL" by "_NUL_" in git-repo.adoc, following the
convention used by other documentation files.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1901: adjust nul format output instead of expected value

The test 'keyvalue and nul format', as it description says, test both
`keyvalue` and `nul` format. These formats are similar, differing only in
their field separator (= in the former, LF in the latter) and their
record separator (LF in the former, NUL in the latter). This way, both
formats can be tested using the same expected output and only replacing
the separators in one of the output formats.

However, it is not desirable to have a NUL character in the files
compared by test_cmp because, if that assetion fails, diff will consider
them binary files and won't display the differences properly.

Adjust the output of `git repo structure --format=nul` in t1901, matching the
--format=keyvalue ones. Compare this output against the same value expected
from --format=keyvalue, without using files with NUL characters in
test_cmp.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1900: rename t1900-repo to t1900-repo-info

Since the commit bbb2b93348 (builtin/repo: introduce structure subcommand,
2025-10-21), t1901 specifically tests git-repo-structure. Rename
t1900-repo to t1900-repo-info to clarify that it focus solely on
git-repo-info subcommand.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: rename struct field to repo_info_field

Change the name of the struct field to repo_info_field, making it
explicit that it is an internal data type of git-repo-info.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: replace get_value_fn_for_key by get_repo_info_field

Remove the function `get_value_fn_for_key`, which returns a function that
retrieves a value for a certain repo info key. Introduce `get_repo_info_field`
instead, which returns a struct field.

This refactor makes the structure of the function print_fields more consistent
to the function print_all_fields, improving its readability.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repo: rename repo_info_fields to repo_info_field

Rename repo_info_fields as repo_info_field, following the CodingGuidelines rule
for naming arrays in singular. Rename all the references to that array
accordingly.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

CodingGuidelines: instruct to name arrays in singular

Arrays should be named in the singular form, ensuring that when
accessing an element within an array (e.g. dog[0]) it's clear that
we're referring to an element instead of a collection.

Add a new rule to CodingGuidelines asking for arrays to be named in
singular instead of plural.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add GIT_REFERENCE_BACKEND to specify reference backend

Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support an URI input for reference backend with location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repositories $GIT_DIR. The inverse,
i.e. removal of the ref store doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of ref store is only used
when migrating between ref formats and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: allow reference location in refstorage config

The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking an URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Is not setup before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: receive and use the reference storage payload

An upcoming commit will add support for providing an URI via the
'extensions.refStorage' config. The URI will contain the reference
backend and a corresponding payload. The payload can be then used for
providing an alternate locations for the reference backend.

To prepare for this, modify the existing backends to accept such an
argument when initializing via the 'init()' function. Both the files
and reftable backends will parse the information to be filesystem paths
to store references. Given that no callers pass any payload yet this is
essentially a no-op change for now.

To enable this, provide a 'refs_compute_filesystem_location()' function
which will parse the current 'gitdir' and the 'payload' to provide the
final reference directory and common reference directory (if working in
a linked worktree).

The documentation and tests will be added alongside the extension of the
config variable.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: move out stub modification to generic layer

When creating the reftable reference backend on disk, we create stubs to
ensure that the directory can be recognized as a Git repository. This is
done by calling `refs_create_refdir_stubs()`. Move this to the generic
layer as this is needed for all backends excluding from the files
backends. In an upcoming commit where we introduce alternate reference
backend locations, we'll have to also create stubs in the $GIT_DIR
irrespective of the backend being used. This commit builds the base to
add that logic.

Similarly, move the logic for deletion of stubs to the generic layer.
The files backend recursively calls the remove function of the
'packed-backend', here skip calling the generic function since that
would try to delete stubs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: extract out `refs_create_refdir_stubs()`

For Git to recognize a directory as a Git directory, it requires the
directory to contain:

  1. 'HEAD' file
  2. 'objects/' directory
  3. 'refs/' directory

Here, #1 and #3 are part of the reference storage mechanism,
specifically the files backend. Since then, newer backends such as the
reftable backend have moved to using their own path ('reftable/') for
storing references. But to ensure Git still recognizes the directory as
a Git directory, we create stubs.

There are two locations where we create stubs:

- In 'refs/reftable-backend.c' when creating the reftable backend.
- In 'clone.c' before spawning transport helpers.

In a following commit, we'll add another instance. So instead of
repeating the code, let's extract out this code to
`refs_create_refdir_stubs()` and use it.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

setup: don't modify repo in `create_reference_database()`

The `create_reference_database()` function is used to create the
reference database during initialization of a repository. The function
calls `repo_set_ref_storage_format()` to set the repositories reference
format. This is an unexpected side-effect of the function. More so
because the function is only called in two locations:

  1. During git-init(1) where the value is propagated from the `struct
     repository_format repo_fmt` value.

  2. During git-clone(1) where the value is propagated from the
     `the_repository` value.

The former is valid, however the flow already calls
`repo_set_ref_storage_format()`, so this effort is simply duplicated.
The latter sets the existing value in `the_repository` back to itself.
While this is okay for now, introduction of more fields in
`repo_set_ref_storage_format()` would cause issues, especially
dynamically allocated strings, where we would free/allocate the same
string back into `the_repostiory`.

To avoid all this confusion, clean up the function to no longer take in
and set the repo's reference storage format.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: fix wrong evaluation order in URL trailing-slash trimming

if i == -1, url[i] will be UB.

Signed-off-by: cuiweixie <cuiweixie@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: enable reachability bitmaps during MIDX compaction

Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.

Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.

As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.

In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx: implement MIDX compaction

When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.

While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:

- Preserving the MIDX chain is impossible, since there is no way to
   write a MIDX layer that contains objects or packs found in an earlier
   MIDX layer already part of the chain. So callers would have to write
   an entirely new (non-incremental) MIDX containing only the compacted
   layers, discarding all other objects/packs from the MIDX.

- There is (currently) no way to write a MIDX layer outside of the MIDX
   chain to work around the above, such that the MIDX chain could be
   reassembled substituting the compacted layers with the MIDX that was
   written.

- The `--stdin-packs` command-line option does not allow us to specify
   the order of packs as they appear in the MIDX. Therefore, even if
   there were workarounds for the previous two challenges, any bitmaps
   belonging to layers which come after the compacted layer(s) would no
   longer be valid.

This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.

Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:

- Instead of calling `fill_packs_from_midx()`, we call a new function
   `fill_packs_from_midx_range()`, which walks backwards along the
   portion of the MIDX chain which we are compacting, and adds packs one
   layer a time.

   In order to preserve the pseudo-pack order, the concatenated pack
   order is preserved, with the exception of preferred packs which are
   always added first.

- After adding entries from the set of packs in the compaction range,
   `compute_sorted_entries()` must adjust the `pack_int_id`'s for all
   objects added in each fanout layer to match their original
   `pack_int_id`'s (as opposed to the index at which each pack appears
   in `ctx.info`).

   Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
   here, as it unconditionally recurs through the `->base_midx`. Factor
   out a `_1()` variant that operates on a single layer, reimplement
   the existing function in terms of it, and use the new variant from
   `midx_fanout_add_compact()`.

   Since we are sorting the list of objects ourselves, the order we add
   them in does not matter.

- When writing out the new 'multi-pack-index-chain' file, discard any
   layers in the compaction range, replacing them with the newly written
   layer, instead of keeping them and placing the new layer at the end
   of the chain.

This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.

The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved despite
whether or not they are sorted in lexicographic order in the original
MIDX chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/helper/test-read-midx.c: plug memory leak when selecting layer

Though our 'read-midx' test tool is capable of printing information
about a single MIDX layer identified by its checksum, no caller in our
test suite exercises this path.

Unfortunately, there is a memory leak lurking in this (currently) unused
path that would otherwise be exposed by the following commit.

This occurs when providing a MIDX layer checksum other than the tip. As
we walk over the MIDX chain trying to find the matching layer, we drop
our reference to the top-most MIDX layer. Thus, our call to
'close_midx()' later on leaks memory between the top-most MIDX layer and
the MIDX layer immediately following the specified one.

Plug this leak by holding a reference to the tip of the MIDX chain, and
ensure that we call `close_midx()` before terminating the test tool.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx-write.c: factor fanout layering from `compute_sorted_entries()`

When computing the set of objects to appear in a MIDX, we use
compute_sorted_entries(), which handles objects from various existing
sources one fanout layer at a time.

The process for computing this set is slightly different during MIDX
compaction, so factor out the existing functionality into its own
routine to prevent `compute_sorted_entries()` from becoming too
difficult to read.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>