git.ipfire.org Git - thirdparty/git.git/log

doc: mention that proxies must be completely transparent

We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry. We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitfaq: add entry about syncing working trees

Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes.  Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.

The technique that many users are using is their preferred cloud syncing
service, which is a bad idea.  Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly.  They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.

We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools.  Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky.  Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that.  While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitfaq: give advice on using eol attribute in gitattributes

In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute. As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.

Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.

In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between. The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitfaq: add documentation on proxies

Many corporate environments and local systems have proxies in use. Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git. Explicitly call out certain classes that are known to routinely
have problems reported various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.

[0] https://issues.chromium.org/issues/40285192
[1] https://faculty.cc.gatech.edu/~mbailey/publications/ndss17_interception.pdf

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: unify bash calling convention

Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion). As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.

So let's make sure we start these scripts with either one of these:

#!/bin/sh
#!/usr/bin/env bash

Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

The ninteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ds/sparse-lstat-caching'

The code to deal with modified paths that are out-of-cone in a
sparsely checked out working tree has been optimized.

* ds/sparse-lstat-caching:
  sparse-index: improve lstat caching of sparse paths
  sparse-index: count lstat() calls
  sparse-index: use strbuf in path_found()
  sparse-index: refactor path_found()
  sparse-checkout: refactor skip worktree retry logic

Merge branch 'xx/bundie-uri-fixes'

When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs

Merge branch 'ps/leakfixes-more'

More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...

Merge branch 'tb/path-filter-fix'

The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.

* tb/path-filter-fix:
  bloom: introduce `deinit_bloom_filters()`
  commit-graph: reuse existing Bloom filters where possible
  object.h: fix mis-aligned flag bits table
  commit-graph: new Bloom filter version that fixes murmur3
  commit-graph: unconditionally load Bloom filters
  bloom: prepare to discard incompatible Bloom filters
  bloom: annotate filters with hash version
  repo-settings: introduce commitgraph.changedPathsVersion
  t4216: test changed path filters with high bit paths
  t/helper/test-read-graph: implement `bloom-filters` mode
  bloom.h: make `load_bloom_filter_from_graph()` public
  t/helper/test-read-graph.c: extract `dump_graph_info()`
  gitformat-commit-graph: describe version 2 of BDAT
  commit-graph: ensure Bloom filters are read with consistent settings
  revision.c: consult Bloom filters for root commits
  t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`

Merge branch 'db/date-underflow-fix'

date parser updates to be more careful about underflowing epoch
based timestamp.

* db/date-underflow-fix:
date: detect underflow/overflow when parsing dates with timezone offset
t0006: simplify prerequisites

Merge branch 'rj/pager-die-upon-exec-failure'

When GIT_PAGER failed to spawn, depending on the code path taken,
we failed immediately (correct) or just spew the payload to the
standard output (incorrect). The code now always fail immediately
when GIT_PAGER fails.

* rj/pager-die-upon-exec-failure:
pager: die when paging to non-existing command

Merge branch 'ss/doc-eol-attr-fix'

Doc update.

* ss/doc-eol-attr-fix:
doc: fix case error of eol attribute in example

Merge branch 'jc/archive-prefix-with-add-virtual-file'

"git archive --add-virtual-file=<path>:<contents>" never paid
attention to the --prefix=<prefix> option but the documentation
said it would. The documentation has been corrected.

* jc/archive-prefix-with-add-virtual-file:
archive: document that --add-virtual-file takes full path

advice: warn when sparse index expands

Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.

When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:

1. Users may not have a full understanding of which files are inside or
    outside of their sparse-checkout. This is more common in monorepos
    that manage the sparse-checkout using custom tools that map build
    dependencies into sparse-checkout definitions.

2. In some cases, an empty directory could exist outside the
    sparse-checkout and these empty directories are not reported by 'git
    status' and friends.

3. If the user has '.gitignore' or 'exclude' files, then 'git status'
    will squelch the warnings and not demonstrate any problems.

In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.

The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.

The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix the max number of branches shown by "show-branch"

The number to be displayed is calculated by the following defined in
object.h:

#define REV_SHIFT 2
#define MAX_REVS (FLAG_BITS - REV_SHIFT)

FLAG_BITS is currently 28, so 26 is the correct number.

Signed-off-by: Rikita Ishikawa <lagrange.resolvent@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitweb: rss/atom change published/updated date to committer date

The author date is used for published/updated date in the rss/atom
feed stream. Change it to the committer date that reflects the
"published/updated" definition better and makes rss/atom feeds more
linear. Gitlab/Github rss/atom feeds use the committer date.

Additionally, to be consistent, also use the committer date to
determine the date of the last commit to send in the feed
instead of the author date.

Signed-off-by: Jesús Ariel Cabello Mateos <080ariel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge https://github.com/j6t/git-gui

* https://github.com/j6t/git-gui:
  git-gui: fix inability to quit after closing another instance
  git-gui: sv.po: Update Swedish translation (576t0f0u)
  git-gui: note the new maintainer
  Makefile(s): do not enforce "all indents must be done with tab"
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
  git-gui: po: fix typo in French "aperçu"

Merge branch 'os/catch-rename'

The problem can be reproduced on Linux with this sequence:

1. Run git gui from a terminal.
2. Edit the commit message and wait for at least 2 seconds.
3. Terminate the instance from the terminal, for example with Ctrl-C,
to simulate crash. This leaves the file .git/GITGUI_BCK behind.
4. Start two instances of git gui &.

At this point the first instance can be closed (it renames
.git/GITGUI_BCK to .git/GITGUI_MSG), but the seconds brings an error
message about the absent file and cannot be closed thereafter and must
be killed from the command line.

The renaming that happens by the first instance is the correct action
and need not be repeated by the second instance. It is the correct
action to ignore the failed renaming.

On the other hand, the second instance could just edit the commit
message again, wait 2 seconds to write GITGUI_BCK, and then can be
closed without failing. At this point, since the user has edited the
message, it is again correct to preserve the edited version in
GITGUI_MSG.

* os/catch-rename:
git-gui: fix inability to quit after closing another instance

clang-format: include kh_foreach* macros in ForEachMacros

The command for generating the list of ForEachMacros searches for
macros whose name contains the string "for_each".  Include those whose
name contains "foreach" as well.  That brings in kh_foreach and
kh_foreach_value from khash.h.

Regenerating the list also brings in hashmap-based macros added by
87571c3f71 (hashmap: use *_entry APIs for iteration, 2019-10-06),
f0e63c4113 (hashmap: use *_entry APIs to wrap container_of, 2019-10-06),
4fa1d501f7 (strmap: add functions facilitating use as a string->int map,
2020-11-05), b70c82e6ed (strmap: add more utility functions,
2020-11-05), and 1201eb628a (strmap: add a strset sub-type, 2020-11-06).

for_each_abbrev is no longer found because its definition was removed by
d850b7a545 (cocci: apply the "cache.h" part of "the_repository.pending",
2023-03-28).  Note that it had been a false positive, though, as it had
been a function wrapper, not a for-like macro.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.dev: fix typo when enabling -Wpedantic

In ebd2e4a13a (Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format
better, 2021-09-28), we tightened our Makefile's behavior to only enable
-Wpedantic when compiling with either gcc5/clang4 or greater as older
compiler versions did not have support for -Wpedantic.

Commit ebd2e4a13a was looking for either "gcc5" or "clang4" to appear in
the COMPILER_FEATURES variable, combining the two "$(filter ...)"
searches with an "$(or ...)".

But ebd2e4a13a has a typo where instead of writing:

    ifneq ($(or ($filter ...),$(filter ...)),)

we wrote:

    ifneq (($or ($filter ...),$(filter ...)),)

Causing our Makefile (when invoked with DEVELOPER=1, and a sufficiently
recent compiler version) to barf:

    $ make DEVELOPER=1
    config.mak.dev:13: extraneous text after 'ifneq' directive
    [...]

Correctly combine the results of the two "$(filter ...)" operations by
using "$(or ...)", not "$or".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-strvec: use test_msg()

check_strvec_loc() checks each strvec item by looping through them and
comparing them with expected values.  If a check fails then we'd like
to know which item is affected.  It reports that information by building
a strbuf and delivering its contents using a failing assertion, e.g.
if there are fewer items in the strvec than expected:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1
   # check "strvec index 1" failed at t/unit-tests/t-strvec.c:71

Note that the index variable is "nr" and thus the interesting value is
reported twice in that example (in lines three and four).

Stop printing the index explicitly for checks that already report it.
The message for the same condition as above becomes:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1

For the string comparison, whose error message doesn't include the
index, report it using the simpler and more appropriate test_msg()
instead.  Report the index using its actual variable name and format the
line like the preceding ones.  The message for an unexpected string
value becomes:

   # check "!strcmp(vec->v[nr], str)" failed at t/unit-tests/t-strvec.c:24
   #    left: "foo"
   #   right: "bar"
   #      nr: 0

Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-ort: fix missing early return

One of the conversions in f19b9165 (merge-ort: convert more error()
cases to path_msg(), 2024-06-19) accidentally lost the early return.

Restore it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c

helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.

Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push: avoid showing false negotiation errors

When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.

In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiate-tip" parameters
that stops the negotiate-only fetch from being run, which by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.

Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with. One fewer process spawned, one
fewer "alarming" message given the user.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

checkout: special case error messages during noop switching

"git checkout" ran with no branch and no pathspec behaves like
switching the branch to the current branch (in other words, a
no-op, except that it gives a side-effect "here are the modified
paths" report).  But unlike "git checkout HEAD" or "git checkout
main" (when you are on the 'main' branch), the user is much less
conscious that they are "switching" to the current branch.

This twists end-user expectation in a strange way.  There are
options (like "--ours") that make sense only when we are checking
out paths out of either the tree-ish or out of the index.  So the
error message the command below gives

    $ git checkout --ours
    fatal: '--ours/theirs' cannot be used with switching branches

is technically correct, but because the end-user may not even be
aware of the fact that the command they are issuing is about no-op
branch switching [*], they may find the error confusing.

Let's refactor the code to make it easier to special case the "no-op
branch switching" situation, and then customize the exact error
message for "--ours/--theirs".  Since it is more likely that the
end-user forgot to give pathspec that is required by the option,
let's make it say

    $ git checkout --ours
    fatal: '--ours/theirs' needs the paths to check out

instead.

Among the other options that are incompatible with branch switching,
there may be some that benefit by having messages tweaked when a
no-op branch switching is done, but I'll leave them as #leftoverbits
material.

[Footnote]

* Yes, the end-users are irrational.  When they did not give
   "--ours", they take it granted that "git checkout" gives a short
   status, e.g..

    $ git checkout
    M builtin/checkout.c
    M t/t7201-co.sh

   exactly as a branch switching command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with 'maint'

The eighteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'rs/diff-color-moved-w-no-ext-diff-fix'

"git diff --no-ext-diff" when diff.external is configured ignored
the "--color-moved" option.

* rs/diff-color-moved-w-no-ext-diff-fix:
diff: allow --color-moved with --no-ext-diff

Merge branch 'ew/object-convert-leakfix'

Leakfix.

* ew/object-convert-leakfix:
object-file: fix leak on conversion failure

Merge branch 'jk/remote-wo-url'

Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url

Merge branch 'jc/fuzz-sans-curl'

CI job to build minimum fuzzers learned to pass NO_CURL=NoThanks to
the build procedure, as its build environment does not offer, or
the rest of the build needs, anything cURL.

* jc/fuzz-sans-curl:
fuzz: minimum fuzzers environment lacks libcURL

Merge branch 'rb/build-options-w-lib-versions'

"git version --build-options" reports the version information of
OpenSSL and other libraries (if used) in the build.

* rb/build-options-w-lib-versions:
  version: teach --build-options to reports zlib version information
  version: teach --build-options to reports libcurl version information
  version: --build-options reports OpenSSL version information

Merge branch 'ps/use-the-repository'

A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions

Merge branch 'ew/cat-file-unbuffered-tests'

The output from "git cat-file --batch-check" and "--batch-command
(info)" should not be unbuffered, for which some tests have been
added.

* ew/cat-file-unbuffered-tests:
t1006: ensure cat-file info isn't buffered by default
Git.pm: use array in command_bidi_pipe example

Yet another batch of post 2.45.2 updates from the 'master' front

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'rs/remove-unused-find-header-mem' into maint-2.45

Code clean-up.

* rs/remove-unused-find-header-mem:
commit: remove find_header_mem()

Merge branch 'jc/worktree-git-path' into maint-2.45

Code cleanup.

* jc/worktree-git-path:
worktree_git_path(): move the declaration to path.h

Merge branch 'jk/fetch-pack-fsck-wo-lock-pack' into maint-2.45

"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.

* jk/fetch-pack-fsck-wo-lock-pack:
fetch-pack: fix segfault when fscking without --lock-pack

Merge branch 'jk/t5500-typofix' into maint-2.45

A helper function shared between two tests had a copy-paste bug,
which has been corrected.

* jk/t5500-typofix:
t5500: fix mistaken $SERVER reference in helper function

Merge branch 'js/mingw-remove-unused-extern-decl' into maint-2.45

An unused extern declaration for mingw has been removed to prevent
it from causing build failure.

* js/mingw-remove-unused-extern-decl:
mingw: drop bogus (and unneeded) declaration of `_pgmptr`

Merge branch 'jc/no-default-attr-tree-in-bare' into maint-2.45

Earlier we stopped using the tree of HEAD as the default source of
attributes in a bare repository, but failed to document it. This
has been corrected.

* jc/no-default-attr-tree-in-bare:
attr.tree: HEAD:.gitattributes is no longer the default in a bare repo

Merge branch 'tb/precompose-getcwd' into maint-2.45

We forgot to normalize the result of getcwd() to NFC on macOS where
all other paths are normalized, which has been corrected. This still
does not address the case where core.precomposeUnicode configuration
is not defined globally.

* tb/precompose-getcwd:
macOS: ls-files path fails if path of workdir is NFD

Merge branch 'pw/rebase-i-error-message' into maint-2.45

When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant.  Such an error is now
caught earlier in the process that parses the todo list.

* pw/rebase-i-error-message:
  rebase -i: improve error message when picking merge
  rebase -i: pass struct replay_opts to parse_insn_line()

Merge branch 'ds/format-patch-rfc-and-k' into maint-2.45

The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
format-patch: ensure that --rfc and -k are mutually exclusive

t-reftable-record: add tests for reftable_log_record_compare_key()

reftable_log_record_compare_key() is a function defined by
reftable/record.{c, h} and is used to compare the keys of two
log records when sorting multiple log records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add tests for reftable_ref_record_compare_name()

reftable_ref_record_compare_name() is a function defined by
reftable/record.{c, h} and is used to compare the refname of two
ref records when sorting multiple ref records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add index tests for reftable_record_is_deletion()

reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for index records.

Add tests for this function in the case of index records.
Note that since index records cannot be of type deletion, this function
must always return '0' when called on an index record.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add obj tests for reftable_record_is_deletion()

reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for two of the four record types (obj, index).

Add tests for this function in the case of obj records.
Note that since obj records cannot be of type deletion, this function
must always return '0' when called on an obj record.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add log tests for reftable_record_is_deletion()

reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for three of the four record types (log, obj, index).

Add tests for this function in the case of log records.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add ref tests for reftable_record_is_deletion()

reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for all the four record types (ref, log, obj, index).

Add tests for this function in the case of ref records.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add comparison tests for obj records

In the current testing setup for obj records, the comparison
functions for obj records, reftable_obj_record_cmp_void() and
reftable_obj_record_equal_void() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp_void() and reftable_index_record_equal_void()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add comparison tests for index records

In the current testing setup for index records, the comparison
functions for index records, reftable_index_record_cmp() and
reftable_index_record_equal() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp() and reftable_index_record_equal()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add comparison tests for ref records

In the current testing setup for ref records, the comparison
functions for ref records, reftable_ref_record_cmp_void() and
reftable_ref_record_equal() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_ref_record_cmp_void() and reftable_ref_record_equal()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-record: add reftable_record_cmp() tests for log records

In the current testing setup for log records, only
reftable_log_record_equal() among log record's comparison functions
is tested.

Modify the existing tests to exercise reftable_log_record_cmp_void()
(using the wrapper function reftable_record_cmp()) alongside
reftable_log_record_equal().
Note that to achieve this, we'll need to replace instances of
reftable_log_record_equal() with the wrapper function
reftable_record_equal().

Rename the now modified test to reflect its nature of exercising
all comparison operations, not just equality.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: move reftable/record_test.c to the unit testing framework

reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.

While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.

Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.

Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0612: mark as leak-free

A quick test tells us that t0612 does not trigger any leak:

    $ make SANITIZE=leak test GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true GIT_TEST_OPTS=-i T=t0612-reftable-jgit-compatibility.sh
    [...]
    *** t0612-reftable-jgit-compatibility.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - CGit repository can be read by JGit
    ok 2 - JGit repository can be read by CGit
    ok 3 - mixed writes from JGit and CGit
    ok 4 - JGit can read multi-level index
    # passed all 4 test(s)
    1..4
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0612-reftable-jgit-compatibility.sh] Error 1

Let's mark it as leak-free to silence the machinery activated by
`GIT_TEST_PASSING_SANITIZE_LEAK=check`.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG

When a test that leaks runs with GIT_TEST_SANITIZE_LEAK_LOG=true,
the test returns zero, which is not what we want.

In the if-else's chain we have in "check_test_results_san_file_", we
consider three variables: $passes_sanitize_leak, $sanitize_leak_check
and, implicitly, GIT_TEST_SANITIZE_LEAK_LOG (always set to "true" at
that point).

For the first two variables we have different considerations depending
on the value of $test_failure, which makes sense.  However, for the
third, GIT_TEST_SANITIZE_LEAK_LOG, we don't;  regardless of
$test_failure, we use "invert_exit_code=t" to produce a non-zero
return value.

That assumes "$test_failure" is always zero at that point.  But it
may not be:

   $ git checkout v2.40.1
   $ make test SANITIZE=leak T=t3200-branch.sh # this fails
   $ make test SANITIZE=leak GIT_TEST_SANITIZE_LEAK_LOG=true T=t3200-branch.sh # this succeeds
   [...]
   With GIT_TEST_SANITIZE_LEAK_LOG=true, our logs revealed a memory leak, exiting with a non-zero status!
   # faked up failures as TODO & now exiting with 0 due to --invert-exit-code

We need to use "invert_exit_code=t" only when "$test_failure" is zero.

Let's add the missing conditions in the if-else's chain to make it work
as expected.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0613: mark as leak-free

We can mark t0613 as leak-free:

    $ make test SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true T=t0613-reftable-write-options.sh
    [...]
    *** t0613-reftable-write-options.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - default write options
    ok 2 - disabled reflog writes no log blocks
    ok 3 - many refs results in multiple blocks
    ok 4 - tiny block size leads to error
    ok 5 - small block size leads to multiple ref blocks
    ok 6 - small block size fails with large reflog message
    ok 7 - block size exceeding maximum supported size
    ok 8 - restart interval at every single record
    ok 9 - restart interval exceeding maximum supported interval
    ok 10 - object index gets written by default with ref index
    ok 11 - object index can be disabled
    # passed all 11 test(s)
    1..11
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0613-reftable-write-options.sh] Error 1

Do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"

The pathspec syntax is explained in the file "glossary-content.txt".
Moreover, no file named "glossary-context.txt" exists in the repository.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule--helper: use strvec_pushf() for --super-prefix

Use the strvec_pushf() call that already appends a slash to also produce
the stuck form of the option --super-prefix instead of adding the option
name in a separate call of strvec_push() or strvec_pushl(). This way we
can more easily see that these parts make up a single option with its
argument and save a function call.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-send-email: use sanitized address when reading mbox body

Addresses that are mentioned on the trailers in the commit log
messages (e.g., "Reviewed-by") are added to the "Cc:" list by "git
send-email".  These hand-written addresses, however, may be
malformed (e.g., having unquoted "." and other punctutation marks in
the display-name part) and can upset MTA.

The code does use the sanitize_address() helper on these
address-looking strings to turn them into valid addresses, but it is
used only to see if the address should be suppressed.  The original
string taken from the message is added to the @cc list if the code
decides the address is not suppressed.

Because the addresses on trailer lines are hand-written and more
likely to contain malformed addresses, when adding to the @cc list,
use the result from sanitize_address, not the original.  Note that
we do not modify the behaviour for addresses taken from the e-mail
headers, as they are more likely to be machine generated and
well-formed.

Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-gui: fix inability to quit after closing another instance

If you open 2 git gui instances in the same directory, then close one
of them and try to close the other, an error message pops up, saying:
'error renaming ".git/GITGUI_BCK": no such file or directory', and it
is no longer possible to close the window ever.

Fix by catching this error, and proceeding even if the file no longer
exists.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>

Sync with 'maint'

More post 2.45.2 updates from the 'master' front

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ds/ahead-behind-fix' into maint-2.45

Fix for a progress bar.

* ds/ahead-behind-fix:
commit-graph: increment progress indicator

Merge branch 'ds/doc-add-interactive-singlekey' into maint-2.45

Doc update.

* ds/doc-add-interactive-singlekey:
doc: interactive.singleKey is disabled by default

Merge branch 'jc/varargs-attributes' into maint-2.45

Varargs functions that are unannotated as printf-like or execl-like
have been annotated as such.

* jc/varargs-attributes:
  __attribute__: add a few missing format attributes
  __attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
  __attribute__: remove redundant attribute declaration for git_die_config()
  __attribute__: trace2_region_enter_printf() is like "printf"

Merge branch 'ps/ci-fix-detection-of-ubuntu-20' into maint-2.45

Fix for an embarrassing typo that prevented Python2 tests from running
anywhere.

* ps/ci-fix-detection-of-ubuntu-20:
ci: fix check for Ubuntu 20.04

Merge branch 'jk/cap-exclude-file-size' into maint-2.45

An overly large ".gitignore" files are now rejected silently.

* jk/cap-exclude-file-size:
dir.c: reduce max pattern file size to 100MB
dir.c: skip .gitignore, etc larger than INT_MAX

Merge branch 'jc/safe-directory-leading-path' into maint-2.45

The safe.directory configuration knob has been updated to
optionally allow leading path matches.

* jc/safe-directory-leading-path:
safe.directory: allow "lead/ing/path/*" match

Merge branch 'rs/difftool-env-simplify' into maint-2.45

Code simplification.

* rs/difftool-env-simplify:
difftool: add env vars directly in run_file_diff()

Merge branch 'ps/fix-reinit-includeif-onbranch' into maint-2.45

"git init" in an already created directory, when the user
configuration has includeif.onbranch, started to fail recently,
which has been corrected.

* ps/fix-reinit-includeif-onbranch:
setup: fix bug with "includeIf.onbranch" when initializing dir

Merge branch 'es/chainlint-ncores-fix' into maint-2.45

The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs.  It now falls
back to 1 CPU to avoid the problem.

* es/chainlint-ncores-fix:
  chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
  chainlint.pl: fix incorrect CPU count on Linux SPARC
  chainlint.pl: make CPU count computation more robust

Merge branch 'jc/rev-parse-fatal-doc' into maint-2.45

Doc update.

* jc/rev-parse-fatal-doc:
rev-parse: document how --is-* options work outside a repository

Merge branch 'jc/doc-diff-name-only' into maint-2.45

The documentation for "git diff --name-only" has been clarified
that it is about showing the names in the post-image tree.

* jc/doc-diff-name-only:
diff: document what --name-only shows

Merge branch 'mt/t0211-typofix' into maint-2.45

Test fix.

* mt/t0211-typofix:
t/t0211-trace2-perf.sh: fix typo patern -> pattern

Merge branch 'dg/fetch-pack-code-cleanup' into maint-2.45

Code clean-up to remove an unused struct definition.

* dg/fetch-pack-code-cleanup:
fetch-pack: remove unused 'struct loose_object_iter'

Merge branch 'dm/update-index-doc-fix' into maint-2.45

Doc fix.

* dm/update-index-doc-fix:
documentation: git-update-index: add --show-index-version to synopsis

Merge branch 'ds/scalar-reconfigure-all-fix' into maint-2.45

Scalar fix.

* ds/scalar-reconfigure-all-fix:
scalar: avoid segfault in reconfigure --all

Merge branch 'vd/doc-merge-tree-x-option' into maint-2.45

Doc update.

* vd/doc-merge-tree-x-option:
Documentation/git-merge-tree.txt: document -X

Merge branch 'fa/p4-error' into maint-2.45

P4 update.

* fa/p4-error:
git-p4: show Perforce error to the user

Merge branch 'tb/attr-limits' into maint-2.45

The maximum size of attribute files is enforced more consistently.

* tb/attr-limits:
attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()

Merge branch 'rs/diff-parseopts-cleanup' into maint-2.45

Code clean-up to remove code that is now a noop.

* rs/diff-parseopts-cleanup:
diff-lib: stop calling diff_setup_done() in do_diff_cache()

Merge branch 'dk/zsh-git-repo-path-fix' into maint-2.45

Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.

* dk/zsh-git-repo-path-fix:
completion: zsh: stop leaking local cache variable

Merge branch 'bc/zsh-compatibility' into maint-2.45

zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.

* bc/zsh-compatibility:
vimdiff: make script and tests work with zsh
t4046: avoid continue in &&-chain for zsh

Merge branch 'js/for-each-repo-keep-going' into maint-2.45

A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error

Merge branch 'aj/stash-staged-fix' into maint-2.45

"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
stash: fix "--staged" with binary files

Merge branch 'xx/disable-replace-when-building-midx' into maint-2.45

The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
midx: disable replace objects

Merge branch 'pw/rebase-m-signoff-fix' into maint-2.45

"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.

* pw/rebase-m-signoff-fix:
  rebase -m: fix --signoff with conflicts
  sequencer: store commit message in private context
  sequencer: move current fixups to private context
  sequencer: start removing private fields from public API
  sequencer: always free "struct replay_opts"

sparse-index: improve lstat caching of sparse paths

The clear_skip_worktree_from_present_files() method was first introduced
in af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files
present in worktree, 2022-01-14) to allow better interaction with the
working directory in the presence of paths outside of the
sparse-checkout. The initial implementation would lstat() every single
SKIP_WORKTREE path to see if it existed; if it ran across a sparse
directory that existed (when a sparse index was in use), then it would
expand the index and then check every SKIP_WORKTREE path.

Since these lstat() calls were very expensive, this was improved in
d79d299352 (Accelerate clear_skip_worktree_from_present_files() by
caching, 2022-01-14) by caching directories that do not exist so it
could avoid lstat()ing any files under such directories. However, there
are some inefficiencies in that caching mechanism.

The caching mechanism stored only the parent directory as not existing,
even if a higher parent directory also does not exist. This means that
wasted lstat() calls would occur when the paths passed to path_found()
change immediate parent directories but within the same parent directory
that does not exist.

To create an example repository that demonstrates this problem, it helps
to have a directory outside of the sparse-checkout that contains many
deep paths. In particular, the first paths (in lexicographic order)
underneath the sparse directory should have deep directory structures,
maximizing the difference between the old caching algorithm that looks
to a single parent and the new caching algorithm that looks to the
top-most missing directory.

The performance test script p2000-sparse-operations.sh takes the sample
repository and copies its HEAD to several copies nested in directories
of the form f<i>/f<j>/f<k> where i, j, and k are numbers from 1 to 4.
The sparse-checkout cone is then selected as "f2/f4/". Creating "f1/f1/"
will trigger the behavior and also lead to some interesting cases for
the caching algorithm since "f1/f1/" exists but "f1/f2/" and "f3/" do
not.

This is difficult to notice when running performance tests using the Git
repository (or a blow-up of the Git repository, as in
p2000-sparse-operations.sh) because Git has a very shallow directory
structure.

This change reorganizes the caching algorithm to focus on storing the
highest level leading directory that does not exist; specifically this
means that that directory's parent _does_ exist. By doing a little extra
work on a path passed to path_found(), we can short-circuit all of the
paths passed to path_found() afterwards that match a prefix with that
non-existing directory. When in a repository where the first sparse file
is likely to have a much deeper path than the first non-existing
directory, this can realize significant gains.

The details of this algorithm require careful attention, so the new
implementation of path_found() has detailed comments, including the use
of a new max_common_dir_prefix() method that may be of independent
interest.

It's worth noting that this is not universally positive, since we are
doing extra lstat() calls to establish the exact path to cache. In the
blow-up of the Git repository, we can see that the lstat count
_increases_ from 28 to 31. However, these numbers were already
artificially low.

Contributor Elijah Newren created a publicly-available test repository
that demonstrates the difference in these caching algorithms in the most
extreme way. To test, follow these steps:

  git clone --sparse https://github.com/newren/gvfs-like-git-bomb
  cd gvfs-like-git-bomb
  ./runme.sh                   # NOTE: check scripts before running!

At this point, assuming you do not have index.sparse=true set globally,
the index has one million paths with the SKIP_WORKTREE bit and they will
all be sent to path_found() in the sparse loop. You can measure this by
running 'git status' with GIT_TRACE2_PERF=1:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   200,000
   sparse_lstat_count (after):         2

And here are the performance numbers:

  Benchmark 1: old
    Time (mean ± σ):     397.5 ms ±   4.1 ms
    Range (min … max):   391.2 ms … 404.8 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     252.7 ms ±   3.1 ms
    Range (min … max):   249.4 ms … 259.5 ms    11 runs

  Summary
    'new' ran
      1.57 ± 0.02 times faster than 'old'

By modifying this example further, we can demonstrate a more realistic
example and include the sparse index expansion. Continue by creating
this directory, confusing both caching algorithms somewhat:

  mkdir -p bomb/d/e/f/a/a

Then re-run the 'git status' tests to see these statistics:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   724,010
   sparse_lstat_count (after):       106

  Benchmark 1: old
    Time (mean ± σ):     753.0 ms ±   3.5 ms
    Range (min … max):   749.7 ms … 760.9 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     201.4 ms ±   3.2 ms
    Range (min … max):   196.0 ms … 207.9 ms    14 runs

  Summary
    'new' ran
      3.74 ± 0.06 times faster than 'old'

Note that if this repository had a sparse index enabled, the additional
cost of expanding the sparse index affects the total time of these
commands by over four seconds, significantly diminishing the benefit of
the caching algorithm. Having existing paths outside of the
sparse-checkout is a known performance issue for the sparse index and is
a known trade-off for the performance benefits given when no such paths
exist.

Using an internal monorepo with over two million paths at HEAD and a
typical sparse-checkout cone such that the sparse index contains
~190,000 entries (including over two thousand sparse trees), I was able
to measure these lstat counts when one sparse directory actually exists
on disk:

  Sparse files in expanded index: 1,841,997
       full_lstat_count (before): 1,188,161
       full_lstat_count  (after):     4,404

This resulted in this absolute time change, on a warm disk:

      Time in full loop (before): 13.481 s
      Time in full loop  (after):  0.081 s

(These times were calculated on a Windows machine, where lstat() is
slower than a similar Linux machine.)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-index: count lstat() calls

The clear_skip_worktree.. methods already report some statistics about
how many cache entries are checked against path_found() due to having
the skip-worktree bit set. However, due to path_found() performing some
caching, this isn't the only information that would be helpful to
report.

Add a new lstat_count member to the path_found_data struct to count the
number of times path_found() calls lstat(). This will be helpful to help
explain performance problems in this method as well as to demonstrate
future changes to the caching algorithm in a more concrete way than
end-to-end timings.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-index: use strbuf in path_found()

The path_found() method previously reused strings from the cache entries
the calling methods were using. This prevents string manipulation in
place and causes some odd reallocation before the final lstat() call in
the method.

Refactor the method to use strbufs and copy the path into the strbuf,
but also only the parent directory and not the whole path. This looks
like extra copying when assigning the path to the strbuf, but we save an
allocation by dropping the 'tmp' string, and we are "reusing" the copy
from 'tmp' to put the data in the strbuf.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-index: refactor path_found()

In advance of changing the behavior of path_found(), take all of the
intermediate data values and group them into a single struct. This
simplifies the method prototype as well as the initialization. Future
changes can be made directly to the struct and method without changing
the callers with this approach.

Note that the clear_path_found_data() method is currently empty, as
there is nothing to free. This method is a placeholder for future
changes that require a non-trivial implementation. Its stub is created
now so consumers could call it now and not change in future changes.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-checkout: refactor skip worktree retry logic

The clear_skip_worktree_from_present_files() method was introduced in
af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14) to help cases where sparse-checkout is enabled
but some paths outside of the sparse-checkout also exist on disk.  This
operation can be slow as it needs to check path existence in a way not
stored in the index, so caching was introduced in d79d299352 (Accelerate
clear_skip_worktree_from_present_files() by caching, 2022-01-14).

This check is particularly confusing in the presence of a sparse index,
as a sparse tree entry corresponding to an existing directory must first
be expanded to a full index before examining the paths within. This is
currently implemented using a 'goto' and a boolean variable to ensure we
restart only once.

Even with that caching, it was noticed that this could take a long time
to execute. 89aaab11a3 (index: add trace2 region for clear skip
worktree, 2022-11-03) introduced trace2 regions to measure this time.
Further, the way the loop repeats itself was slightly confusing and
prone to breakage, so a BUG() statement was added in 8c7abdc596 (index:
raise a bug if the index is materialised more than once, 2022-11-03) to
be sure that the second run of the loop does not hit any sparse trees.

One thing that can be confusing about the current setup is that the
trace2 regions nest and it is not clear that a second loop is running
after a sparse index is expanded. Here is an example of what the regions
look like in a typical case:

| region_enter | ... | label:clear_skip_worktree_from_present_files
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_enter | ... | ..label:ensure_full_index
| region_enter | ... | ....label:update
| region_leave | ... | ....label:update
| region_leave | ... | ..label:ensure_full_index
| data         | ... | ..sparse_path_count:1
| data         | ... | ..sparse_path_count_full:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files

One thing that is particularly difficult to understand about these
regions is that most of the time is spent between the close of the
ensure_full_index region and the reporting of the end data. This is
because of the restart of the loop being within the same region as the
first iteration of the loop.

This change refactors the method into two separate methods that are
traced separately. This will be more important later when we change
other features of the methods, but for now the only functional change is
the difference in the structure of the trace regions.

After this change, the same telemetry section is split into three
distinct chunks:

| region_enter | ... | label:clear_skip_worktree_from_present_files_sparse
| data         | ... | ..sparse_path_count:1
| region_leave | ... | label:clear_skip_worktree_from_present_files_sparse
| region_enter | ... | label:update
| region_leave | ... | label:update
| region_enter | ... | label:ensure_full_index
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_leave | ... | label:ensure_full_index
| region_enter | ... | label:clear_skip_worktree_from_present_files_full
| data         | ... | ..full_path_count:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files_full

Here, we see the sparse loop terminating early with its first sparse
path being a sparse directory containing a file. Then, that loop's
region terminates before ensure_full_index begins (in this case, the
cache-tree must also be computed). Then, _after_ the index is expanded,
the full loop begins with its own region.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The seventeenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/fetch-pack-fsck-wo-lock-pack'

"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.

* jk/fetch-pack-fsck-wo-lock-pack:
fetch-pack: fix segfault when fscking without --lock-pack

Merge branch 'rs/remove-unused-find-header-mem'

Code clean-up.

* rs/remove-unused-find-header-mem:
commit: remove find_header_mem()

Merge branch 'jk/t5500-typofix'

A helper function shared between two tests had a copy-paste bug,
which has been corrected.

* jk/t5500-typofix:
t5500: fix mistaken $SERVER reference in helper function

Merge branch 'js/mingw-remove-unused-extern-decl'

An unused extern declaration for mingw has been removed to prevent
it from causing build failure.

* js/mingw-remove-unused-extern-decl:
mingw: drop bogus (and unneeded) declaration of `_pgmptr`