git.ipfire.org Git - thirdparty/git.git/log

global: mark usage strings and string tables const

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/difftool: remove unnecessary if statement

Since we already teach the `repo_config()` in "f29f1990b5
(config: teach repo_config to allow `repo` to be NULL, 2025-03-08)"
to allow `repo` to be NULL, no need to check if `repo` is NULL
before calling `repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/add: remove unnecessary if statement

Since we already teach the `repo_config()` in "f29f1990b5
(config: teach repo_config to allow `repo` to be NULL, 2025-03-08)"
to allow `repo` to be NULL, no need to check if `repo` is NULL
before calling `repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

perf: do allow `GIT_PERF_*` to be overridden again

A common way to run Git's performance benchmarks on repositories other
than Git's own repository (which is not exactly large when compared to
actually large repositories) is to run them like this:

GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
./p1234-*.sh -ivx

Contrary to developers' common expectations, this failed to work when
Git was built with a different `GIT_PERF_LARGE_REPO` value specified at
build time: That build-time option would have been written to the
`GIT-BUILD-OPTIONS` file, which in turn would have been sourced by
`test-lib.sh`, which in turn would have been sourced by `perf-lib.sh`,
which in turn would have been sourced by the perf test script,
_overriding_ the environment variable specified in the way illustrated
above.

Since perf tests are not run as part of the build, this most likely
unintended behavior was not caught and certainly not fixed, as the
`GIT_PERF_*` values would have been empty at build-time.

However, in 4638e8806e3a (Makefile: use common template for
GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
introduced: Whereas before, a couple of build-time options (the
`GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
when their values were non-empty. With this commit, they are also
written when they are empty.

The consequence is that above-mentioned way to run the perf tests will
not only fail to pick up the desired `GIT_PERF_*` settings when they
were specified differently while building Git, instead the desired
settings will be only respected when specified _while building_ Git.

Let's work around the original issue, i.e. let `GIT_PERF_*` environment
variables override what is recorded in `GIT-BUILD-OPTIONS`.

Note that this is just the tip of the iceberg, there are a couple of
`GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
to time constraints on my side, this here patch focuses exclusively on
the `GIT_PERF_*` settings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ob/strip-comments-on-commit'

* ob/strip-comments-on-commit:
git-gui: heed core.commentChar/commentString

t9811: fix misconversion of tests

The previous commit started to insist TAG_F1_ONLY to be missing,
which was not in the original. Let's not be overly eager in the
conversion.

Also, the other hunk in the commit introduced a shell syntax error,
causing the test to fail. Fix it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment: fix typo: 'setup_git_directory_gently'

Above the declaration of git_work_tree_cfg, we have:

  /* This is set by setup_git_dir_gently() and/or git_default_config() */
  char *git_work_tree_cfg;

It can be verified that there is no function called
'setup_git_dir_gently' by running grep on the codebase:

  $ grep -R setup_git_dir_gently .
  ./environment.c:/* This is set by setup_git_dir_gently() and/or git_default_config() */

The comment, introduced in e90fdc39b6 (Clean up work-tree handling), is
the only occurrence of the name 'setup_git_dir_gently'.

It probably meant 'setup_git_directory_gently' as that is a name of a
real function in setup.c. Correct it.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: set CSPRNG_METHOD to getrandom on Linux

Commit 05cd988dce ("wrapper: add a helper to generate numbers from a
CSPRNG", 2022-01-17) added a csprng_bytes() function which used one
of several interfaces to provide a source of cryptographically secure
pseudorandom numbers. The CSPRNG_METHOD make variable was provided to
determine the choice of available 'backends' for the source of random
bytes.

Commit 05cd988dce did not set CSPRNG_METHOD in the Linux section of
the config.mak.uname file, so it defaults to using '/dev/urandom' as
the source of random bytes. The 'backend' values which could be used
on Linux are 'arc4random', 'getrandom' or 'getentropy' ('openssl' is
an option, but seems to be discouraged).

The arc4random routines (arc4random_buf() is the one actually used) were
added to glibc in version 2.36, while both getrandom() and getentropy()
were included in 2.25. So, some of the more up-to-date distributions of
Linux (eg Debian 12, Ubuntu 24.04) would be able to use the 'arc4random'
setting. All currently supported distributions have glibc 2.25 or later
(RHEL 8 has v2.28) and, therefore, have support for the 'getrandom' and
'getentropy' settings.

The arc4random routines on the *BSDs (along with cygwin) implement the
ChaCha20 stream cipher algorithm (see RFC8439) in userspace, rather than
as a system call, and are thus somewhat faster (having avoided a context
switch to the kernel). In contrast, on Linux all three functions are
simple wrappers around the same kernel CSPRNG syscall.

If the meson build system is used on a newer platform, then they will be
configured to use 'arc4random', whereas the make build will currently
default to using '/dev/urandom' on Linux. Since there is no advantage,
in terms of performance, to the 'arc4random' setting, the 'getrandom'
setting should be preferred from an availability perspective. (Also, the
current uses of csprng_bytes() are not in any hot path).

In order to set an appropriate default, set the CSPRNG_METHOD build
variable to 'getrandom' in the Linux section of the 'config.mak.uname'
file.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The seventh batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ab/environment-clean-header'

Code clean-up.

* ab/environment-clean-header:
environment.h: remove unused variables

Merge branch 'ps/refname-avail-check-optim'

Incorrect sorting of refs with bytes with high-bit set on platforms
with signed char led to a BUG, which has been corrected.

* ps/refname-avail-check-optim:
refs/packed: fix BUG when seeking refs with UTF-8 characters

Merge branch 'cj/refname-avail-check-optim-typofix'

Comment fix.

* cj/refname-avail-check-optim-typofix:
refs: fix duplicated word in comment

Merge branch 'ua/update-update-server-info'

Code simplification.

* ua/update-update-server-info:
builtin/update-server-info: remove unnecessary if statement

Merge branch 'en/merge-recursive-debug'

Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.

* en/merge-recursive-debug:
  builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
  tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
  merge-recursive.[ch]: thoroughly debug these
  merge, sequencer: switch recursive merges over to ort
  sequencer: switch non-recursive merges over to ort
  merge-ort: enable diff-algorithms other than histogram
  builtin/merge-recursive: switch to using merge_ort_generic()
  checkout: replace merge_trees() with merge_ort_nonrecursive()

Merge branch 'kn/blame-porcelain-unblamable'

"git blame --porcelain" mode now talks about unblamable lines and
lines that are blamed to an ignored commit.

* kn/blame-porcelain-unblamable:
blame: print unblamable and ignored commits in porcelain mode

Merge branch 'jk/fetch-follow-remote-head-fix'

"git fetch [<remote>]" with only the configured fetch refspec
should be the only thing to update refs/remotes/<remote>/HEAD,
but the code was overly eager to do so in other cases.

* jk/fetch-follow-remote-head-fix:
  fetch: make set_head() call easier to read
  fetch: don't ask for remote HEAD if followRemoteHEAD is "never"
  fetch: only respect followRemoteHEAD with configured refspecs

parse-options: detect mismatches in integer signedness

It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a
test that exercises the "--min-batch-size" command line option. The
symptom was that the option didn't seem to have an effect: we didn't
fetch objects with a batch size of 20, but instead fetched all objects
at once.

As it turns out, the root cause is that `--min-batch-size` uses
`OPT_INTEGER()` to parse the command line option. While this macro
expects the caller to pass a pointer to an integer, we instead pass a
pointer to a `size_t`. This coincidentally works on most platforms, but
it breaks apart on the mentioned platforms because they are big endian.

This issue isn't specific to git-backfill(1): there are a couple of
other places where we have the same type confusion going on. This
indicates that the issue really is the interface that the parse-options
subsystem provides -- it is simply too easy to get this wrong as there
isn't any kind of compiler warning, and things just work on the most
common systems.

Address the systemic issue by introducing two new build asserts
`BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names
already hint at, those macros will cause a compiler error when passed a
value that is not signed or unsigned, respectively.

Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to
use those asserts. This uncovers a small set of sites where we indeed
have the same bug as in git-backfill(1). Adapt all of them to use the
correct option.

Reported-by: Todd Zullinger <tmz@pobox.com>
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parse-options: introduce precision handling for `OPTION_UNSIGNED`

This commit is the equivalent to the preceding commit, but instead of
introducing precision handling for `OPTION_INTEGER` we introduce it for
`OPTION_UNSIGNED`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parse-options: introduce precision handling for `OPTION_INTEGER`

The `OPTION_INTEGER` option type accepts a signed integer. The type of
the underlying integer is a simple `int`, which restricts the range of
values accepted by such options. But there is a catch: because the
caller provides a pointer to the value via the `.value` field, which is
a simple void pointer. This has two consequences:

  - There is no check whether the passed value is sufficiently long to
    store the entire range of `int`. This can lead to integer wraparound
    in the best case and out-of-bounds writes in the worst case.

  - Even when a caller knows that they want to store a value larger than
    `INT_MAX` they don't have a way to do so.

In practice this doesn't tend to be a huge issue because users typically
don't end up passing huge values to most commands. But the parsing logic
is demonstrably broken, and it is too easy to get the calling convention
wrong.

Improve the situation by introducing a new `precision` field into the
structure. This field gets assigned automatically by `OPT_INTEGER_F()`
and tracks the size of the passed value. Like this it becomes possible
for the caller to pass arbitrarily-sized integers and the underlying
logic knows to handle it correctly by doing range checks. Furthermore,
convert the code to use `strtoimax()` intstead of `strtol()` so that we
can also parse values larger than `LONG_MAX`.

Note that we do not yet assert signedness of the passed variable, which
is another source of bugs. This will be handled in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`

With the preceding commit, `OPT_INTEGER()` has learned to support unit
factors. Consequently, the major differencen between `OPT_INTEGER()` and
`OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of
them do support them now. Instead, the difference is that one handles
signed and the other handles unsigned integers.

Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to
`OPT_UNSIGNED()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parse-options: support unit factors in `OPT_INTEGER()`

There are two main differences between `OPT_INTEGER()` and
`OPT_MAGNITUDE()`:

  - The former parses signed integers whereas the latter parses unsigned
    integers.

  - The latter parses unit factors like 'k', 'm' or 'g'.

While the first difference makes obvious sense, there isn't really a
good reason why signed integers shouldn't support unit factors, too.

This inconsistency will also become a bit of a problem with subsequent
commits, where we will fix a couple of callsites that pass an unsigned
integer to `OPT_INTEGER()`. There are three options:

  - We could adapt those users to instead pass a signed integer, but
    this would needlessly extend the range of accepted integer values.

  - We could convert them to use `OPT_MAGNITUDE()`, as it only accepts
    unsigned integers. But now we have the inconsistency that we also
    start to accept unit factors.

  - We could introduce `OPT_UNSIGNED()` as equivalent to `OPT_INTEGER()`
    so that it knows to only accept unsigned integers without unit
    suffix.

Introducing a whole new option type feels a bit excessive. There also
isn't really a good reason why `OPT_INTEGER()` cannot be extended to
also accept unit factors: all valid values passed to such options cannot
have a unit factors right now, so there wouldn't be any ambiguity.

Refactor `OPT_INTEGER()` to use `git_parse_int()`, which knows to
interpret unit factors. This removes the inconsistency between the
signed and unsigned options so that we can easily fix up callsites that
pass the wrong integer type right now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

global: use designated initializers for options

While we expose macros for most of our different option types understood
by the "parse-options" subsystem, not every combination of fields that
has one as that would otherwise quickly lead to an explosion of macros.
Instead, we just initialize structures manually for those variants of
fields that don't have a macro.

Callsites that open-code these structure initialization don't use
designated initializers though and instead just provide values for each
of the fields that they want to initialize. This has three significant
downsides:

  - Callsites need to specify all values up to the last field that they
    care about. This often includes fields that should simply be left at
    their default zero-initialized state, which adds distraction.

  - Any reader not deeply familiar with the layout of the structure
    has a hard time figuring out what the respective initializers mean.

  - Reordering or introducing new fields in the middle of the structure
    is impossible without adapting all callsites.

Convert all sites to instead use designated initializers, which we have
started using in our codebase quite a while ago. This allows us to skip
any default-initialized fields, gives the reader context by specifying
the field names and allows us to reorder or introduce new fields where
we want to.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

parse: fix off-by-one for minimum signed values

We accept a maximum value in `git_parse_signed()` that restricts the
range of accepted integers. As the intent is to pass `INT*_MAX` values
here, this maximum doesn't only act as the upper bound, but also as the
implicit lower bound of the accepted range.

This lower bound is calculated by negating the maximum. But given that
the maximum value of a signed integer with N bits is `2^(N-1)-1` whereas
the minimum value is `-2^(N-1)` we have an off-by-one error in the lower
bound.

Fix this off-by-one error by using `-max - 1` as lower bound instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add arc4random to the cygwin build

The arc4random_buf() function has been available in cygwin since
about 2016 (somewhere in the v2.x branch). Set the CSPRNG_METHOD
build variable to 'arc4random', in the cygwin section, to enable
the use of this cryptographically-secure pseudorandom number
function. Note that the autoconf and new meson builds also enable
this function.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add sysinfo() configuration for cygwin

Although sysinfo() is a 'Linux only' function, cygwin provides an
implementation which appears to be functional. The assumption that
this function is Linux only is reflected in the way the HAVE_SYSINFO
build variable is handled by the Makefile and config.mak.uname.

Rework the setting of HAVE_SYSINFO in the Linux section of the system
specific config file, along with the corresponding setting of the
BASIC_CFLAGS in the Makefile. Add the setting of HAVE_SYSINFO to the
cygwin section of 'config.mak.uname'. While here, add a test for the
sysinfo() function to the autoconf build system.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/gc.c: correct RAM calculation when using sysinfo

The man page for sysinfo(2) on Linux states that (from v2.3.48) the
sizes of the memory and swap fields, of the returned structure, are
given as multiples of 'mem_unit' bytes. In earlier versions (prior to
v2.3.23 on i386 in particular), the 'mem_unit' field was not part of
the structure, and all sizes were measured in bytes. The man page does
not discuss the motivation for this change, but it is possible that the
change was intended for the, relatively rare, 32-bit platform with more
than 4GB of memory.

The total_ram() function makes the assumption that the 'totalram' field
of the 'struct sysinfo' is measured in bytes, or alternatively that the
'mem_unit' field is always equal to one. Having writen a program to call
the sysinfo() function and print the structure fields, it seems that, on
Linux x84_64 and i686 anyway, the 'mem_unit' field is indeed set to one
(note that the 32-bit system had only 2GB ram). However, cygwin also has
an sysinfo() implementation, which gives the following values:

  $ ./sysinfo
  uptime:      21381
  loads:       0, 0, 0
  total ram:   2074637
  free ram:    843237
  shared ram:  0
  buffer ram:  0
  total swap:  327680
  free swap:   306932
  procs:       15
  total high:  0
  free high:   0
  mem_unit:    4096

  total ram: 8497713152
  $

[This laptop has 8GB ram, so a little bit seems to be missing. ;) ]

Modify the total_ram() function to allow for the possibility that the
memory size is not specified in bytes (ie 'mem_unit' is greater than
one).

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add clock_gettime() to the cygwin build

Cygwin supports the clock_gettime() function, along with the associated
CLOCK_MONOTONIC preprocessor symbol. The autoconf and meson builds both
enable the use of those symbols. In order to have the same configuration
for the make builds, add the HAVE_CLOCK_GETTIME and HAVE_CLOCK_MONOTONIC
build variables to the cygwin section of the config.mak.uname file.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add HAVE_GETDELIM to the cygwin section

Cygwin has provided the getdelim() function as far back as (at least)
2011. The autoconf and meson builds enable the use of this symbol.
In order to have the same configuration for autoconf, meson and make,
enable the HAVE_GETDELIM build variable in the cygwin section of the
config.mak.uname file.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: only set NO_REGEX on cygwin for v1.7

Commit 92f63d2b05 ("Cygwin 1.7 needs compat/regex", 2013-07-19) set
the NO_REGEX build variable because the platform regex library failed
some of the tests (t4018 and t4034), which passed just fine with the
compat library.

After some time (maybe a year or two), the platform library had been
updated (with an import from FreeBSD, I believe) and now passed the full
test-suite. This would be about the time of the v1.7 -> v2.0 transition
in 2015. I had a patch ready to send, but just didn't get around to
submitting it to the list. At some point in the interim, the official
cygwin git package used the autoconf build system, which sets the
NO_REGEX variable to use the platform regex library functions. The new
meson build system does likewise.

The cygwin platform regex library, in addition to now passing the tests
which formerly failed, now passes an 'test_expect_failure' test in the
t7815-grep-binary test file. In particular, test #12 'git grep .fi a'
which determines that the regex pattern '.' matches a NUL character.
The commit f96e56733a ("grep: use REG_STARTEND for all matching if
available", 2010-05-22) added the test in question, but it does not
give any indication as to why the test was framed as an expected fail,
rather than a 'positive' test that the 'git grep' command fails to
match a NUL. Note that the previous test #11 was also originally
marked in that commit as a 'test_expect_failure', but was flipped to
an 'success' test in commit 7e36de5859 ("t/t7008-grep-binary.sh: un-TODO
a test that needs REG_STARTEND", 2010-08-17).

In order to produce the same NO_REGEX configuration from autoconf, meson
and make, modify config.mak.uname to only set NO_REGEX for cygwin v1.7.
In addition, skip test t7815.12 on cygwin, by adding the !CYGWIN pre-
requisite to the test header, which (among other things) removes an
'...; please update test(s)' comment.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add a note about NO_STRLCPY for Linux

Commit 817151e61a ("Rename safe_strncpy() to strlcpy().", 2006-06-24)
added the NO_STRLCPY make variable to allow the conditional use of
the gitstrlcpy() compat function on those platforms which didn't
provide the 'standard' strlcpy() function.

Recently, in the summer of 2023, the strlcpy() and strlcat() functions
were added to the glibc library (v2.38), so some of the more up-to-date
Linux distributions no longer need to set NO_STRLCPY. For example, both
Ubuntu 24.04 LTS and RHEL 10 beta have glibc v2.39. However, several
distributions, which are still within their support window, have an
earlier version and must still use the 'compat' version of strlcpy().

If the meson or autoconf build systems are used on newer platforms, then
they will be configured to to use strlcpy() from glibc, whereas the make
build will always choose the 'compat' function instead. Add a note to
the config.mak.uname file, in the Linux section, to prompt make users to
override NO_STRLCPY in the config.mak file, if appropriate.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: remove NEEDS_LIBRT build variable

Commit d19e3a5b21 ("Makefile: add NEEDS_LIBRT to optionally link with
librt", 2016-07-07) introduced the NEEDS_LIBRT build variable to
disassociate the HAVE_CLOCK_GETTIME variable with the unconditional
linking of the librt library. At one time, the clock_gettime() function
was not available as part of the libc library and (on some unix systems)
required linking with librt.

Commit 52fcec75ce ("config.mak.uname: define NEEDS_LIBRT under Linux, for
now", 2016-07-10) set the NEEDS_LIBRT variable in the Linux section of
the config.mak.uname file, since Debian 7 (wheezy) was one of the few
remaining distributions, with glibc 2.13, that required linking with
librt for clock_gettime(). Note that from glibc version 2.17, this is no
longer necessary.

Note that Debian 7.0 was released on May 4th, 2013 and benefited from
long term support until May 2018 when it went end-of-life. Since that
time, Linux distributions use a more up-to-date library, for example:

    Distribution   version  end of support

    Debian 8       2.19     30th June 2020
    RHEL   8       2.28     31st May  2024 *
    Ubuntu 16.04   2.23     30th Apr  2021

* paid 'Maintenance support' ends 31st May 2029

Since it is no longer required, remove NEEDS_LIBRT from the Makefile and
config.mak.uname.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

meson.build: set default help format to html on windows

The build variable DEFAULT_HELP_FORMAT has an appropriate default
('man') set in the code, so there is no need to pass the -Define on
the compiler command-line, unless the build requires a non-standard
value.

In addition, on windows the make build overrides the default help
format to 'html', rather than 'man', in the 'config.mak.uname' file.

In order to suppress the -Define on the C compiler command-line, only
add the -Define to the 'libgit_c_args' variable when the requested
value is not the standard 'man'. In order to override the default value
on windows, add a 'platform' value to the 'default_help_format' combo
option and set it as the default choice. When this option is set to
'platform', use the 'host_machine.system()' method call to determine the
appropriate default value for the host system.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

meson.build: only set build variables for non-default values

Some preprocessor -Defines have defaults set in the source code when
they have not been provided to the C compiler. In this case, there is
no need to pass them on the command-line, unless the build requires a
non-standard value.

The build variables for DEFAULT_EDITOR and DEFAULT_PAGER have appropriate
defaults ('vi' and 'less') set in the code. Add the preprocessor -Defines
to the 'libgit_c_args' only if the values set with the corresponding
'options' are different to these standard values.

Also, the 'git-var' documentation contains some conditional text which
documents the chosen compiled in value, which would not read well for
the standard values. Similar to the above, only add the corresponding
'-a' attribute arguments to the 'asciidoc_common_options' variable, if
the values set in the 'options' are different to these standard values.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set

Several build variables only have any meaning when the RUNTIME_PREFIX
variable has been set. In particular, the following build variables are
otherwise ignored:

    HAVE_BSD_KERN_PROC_SYSCTL
    PROCFS_EXECUTABLE_PATH
    HAVE_NS_GET_EXECUTABLE_PATH
    HAVE_ZOS_GET_EXECUTABLE_PATH
    HAVE_WPGMPTR

Make setting BASIC_CFLAGS, for each of these variables, conditional on
the RUNTIME_PREFIX being defined.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

meson.build: remove -DCURL_DISABLE_TYPECHECK

Commit 9371322a60 ("sparse: suppress some \"using sizeof on a function\"
warnings", 2013-10-06) used target-specific variable assignments to add
-DCURL_DISABLE_TYPECHECK to SPARSE_FLAGS for each of the files affected
by the "typecheck-gcc.h" warnings. (http-push.c, http.c, http-walker.c
and remote-curl.c).

These warnings are only issued by sparse, and not by gcc, so we do not
want to disable the 'type checking' for non-sparse targets. The meson
build does not provide any sparse targets, so there is no need to use
the CURL_DISABLE_TYPECHECK preprocessor flag with the c compiler.

In order to re-enable the curl 'type checking' in the meson build, remove
the assignment of -DCURL_DISABLE_TYPECHECK to libgit_c_args.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The sixth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/cat-file-filter-batch'

"git cat-file --batch" and friends learned to allow "--filter=" to
omit certain objects, just like the transport layer does.

* ps/cat-file-filter-batch:
  builtin/cat-file: use bitmaps to efficiently filter by object type
  builtin/cat-file: deduplicate logic to iterate over all objects
  pack-bitmap: introduce function to check whether a pack is bitmapped
  pack-bitmap: add function to iterate over filtered bitmapped objects
  pack-bitmap: allow passing payloads to `show_reachable_fn()`
  builtin/cat-file: support "object:type=" objects filter
  builtin/cat-file: support "blob:limit=" objects filter
  builtin/cat-file: support "blob:none" objects filter
  builtin/cat-file: wire up an option to filter objects
  builtin/cat-file: introduce function to report object status
  builtin/cat-file: rename variable that tracks usage

Merge branch 'ps/test-wo-perl-prereq'

"make test" used to have a hard dependency on (basic) Perl; tests
have been rewritten help environment with NO_PERL test the build as
much as possible.

* ps/test-wo-perl-prereq:
  t5703: refactor test to not depend on Perl
  t5316: refactor `max_chain()` to not depend on Perl
  t0210: refactor trace2 scrubbing to not use Perl
  t0021: refactor `generate_random_characters()` to not depend on Perl
  t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
  t/lib-t6000: refactor `name_from_description()` to not depend on Perl
  t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
  t: refactor tests depending on Perl for textconv scripts
  t: refactor tests depending on Perl to print data
  t: refactor tests depending on Perl substitution operator
  t: refactor tests depending on Perl transliteration operator
  Makefile: stop requiring Perl when running tests
  meson: stop requiring Perl when tests are enabled
  t: adapt existing PERL prerequisites
  t: introduce PERL_TEST_HELPERS prerequisite
  t: adapt `test_readlink()` to not use Perl
  t: adapt `test_copy_bytes()` to not use Perl
  t: adapt character translation helpers to not use Perl
  t: refactor environment sanitization to not use Perl
  t: skip chain lint when PERL_PATH is unset

Merge branch 'jt/help-sha-backend-info-in-build-options'

"git help --build-options" reports SHA-1 and SHA-256 backends used
in the build.

* jt/help-sha-backend-info-in-build-options:
help: include unsafe SHA-1 build info in version
help: include SHA implementation in version info

Merge branch 'kn/non-transactional-batch-updates'

Updating multiple references have only been possible in all-or-none
fashion with transactions, but it can be more efficient to batch
multiple updates even when some of them are allowed to fail in a
best-effort manner.  A new "best effort batches of updates" mode
has been introduced.

* kn/non-transactional-batch-updates:
  update-ref: add --batch-updates flag for stdin mode
  refs: support rejection in batch updates during F/D checks
  refs: implement batch reference update support
  refs: introduce enum-based transaction error types
  refs/reftable: extract code from the transaction preparation
  refs/files: remove duplicate duplicates check
  refs: move duplicate refname update check to generic layer
  refs/files: remove redundant check in split_symref_update()

Merge branch 'zy/send-email-error-handling'

Auth-related (and unrelated) error handling in send-email has been
made more robust.

* zy/send-email-error-handling:
send-email: finer-grained SMTP error handling
send-email: capture errors in an eval {} block

Merge branch 'ps/maintenance-reflog-expire'

"git maintenance" learns a new task to expire reflog entries.

* ps/maintenance-reflog-expire:
  builtin/maintenance: introduce "reflog-expire" task
  builtin/gc: split out function to expire reflog entries
  builtin/reflog: make functions regarding `reflog_expire_options` public
  builtin/reflog: stop storing per-reflog expiry dates globally
  builtin/reflog: stop storing default reflog expiry dates globally
  reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`

Merge branch 'jt/rev-list-z'

"git rev-list" learns machine-parsable output format that delimits
each field with NUL.

* jt/rev-list-z:
  rev-list: support NUL-delimited --missing option
  rev-list: support NUL-delimited --boundary option
  rev-list: support delimiting objects with NUL bytes
  rev-list: refactor early option parsing
  rev-list: inline `show_object_with_name()` in `show_object()`

Merge branch 'ab/pathspec-sign-compare-workaround'

Some warnings from "-Wsign-compare" for pathspec.c have been
squelched.

* ab/pathspec-sign-compare-workaround:
pathspec: fix sign comparison warnings

Merge branch 'ps/misc-build-fixes'

Random build fixes.

* ps/misc-build-fixes:
  ci: use Visual Studio for win+meson job on GitHub Workflows
  meson: distinguish build and target host binaries
  meson: respect 'tests' build option in contrib
  gitweb: fix generation of "gitweb.js"
  meson: fix handling of '-Dcurl=auto'

Merge branch 'sk/clar-trailer-urlmatch-norm-test'

A few traditional unit tests have been rewritten to use the clar
framework.

* sk/clar-trailer-urlmatch-norm-test:
t/unit-tests: convert urlmatch-normalization test to clar
t/unit-tests: convert trailer test to use clar

Merge branch 'ab/rm-sign-compare'

Some warnings from "-Wsign-compare" for builtin/rm.c have been
squelched.

* ab/rm-sign-compare:
rm: fix sign comparison warnings

Merge branch 'jt/ref-transaction-abort-fix'

A ref transaction corner case fix.

* jt/ref-transaction-abort-fix:
builtin/fetch: avoid aborting closed reference transaction

Merge branch 'js/ci-fedora-gawk'

Work around CI breakage due to fedora base image getting updated.

* js/ci-fedora-gawk:
ci(pedantic): ensure that awk is installed

Merge branch 'js/ci-github-update-ubuntu'

Adjust to the deprecation of use of Ubuntu 20.04 GitHub Actions CI.

* js/ci-github-update-ubuntu:
ci: upgrade `sparse` to supported build agents

Merge branch 'dd/sparse-glibc-workaround'

Squelch false-positive from sparse.

* dd/sparse-glibc-workaround:
sparse: ignore warning from new glibc headers

t9811: be more precise to check importing of tags

The tests use grep to search the output of `git tag` for tagnames they
expect to exist, which can incorrectly pass if an unxpected tag
has the expected tag as its substring. We fix this by using `git
show-ref --verify` instead.

Additionally, we add a negative test to verify that a possible
uninteded tag does not show up in the imported repository.

This change also fixes an additional problem, where piping the
output of `git tag` caused the exit codes to be lost.

Signed-off-by: Anthony Wang <anthonywang513@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: document core.hooksPath=/dev/null

If a user wishes to disable hooks, then they can do so using the
established pattern of setting 'core.hooksPath' to /dev/null. This is
already tested in t1350-config-hooks-path.sh, but has not previously
been visible in the documentation.

Update the documentation to include this as an option.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: stop depending on Perl to generate command list

The "cmd-list.perl" script is used to extract the list of commands part
of a specific category and extracts the description of each command from
its respective manpage. The generated output is then included in git(1)
to list all Git commands.

The script is written in Perl. Refactor it to use shell scripting
exclusively so that we can get rid of the mandatory dependency on Perl
to build our documentation.

The converted script is slower compared to its Perl implementation. But
by being careful and not spawning external commands in `format_one ()`
we can mitigate the performance hit to a reasonable level:

    Benchmark 1: Perl
      Time (mean ± σ):      10.3 ms ±   0.2 ms    [User: 7.0 ms, System: 3.3 ms]
      Range (min … max):    10.0 ms …  11.1 ms    200 runs

    Benchmark 2: Shell
      Time (mean ± σ):      74.4 ms ±   0.4 ms    [User: 48.6 ms, System: 24.7 ms]
      Range (min … max):    73.1 ms …  75.5 ms    200 runs

    Summary
      Perl ran
        7.23 ± 0.13 times faster than Shell

While a sevenfold slowdown is significant, the benefit of not requiring
Perl for a fully-functioning Git installation outweighs waiting a couple
of milliseconds longer during the build process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: stop depending on Perl to massage user manual

The "fix-texi.perl" script is used to fix up the output of
`docbook2x-texi`:

- It changes the filename to be "git.info".

- It changes the directory category and entry.

The script is written in Perl, but it can be rather trivially converted
to a shell script. Do so to remove the dependency on Perl for building
the user manual.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

request-pull: stop depending on Perl

While git-request-pull(1) is written as a shell script, for it to
function we depend on Perl being available. The script gets installed
unconditionally though, regardless of whether or not Perl is even
available on the system. When it's not available, the `@PERL_PATH@`
variable may be substituted with a nonexistent executable path and thus
cause the script to fail.

Refactor the script so that it does not depend on Perl at all anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

filter-branch: stop depending on Perl

While git-filter-branch(1) is written as a shell script, the
`--state-branch` feature depends on Perl to save and extract the object
ID mappings. This can lead to subtle breakage though:

  - We execute `perl` directly without respecting the `PERL_PATH`
    configured by the distribution. As such, it may happen that we use
    the wrong version of Perl.

  - We install the script unchanged even if Perl isn't available at all
    on the system, so using `--state-branch` would lead to failure
    altogether in that case.

Fix this by dropping Perl and instead implementing the feature with
shell scripting exclusively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci(pedantic): ensure that awk is installed

The image pointed to by the fedora:latest tag has moved from fedora
41 to 42. The fedora 41 container images have awk installed while
the fedora 42 images do not.  That change is most likely just part
of reducing the size of the base container images.

In both AlmaLinux and Fedora (as well as other RHEL
derivatives/relatives), awk is provided by the gawk package.

On Fedora, `dnf install awk` would work, by using the package
filelist data to determine that /usr/bin/awk is provided by gawk and
installs gawk as a result.

On AlmaLinux (8 & 9, by quick testing by Todd), that is not the case
and you'd need to use `dnf install gawk` or `dnf install '*bin/awk'`
to get it installed. Having said that, awk _is_ included in the
current AlmaLinux 8 and 9 images, so it isn't strictly needed.  But
it's probably better to be explicit that we need it installed, as a
defense against some future change to the AlmaLinux container
removing awk.

Because we know that on both of these distros, our scripts that call
for 'awk' had been using 'gawk' that was installed as part of the
base image, let's make sure that we explicitly install 'gawk'.  If
the image already has it, it would be a no-op that does not cause
breakage.

Suggested-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The fifth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'bc/allow-upload-pack-from-other-people'

Test fix for an already graduated topic.

* bc/allow-upload-pack-from-other-people:
t5605: fix test for cloning from a different user

Merge branch 'pw/custom-conflict-marker-size-for-merge-related-docs'

"git-merge-file" documentation source, which has lines that look
like conflict markers, lacked custom conflict marker size defined,
which has been corrected..

* pw/custom-conflict-marker-size-for-merge-related-docs:
merge-file doc: set conflict-marker-size attribute

Merge branch 'js/comma-semicolon-confusion'

Code clean-up.

* js/comma-semicolon-confusion:
  detect-compiler: detect clang even if it found CUDA
  clang: warn when the comma operator is used
  compat/regex: explicitly mark intentional use of the comma operator
  wildmatch: avoid using of the comma operator
  diff-delta: avoid using the comma operator
  xdiff: avoid using the comma operator unnecessarily
  clar: avoid using the comma operator unnecessarily
  kwset: avoid using the comma operator unnecessarily
  rebase: avoid using the comma operator unnecessarily
  remote-curl: avoid using the comma operator unnecessarily

Merge branch 'jt/clone-guess-remote-head-fix'

"git clone" still gave the message about the default branch name;
this message has been turned into an advice message that can be
turned off.

* jt/clone-guess-remote-head-fix:
  advice: allow disabling default branch name advice
  builtin/clone: suppress unexpected default branch advice
  remote: allow `guess_remote_head()` to suppress advice

Merge branch 'ds/maintenance-loose-objects-batchsize'

The job to coalesce loose objects into packfiles in "git
maintenance" now has configurable batch size.

* ds/maintenance-loose-objects-batchsize:
maintenance: add loose-objects.batchSize config
maintenance: force progress/no-quiet to children

Merge branch 'lo/userdiff-gitconfig'

* lo/userdiff-gitconfig:
userdiff: add builtin driver for INI files

Merge branch 'ps/mingw-creat-excl-fix'

Fix lockfile contention in reftable code on Windows.

* ps/mingw-creat-excl-fix:
compat/mingw: fix EACCESS when opening files with `O_CREAT | O_EXCL`
meson: fix compat sources when compiling with MSVC

Merge branch 'kn/reflog-drop'

"git reflog" learns "drop" subcommand, that discards the entire
reflog data for a ref.

* kn/reflog-drop:
reflog: implement subcommand to drop reflogs
reflog: improve error for when reflog is not found

Merge branch 'ps/object-wo-the-repository'

The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`

Merge branch 'md/t1403-path-is-file'

Test tweak.

* md/t1403-path-is-file:
t1403: verify that path exists and is a file

Merge branch 'jk/zlib-inflate-fixes'

Fix our use of zlib corner cases.

* jk/zlib-inflate-fixes:
  unpack_loose_rest(): rewrite return handling for clarity
  unpack_loose_rest(): simplify error handling
  unpack_loose_rest(): never clean up zstream
  unpack_loose_rest(): avoid numeric comparison of zlib status
  unpack_loose_header(): avoid numeric comparison of zlib status
  git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT
  unpack_loose_header(): fix infinite loop on broken zlib input
  unpack_loose_header(): report headers without NUL as "bad"
  unpack_loose_header(): simplify next_out assignment
  loose_object_info(): BUG() on inflating content with unknown type

Merge branch 'ps/reftable-windows-unlink-fix'

Portability fix.

* ps/reftable-windows-unlink-fix:
reftable: ignore file-in-use errors when unlink(3p) fails on Windows

Merge branch 'ps/test-wo-perl-prereq' into ps/fewer-perl

* ps/test-wo-perl-prereq:
  t5703: refactor test to not depend on Perl
  t5316: refactor `max_chain()` to not depend on Perl
  t0210: refactor trace2 scrubbing to not use Perl
  t0021: refactor `generate_random_characters()` to not depend on Perl
  t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
  t/lib-t6000: refactor `name_from_description()` to not depend on Perl
  t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
  t: refactor tests depending on Perl for textconv scripts
  t: refactor tests depending on Perl to print data
  t: refactor tests depending on Perl substitution operator
  t: refactor tests depending on Perl transliteration operator
  Makefile: stop requiring Perl when running tests
  meson: stop requiring Perl when tests are enabled
  t: adapt existing PERL prerequisites
  t: introduce PERL_TEST_HELPERS prerequisite
  t: adapt `test_readlink()` to not use Perl
  t: adapt `test_copy_bytes()` to not use Perl
  t: adapt character translation helpers to not use Perl
  t: refactor environment sanitization to not use Perl
  t: skip chain lint when PERL_PATH is unset

object-store: merge "object-store-ll.h" and "object-store.h"

The "object-store-ll.h" header has been introduced to keep transitive
header dependendcies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.

Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-store: remove global array of cached objects

Cached objects are virtual objects that can be set up without writing
anything into the object store directly, which is used by git-blame(1)
to create fake commits for the working tree.

These cached objects are stored in a global variable, which is another
roadblock for libification of the object subsystem. Refactor the code so
that we instead store the array as part of the raw object store.

This refactoring raises the question whether virtual objects should
really be specific to a single repository (or rather a single object
store). Hypothetical usecases might for example span across submodules,
and here it may or may not be the right thing to provide virtual objects
across submodule boundaries.

The only existing usecase is git-blame(1) though, which does not know to
blame across submodule boundaries in the first place. As such, storing
these objects both globally and per-repository would achieve the same
result right now. But arguably, if we learned to blame across submodule
boundaries, we would likely want to create separate fare working tree
commits for each of the submodules so that the user can learn which
worktree a specific uncommitted change belongs to. And even if we would
want to create the same fake commit for each of the submodules we could
do that when storing separate virtual objects per object store.

While this is all rather hypothetical, the takeaway is that handling
virtual objects per-object store gives us more flexibility compared to
storing them globally. In a hypothetical future where we have achieved
full libification one might be able to handle unrelated repositories in
a single process, where the state of one repository should not have an
impact on the state of another repository. As such, storing these cached
objects per object store will enable more usecases and should lead to
less surprising outcomes overall.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: split out functions relating to object store subsystem

Split out functions relating to the object store subsystem from
"object.c". This helps us to separate concerns.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: drop `index_blob_stream()`

The `index_blob_stream()` function is a mere wrapper around
`index_blob_bulk_checkin()`. This has been the case since 568508e7657
(bulk-checkin: replace fast-import based implementation, 2011-10-28),
which has moved the implementation from `index_blob_stream()` (which was
still called `index_stream()`) into `index_bulk_checkin()` (which has
since been renamed to `index_blob_bulk_checkin()`).

Remove the redirection by dropping the wrapper. Move the comment to
`index_blob_bulk_checkin()` to retain its context.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: split up concerns of `HASH_*` flags

The functions `hash_object_file()`, `write_object_file()` and
`index_fd()` reuse the same set of flags to alter their behaviour. This
not only adds confusion, but given that every function only supports a
subset of the flags it becomes very hard to see which flags can be
passed to what function. Last but not least, this entangles the
implementation of all three function families.

Split up concerns by creating separate flags for each of the function
families.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: split out functions relating to object store subsystem

While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.

Split out functions relating to the object store subsystem into a new
"object-store.c" file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: move `xmmap()` into "wrapper.c"

The `xmmap()` function is provided by "object-file.c" even though its
functionality has nothing to do with the object file subsystem. Move it
into "wrapper.c", whose header already declares those functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: move `git_open_cloexec()` to "compat/open.c"

The `git_open_cloexec()` wrapper function provides the ability to open a
file with `O_CLOEXEC` in a platform-agnostic way. This function is
provided by "object-file.c" even though it is not specific to the object
subsystem at all.

Move the file into "compat/open.c". This file already exists before this
commit, but has only been compiled conditionally depending on whether or
not open(3p) may return EINTR. With this change we now unconditionally
compile the object, but wrap `git_open_with_retry()` in an ifdef.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: move `safe_create_leading_directories()` into "path.c"

The `safe_create_leading_directories()` function and its relatives are
located in "object-file.c", which is not a good fit as they provide
generic functionality not related to objects at all. Move them into
"path.c", which already hosts `safe_create_dir()` and its relative
`safe_create_dir_in_gitdir()`.

"path.c" is free of `the_repository`, but the moved functions depend on
`the_repository` to read the "core.sharedRepository" config. Adapt the
function signature to accept a repository as argument to fix the issue
and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: move `mkdir_in_gitdir()` into "path.c"

The `mkdir_in_gitdir()` function is similar to `safe_create_dir()`, but
the former is hosted in "object-file.c" whereas the latter is hosted in
"path.c". The latter code unit makes way more sense though as the logic
has nothing to do with object files in particular.

Move the file into "path.c". While at it, we:

  - Rename the function to `safe_create_dir_in_gitdir()` so that the
    function names are similar to one another.

  - Remove the dependency on `the_repository` by making the callers pass
    the repository instead.

Adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

p7821: fix instructions for testing with threads

In 7b31b55db1 (perf: amend the grep tests to test grep.threads,
2017-12-29), p7821 was tweaked to test the performance of 'git grep'
under different number of threads. These tests are run if
GIT_PERF_GREP_THREADS is set to a list of thread numbers, but the
comment at the top of the file instead mentions GIT_PERF_7821_THREADS.
Fix the comment.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: add markup for characters in Guidelines

This rule was already implicitely applied in the converted man pages,
so let's state it loudly.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix asciidoctor synopsis processing of triple-dots

The processing of triple dot notation is tricky because it can be
mis-interpreted as an ellipsis. The special processing of the ellipsis
is now complete and takes into account the case of
`git-mv <source>... <dest>`

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: convert git-mv to new documentation format

- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Unfortunately, there's an inconsistency in the synopsis style, where
the ellipsis is used to indicate that the option can be repeated, but
it can also be used in Git's three-dot notation to indicate a range of
commits. The rendering engine will not be able to distinguish
between these two cases.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: move synopsis git-mv commands in the synopsis section

This also entails changing the help output for the command to match the new
synopsis.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: convert git-rm to new documentation format

- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: fix synopsis analysis logic

The synopsis analysis logic was not able to handle backslashes and stars
which are used in the synopsis of the git-rm command. This patch fixes the
issue by updating the regular expression used to match the keywords.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: convert git-reset to new documentation format

- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

environment.h: remove unused variables

packed_git_window_size and packed_git_limit are not used anywhere in
the codebase. A search found that all references were removed in
d284713bae (config: make `packed_git_(limit|window_size)` non-global
variables, 2024-12-03), except the ones in this file, as they were moved
to struct repo_settings.

Remove packed_git_window_size and packed_git_limit from environment.h.

Signed-off-by: Arnav Bhate <bhatearnav@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: fix duplicated word in comment

Fix a typo in a comment in refs.c: "checking checking" → "checking".

Signed-off-by: Christian Fredrik Johnsen <christian@johnsen.no>
Acked-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/packed: fix BUG when seeking refs with UTF-8 characters

It was reported that using git-pull(1) in a repository whose remote
contains branches with emojis leads to the following bug:

    $ git pull
    remote: Enumerating objects: 161255, done.
    remote: Counting objects: 100% (55884/55884), done.
    remote: Compressing objects: 100% (5518/5518), done.
    remote: Total 161255 (delta 54253), reused 50509 (delta 50364),
    pack-reused 105371 (from 4)
    Receiving objects: 100% (161255/161255), 309.90 MiB | 16.87 MiB/s, done.
    Resolving deltas: 100% (118048/118048), completed with 13416 local objects.
    From github.com:github/github
       97ab7ae3f3745..8fb2f9fa180ed  master -> origin/master
    [...snip many screenfuls of updates to origin remotes...]
    BUG: refs/packed-backend.c:984: packed-refs backend yielded reference
    preceding its prefix
    error: fetch died of signal 6

This issue bisects to 22600c04529 (refs/iterator: implement seeking for
packed-ref iterators, 2025-03-12) where we have implemented seeking for
the packed-ref iterator. As part of that change we introduced a check
that verifies that the iterator only returns refnames bigger than the
prefix. In theory, this check should always hold: when a prefix is set
we know that we would've seeked that prefix first, so we should never
see a reference sorting before that prefix.

But in practice the check itself is misbehaving when handling unicode
characters. The particular issue triggered with a branch that got the
"shaved ice" unicode character in its name, which is composed of the
bytes "0xEE 0x90 0xBF". The bug triggers when we compare the refname
"refs/heads/<shaved-ice>" to something like "refs/heads/z", and it
specifically hits when comparing the first byte, "0xEE".

The root cause is that the most-significant bit of 0xEE is set. The
`refname` and `prefix` pointers that we use to compare bytes with one
another are both pointers to signed characters. As such, when we
dereference the 0xEE byte the result is a _negative_ value, and this
value will of course compare smaller than "z".

We can see that this issue is avoided in `cmp_packed_refname()`, where
we explicitly cast each byte to its unsigned form. Fix the bug by doing
the same in `packed_ref_iterator_advance()`.

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: make set_head() call easier to read

We ignore any error returned from set_head(), but 638060dcb9 (fetch
set_head: refactor to use remote directly, 2025-01-26) left its call in
a noop "if" conditional as a sort of note-to-self.

When c834d1a7ce (fetch: only respect followRemoteHEAD with configured
refspecs, 2025-03-18) added a "do_set_head" flag, it was rolled into the
same conditional, putting set_head() on the right-hand side of a
short-circuit AND.

That's not wrong, but it really hides the point of the line, which
is (maybe) calling the function.

Instead, let's have a full if() block for the flag, and then our comment
(with some rewording) will be sufficient to clarify the error handling.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: upgrade `sparse` to supported build agents

The `sparse` job still uses the `ubuntu-20.04` runner pool, but that
pool is about to go away, so let's stop using it.

There is no `sparse-22.04` artifact provided by the "Build sparse for
Ubuntu" Azure Pipeline, but that is not necessary anyway because Ubuntu
22.04 has the `sparse` package: https://packages.ubuntu.com/jammy/sparse

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse: ignore warning from new glibc headers

With at least glibc 2.39, glibc provides a function declaration that
matches with this POSIX interface:

    int regexec(const regex_t *restrict preg, const char *restrict string,
           size_t nmatch, regmatch_t pmatch[restrict], int eflags);

such prototype requires variable-length-array for `pmatch'.

Thus, sparse reports this error:

> ../add-patch.c: note: in included file (through ../git-compat-util.h):
> /usr/include/regex.h:682:41: error: undefined identifier '__nmatch'
> /usr/include/regex.h:682:41: error: bad constant expression type
> /usr/include/regex.h:682:41: error: Variable length array is used.

Note: `__nmatch' is POSIX's nmatch.

The glibc's intention is informing their users to provides a large
enough buffer to hold `__nmatch' results and provides diagnosis if
necessary.  It's merely a glibc' implementation detail.

Hide that usage from sparse by using standard C11's macro:
__STDC_NO_VLA__

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/update-server-info: remove unnecessary if statement

Since we already teach the `repo_config()` in f29f1990 (config:
teach repo_config to allow `repo` to be NULL, 2025-03-08) to allow
`repo` to be NULL, no need to check if `repo` is NULL before calling
`repo_config()`.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanup

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`

bundle: fix non-linear performance scaling with refs

The 'git bundle create' command has non-linear performance with the
number of refs in the repository. Benchmarking the command shows that
a large portion of the time (~75%) is spent in the
`object_array_remove_duplicates()` function.

The `object_array_remove_duplicates()` function was added in
b2a6d1c686 (bundle: allow the same ref to be given more than once,
2009-01-17) to skip duplicate refs provided by the user from being
written to the bundle. Since this is an O(N^2) algorithm, in repos with
large number of references, this can take up a large amount of time.

Let's instead use a 'strset' to skip duplicates inside
`write_bundle_refs()`. This improves the performance by around 6 times
when tested against in repository with 100000 refs:

Benchmark 1: bundle (refcount = 100000, revision = master)
  Time (mean ± σ):     14.653 s ±  0.203 s    [User: 13.940 s, System: 0.762 s]
  Range (min … max):   14.237 s … 14.920 s    10 runs

Benchmark 2: bundle (refcount = 100000, revision = HEAD)
  Time (mean ± σ):      2.394 s ±  0.023 s    [User: 1.684 s, System: 0.798 s]
  Range (min … max):    2.364 s …  2.425 s    10 runs

Summary
  bundle (refcount = 100000, revision = HEAD) ran
    6.12 ± 0.10 times faster than bundle (refcount = 100000, revision = master)

Previously, `object_array_remove_duplicates()` ensured that both the
refname and the object it pointed to were checked for duplicates. The
new approach, implemented within `write_bundle_refs()`, eliminates
duplicate refnames without comparing the objects they reference. This
works because, for bundle creation, we only need to prevent duplicate
refs from being written to the bundle header. The `revs->pending` array
can contain duplicates of multiple types.

First, references which resolve to the same refname. For e.g. "git
bundle create out.bdl master master" or "git bundle create out.bdl
refs/heads/master refs/heads/master" or "git bundle create out.bdl
master refs/heads/master". In these scenarios we want to prevent writing
"refs/heads/master" twice to the bundle header. Since both the refnames
here would point to the same object (unless there is a race), we do not
need to check equality of the object.

Second, refnames which are duplicates but do not point to the same
object. This can happen when we use an exclusion criteria. For e.g. "git
bundle create out.bdl master master^!", Here `revs->pending` would
contain two elements, both with refname set to "master". However, each
of them would be pointing to an INTERESTING and UNINTERESTING object
respectively. Since we only write refnames with INTERESTING objects to
the bundle header, we perform our duplicate checks only on such objects.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t6020: test for duplicate refnames in bundle creation

The commit b2a6d1c686 (bundle: allow the same ref to be given more than
once, 2009-01-17) added functionality to detect and remove duplicate
refnames from being added during bundle creation. This ensured that
clones created from such bundles wouldn't barf about duplicate refnames.

The following commit will add some optimizations to make this check
faster, but before doing that, it would be optimal to add tests to
capture the current behavior.

Add tests to capture duplicate refnames provided by the user during
bundle creation. This can be a combination of:

  - refnames directly provided by the user.
  - refname duplicate by using the '--all' flag alongside manual
    references being provided.
  - exclusion criteria provided via a refname "main^!".
  - short forms of refnames provided, "main" vs "refs/heads/main".

Note that currently duplicates due to usage of short and long forms goes
undetected. This should be fixed with the optimizations made in the next
commit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>