Junio C Hamano [Mon, 10 Feb 2025 18:18:30 +0000 (10:18 -0800)]
Merge branch 'pw/apply-ulong-overflow-check'
"git apply" internally uses unsigned long for line numbers and uses
strtoul() to parse numbers on the hunk headers. It however forgot
to check parse errors.
* pw/apply-ulong-overflow-check:
apply: detect overflow when parsing hunk header
Junio C Hamano [Mon, 10 Feb 2025 18:18:29 +0000 (10:18 -0800)]
Merge branch 'ps/setup-reinit-fixes'
"git init" to reinitialize a repository that already exists cannot
change the hash function and ref backends; such a request is
silently ignored now.
* ps/setup-reinit-fixes:
setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
t0001: remove duplicate test
Junio C Hamano [Thu, 6 Feb 2025 22:56:45 +0000 (14:56 -0800)]
Merge branch 'ps/zlib-ng'
The code paths that interact with zlib have been cleaned up in
preparation for building with zlib-ng.
* ps/zlib-ng:
ci: make "linux-musl" job use zlib-ng
ci: switch linux-musl to use Meson
compat/zlib: allow use of zlib-ng as backend
git-zlib: cast away potential constness of `next_in` pointer
compat/zlib: provide stubs for `deflateSetHeader()`
compat/zlib: provide `deflateBound()` shim centrally
git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
compat: introduce new "zlib.h" header
git-compat-util: drop `z_const` define
compat: drop `uncompress2()` compatibility shim
Junio C Hamano [Thu, 6 Feb 2025 22:56:44 +0000 (14:56 -0800)]
Merge branch 'js/bundle-unbundle-fd-reuse-fix'
The code path used when "git fetch" fetches from a bundle file
closed the same file descriptor twice, which sometimes broke things
unexpectedly when the file descriptor was reused; this has been
corrected.
Junio C Hamano [Thu, 6 Feb 2025 22:56:44 +0000 (14:56 -0800)]
Merge branch 'ps/ci-misc-updates'
CI updates (containerization, dropping stale ones, etc.).
* ps/ci-misc-updates:
ci: remove stale code for Azure Pipelines
ci: use latest Ubuntu release
ci: stop special-casing for Ubuntu 16.04
gitlab-ci: add linux32 job testing against i386
gitlab-ci: remove the "linux-old" job
github: simplify computation of the job's distro
github: convert all Linux jobs to be containerized
github: adapt containerized jobs to be rootless
t7422: fix flaky test caused by buffered stdout
t0060: fix EBUSY in MinGW when setting up runtime prefix
Junio C Hamano [Mon, 3 Feb 2025 18:23:34 +0000 (10:23 -0800)]
Merge branch 'ps/build-meson-fixes'
More build fixes and enhancements to the Meson-based build procedure.
* ps/build-meson-fixes:
ci: wire up Visual Studio build with Meson
ci: raise error when Meson generates warnings
meson: fix compilation with Visual Studio
meson: make the CSPRNG backend configurable
meson: wire up fuzzers
meson: wire up generation of distribution archive
meson: wire up development environments
meson: fix dependencies for generated headers
meson: populate project version via GIT-VERSION-GEN
GIT-VERSION-GEN: allow running without input and output files
GIT-VERSION-GEN: simplify computing the dirty marker
Junio C Hamano [Mon, 3 Feb 2025 18:23:33 +0000 (10:23 -0800)]
Merge branch 'ps/3.0-remote-deprecation'
Following the procedure we established to introduce breaking
changes for Git 3.0, allow an early opt-in for removing support of
$GIT_DIR/branches/ and $GIT_DIR/remotes/ directories to configure
remotes.
* ps/3.0-remote-deprecation:
remote: announce removal of "branches/" and "remotes/"
builtin/pack-redundant: remove subcommand with breaking changes
ci: repurpose "linux-gcc" job for deprecations
ci: merge linux-gcc-default into linux-gcc
Makefile: wire up build option for deprecated features
Junio C Hamano [Mon, 3 Feb 2025 18:23:33 +0000 (10:23 -0800)]
Merge branch 'jk/combine-diff-cleanup'
Code clean-up for code paths around combined diff.
* jk/combine-diff-cleanup:
tree-diff: make list tail-passing more explicit
tree-diff: simplify emit_path() list management
tree-diff: use the name "tail" to refer to list tail
tree-diff: drop list-tail argument to diff_tree_paths()
combine-diff: drop public declaration of combine_diff_path_size()
tree-diff: inline path_appendnew()
tree-diff: pass whole path string to path_appendnew()
tree-diff: drop path_appendnew() alloc optimization
run_diff_files(): de-mystify the size of combine_diff_path struct
diff: add a comment about combine_diff_path.parent.path
combine-diff: use pointer for parent paths
tree-diff: clear parent array in path_appendnew()
combine-diff: add combine_diff_path_new()
run_diff_files(): delay allocation of combine_diff_path
Junio C Hamano [Mon, 3 Feb 2025 18:23:32 +0000 (10:23 -0800)]
Merge branch 'tb/unsafe-hash-cleanup'
The API around choosing to use the unsafe variant of the SHA-1
implementation has been updated in an attempt to make it harder to
abuse.
* tb/unsafe-hash-cleanup:
hash.h: drop unsafe_ function variants
csum-file: introduce hashfile_checkpoint_init()
t/helper/test-hash.c: use unsafe_hash_algo()
csum-file.c: use unsafe_hash_algo()
hash.h: introduce `unsafe_hash_algo()`
csum-file.c: extract algop from hashfile_checksum_valid()
csum-file: store the hash algorithm as a struct field
t/helper/test-tool: implement sha1-unsafe helper
setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
The exact same issue as described in the preceding commit also exists
for GIT_DEFAULT_HASH. Thus, reinitializing a repository that e.g. uses
SHA1 with `GIT_DEFAULT_HASH=sha256 git init` will cause the object
format of that repository to change to SHA256. This is of course bogus
as any existing objects and refs will not be converted, thus causing
repository corruption:
$ git init repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo/
$ git commit --allow-empty -m message
[main (root-commit) 35a7344] message
$ GIT_DEFAULT_HASH=sha256 git init
Reinitialized existing Git repository in /tmp/repo/.git/
$ git show
fatal: your current branch appears to be broken
Fix the issue by ignoring the environment variable in case the repo has
already been initialized with an object hash.
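A minimal sketch of the guard described above, using a hypothetical
helper name and simplified signature (the actual change lives in
setup.c and differs in detail):

    #include <stdlib.h>

    /* Values mirror Git's hash.h; repeated here only to keep the
     * sketch self-contained. */
    enum { GIT_HASH_UNKNOWN = 0, GIT_HASH_SHA1 = 1, GIT_HASH_SHA256 = 2 };

    int hash_algo_by_name(const char *name); /* provided by hash.[ch] */

    /* Hypothetical helper: pick the object format for (re)initialization. */
    static int pick_hash_algo(int existing_algo)
    {
        const char *env = getenv("GIT_DEFAULT_HASH");

        if (existing_algo != GIT_HASH_UNKNOWN)
            return existing_algo; /* reinit: keep what the repo already uses */
        if (env && *env)
            return hash_algo_by_name(env); /* caller handles GIT_HASH_UNKNOWN */
        return GIT_HASH_SHA1;
    }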
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
The GIT_DEFAULT_REF_FORMAT environment variable can be set to influence
the default ref format that new repositories shall be initialized with.
While this is the expected behaviour when creating a new repository, it
is not when reinitializing a repository: we should retain the ref format
currently used by it in that case.
This doesn't work correctly right now:
$ git init --ref-format=files repo
Initialized empty Git repository in /tmp/repo/.git/
$ GIT_DEFAULT_REF_FORMAT=reftable git init repo
fatal: could not open '/tmp/repo/.git/refs/heads' for writing: Is a directory
Instead of retaining the current ref format, the reinitialization tries
to reinitialize the repository with the different format. This action
fails when git-init(1) tries to write the ".git/refs/heads" stub, which
in the context of the reftable backend is always written as a file so
that we can detect clients which inadvertently try to access the repo
with the wrong ref format. Seems like the protection mechanism works for
this case, as well.
Fix the issue by ignoring the environment variable in case the repo has
already been initialized with a ref storage format.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Thu, 30 Jan 2025 11:08:30 +0000 (11:08 +0000)]
apply: detect overflow when parsing hunk header
"git apply" uses strtoul() to parse the numbers in the hunk header but
silently ignores overflows. As LONG_MAX is a legitimate return value for
strtoul() we need to set errno to zero before the call to strtoul() and
check that it is still zero afterwards. The error message we display is
not particularly helpful as it does not say what was wrong. However, it
seems pretty unlikely that users are going to trigger this error in
practice and we can always improve it later if needed.
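A minimal sketch of the errno-based check described above (the helper
name is illustrative, not the actual function in apply.c):

    #include <errno.h>
    #include <stdlib.h>

    /* Parse one unsigned number from a hunk header; returns -1 on
     * overflow or when no digits were consumed at all. */
    static int parse_hunk_num(const char *s, unsigned long *out, const char **endp)
    {
        char *end;

        errno = 0;
        *out = strtoul(s, &end, 10);
        if (errno || end == s)
            return -1;
        *endp = end;
        return 0;
    }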
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
with the `-Db_sanitize=leak` option:
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5555555cfb93 in malloc (scalar+0x7bb93)
#1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
#2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
#3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
#4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
#5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
#6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
#7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
#8 0x5555555d196b in cmd_main ../scalar.c:992:14
#9 0x5555557c4059 in main ../common-main.c:64:11
#10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#12 0x555555592054 in _start (scalar+0x3e054)
DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).
As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.
It is unclear why the leak isn't flagged when running the test via our
Makefile.
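A sketch of the ownership pattern described above; `remote_default_branch()`
is stubbed out to keep the example self-contained:

    #include <stdlib.h>
    #include <string.h>

    /* stand-in for scalar.c's remote_default_branch(), which returns
     * an allocated string */
    static char *remote_default_branch_stub(const char *url)
    {
        (void)url;
        return strdup("main");
    }

    static int clone_sketch(const char *branch_from_cli, const char *url)
    {
        char *branch_to_free = NULL;          /* only ever holds allocated memory */
        const char *branch = branch_from_cli; /* may point at a string constant */

        if (!branch)
            branch = branch_to_free = remote_default_branch_stub(url);

        /* ... perform the clone using "branch" ... */

        free(branch_to_free); /* never frees the caller's constant */
        return 0;
    }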
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.
This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only when running tests in a directory hierarchy whose path
is long enough to make the socket name length exceed the maximum socket
name length:
Direct leak of 129 byte(s) in 1 object(s) allocated from:
#0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
#1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
#2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
#3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
#4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
#5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
#6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
#7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
#8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
#9 0x555555700547 in run_builtin ../git.c:480:11
#10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
#11 0x5555556ffee8 in run_argv ../git.c:807:4
#12 0x5555556fee6b in cmd_main ../git.c:947:19
#13 0x55555593f689 in main ../common-main.c:64:11
#14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#16 0x5555555ad1d4 in _start (git+0x591d4)
DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).
Fix this leak.
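A sketch of the fixed error path with simplified types and names (the
real code in unix-socket.c uses a context struct and strbufs):

    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    struct sockaddr_ctx {
        char *orig_dir; /* saved cwd we must chdir() back into later */
    };

    static int enter_socket_parent_dir(struct sockaddr_ctx *ctx,
                                       const char *cwd, const char *parent)
    {
        ctx->orig_dir = strdup(cwd);
        if (!ctx->orig_dir)
            return -1;
        if (chdir(parent) < 0) {
            free(ctx->orig_dir); /* previously leaked on this error path */
            ctx->orig_dir = NULL;
            return -1;
        }
        return 0;
    }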
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 29 Jan 2025 22:05:09 +0000 (14:05 -0800)]
Merge branch 'ja/doc-commit-markup-updates'
Doc updates.
* ja/doc-commit-markup-updates:
doc: migrate git-commit manpage secondary files to new format
doc: convert git commit config to new format
doc: make more direct explanations in git commit options
doc: the mode param of -u of git commit is optional
doc: apply new documentation guidelines to git commit
Junio C Hamano [Wed, 29 Jan 2025 22:05:08 +0000 (14:05 -0800)]
Merge branch 'ds/path-walk-1'
Introduce a new API to visit objects in batches based on a common
path, or by type.
* ds/path-walk-1:
path-walk: drop redundant parse_tree() call
path-walk: reorder object visits
path-walk: mark trees and blobs as UNINTERESTING
path-walk: visit tags and cached objects
path-walk: allow consumer to specify object types
t6601: add helper for testing path-walk API
test-lib-functions: add test_cmp_sorted
path-walk: introduce an object walk by path
We don't yet have any test coverage for the new zlib-ng backend as part
of our CI. Add it by installing zlib-ng in Alpine Linux, which causes
Meson to pick it up automatically.
Note that we are somewhat limited with regards to where we run that job:
Debian-based distributions don't have zlib-ng in their repositories,
Fedora has it but doesn't run tests, and Alma Linux doesn't have the
package either. Alpine Linux does have it available and is running our
test suite, which is why it was picked.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch over the "linux-musl" job to use Meson instead of Makefiles. This
is done due to multiple reasons:
- It simplifies our CI infrastructure a bit as we don't have to
manually specify a couple of build options anymore.
- It verifies that Meson detects and sets those build options
automatically.
- It makes it easier for us to wire up a new CI job using zlib-ng as
backend.
One platform compatibility setting that Meson cannot easily detect
automatically is the `GIT_TEST_UTF8_LOCALE` variable used in tests. Wire up a build
option for it, which we set via a new "MESONFLAGS" environment variable.
Note that we also drop the CC variable, which is set to "gcc". We
already default to GCC when CC is unset in "ci/lib.sh", so this is not
needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The zlib-ng library is a hard fork of the old and venerable zlib
library. It describes itself as a zlib replacement with optimizations for
"next generation" systems. As such, it contains several implementations
of central algorithms using, for example, SSE2, AVX2 and other vectorized
CPU intrinsics that supposedly speed up inflating and deflating data.
And indeed, compiling Git against zlib-ng leads to a significant speedup
when reading objects. The following benchmark uses git-cat-file(1) with
`--batch --batch-all-objects` in the Git repository:
Benchmark 1: zlib
Time (mean ± σ): 52.085 s ± 0.141 s [User: 51.500 s, System: 0.456 s]
Range (min … max): 52.004 s … 52.335 s 5 runs
Benchmark 2: zlib-ng
Time (mean ± σ): 40.324 s ± 0.134 s [User: 39.731 s, System: 0.490 s]
Range (min … max): 40.135 s … 40.484 s 5 runs
Summary
zlib-ng ran
1.29 ± 0.01 times faster than zlib
So we're looking at a ~25% speedup compared to zlib. This is of course
an extreme example, as it makes us read through all objects in the
repository. But regardless, it should be possible to see some sort of
speedup in most commands that end up accessing the object database.
The zlib-ng library provides a compatibility layer that makes it a
proper drop-in replacement for zlib: nothing needs to change in the
build system to support it. Unfortunately though, this mode isn't easy
to use on most systems because distributions do not allow you to install
zlib-ng in that way, as that would mean that the zlib library would be
globally replaced. Instead, many distributions provide a package that
installs zlib-ng without the compatibility layer. This version does
provide effectively the same APIs as zlib does, but all of the symbols
are prefixed with `zng_` to avoid symbol collisions.
Implement a new build option that allows us to link against zlib-ng
directly. If set, we redefine zlib symbols so that we use the `zng_`
prefixed versions thereof provided by that library. Like this, it
becomes possible to install both zlib and zlib-ng (without the compat
layer) and then pick whichever library one wants to link against for
Git.
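A sketch of the symbol mapping this build option enables; the macro
list in the real compat header is longer and the guard name shown here
is illustrative:

    #ifdef HAVE_ZLIB_NG /* hypothetical build-option define */
    #  include <zlib-ng.h>
    #  define z_stream     zng_stream
    #  define inflateInit  zng_inflateInit
    #  define inflate      zng_inflate
    #  define inflateEnd   zng_inflateEnd
    #  define deflateInit  zng_deflateInit
    #  define deflate      zng_deflate
    #  define deflateEnd   zng_deflateEnd
    #else
    #  include <zlib.h>
    #endif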
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-zlib: cast away potential constness of `next_in` pointer
The `struct git_zstream::next_in` variable points to the input data and
is used in combination with `struct z_stream::next_in`. While that
latter field is not marked as a constant in zlib, it is marked as such
in zlib-ng. This causes a couple of compiler errors when we try to
assign these fields to one another due to mismatching constness.
Fix the issue by casting away the potential constness of `next_in`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
compat/zlib: provide stubs for `deflateSetHeader()`
The function `deflateSetHeader()` has been introduced with zlib v1.2.2.1,
so we don't use it when linking against an older version of it. Refactor
the code to instead provide a central stub via "compat/zlib.h" so that
we can adapt it based on whether or not we use zlib-ng in a subsequent
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
compat/zlib: provide `deflateBound()` shim centrally
The `deflateBound()` function has only been introduced with zlib 1.2.0.
When linking against a zlib version older than that we thus provide our
own compatibility shim. Move this shim into "compat/zlib.h" so that we
can adapt it based on whether or not we use zlib-ng in a subsequent
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
We include "compat/zlib.h" in "git-compat-util.h", which is
unnecessarily broad given that we only have a small handful of files
that use the zlib library. Move the header into "git-zlib.h" instead and
adapt users of zlib to include that header.
One exception is the reftable library, as we don't want to use the
Git-specific wrapper of zlib there, so we include "compat/zlib.h"
instead. Furthermore, we move the include into "reftable/system.h" so
that users of the library other than Git can wire up zlib themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new "compat/zlib-compat.h" header that we include instead of
including <zlib.h> directly. This will allow us to wire up zlib-ng as an
alternative backend for zlib compression in a subsequent commit.
Note that we cannot just call the file "compat/zlib.h", as that may
otherwise cause us to include that file instead of <zlib.h>.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before including <zlib.h> we explicitly define `z_const` to an empty
value. This has the effect that the `z_const` macro in "zconf.h" itself
will remain empty instead of being defined as `const`, which effectively
adapts a couple of APIs so that their parameters are not marked as being
constants.
It is dubious though whether this is something we actually want: not
marking a parameter as a constant doesn't make it any less constant than
it was. The define was added via 07564773c2 (compat: auto-detect if zlib
has uncompress2(), 2022-01-24), where it was seemingly carried over from
our internal compatibility shim for `uncompress2()` that was removed in
the preceding commit. The commit message doesn't mention why we carry
over the define and make it public, either, and I cannot think of any
reason why we would want to have it.
Drop the define.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our compat library has an implementation of zlib's `uncompress2()`
function that gets used when linking against an old version of zlib
that doesn't yet have it. The last user of `uncompress2()` got removed
in 15a60b747e (reftable/block: open-code call to `uncompress2()`,
2024-04-08), so the compatibility code is not required anymore. Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 28 Jan 2025 21:02:24 +0000 (13:02 -0800)]
Merge branch 'ps/reftable-sign-compare'
The reftable/ library code has been made -Wsign-compare clean.
* ps/reftable-sign-compare:
reftable: address trivial -Wsign-compare warnings
reftable/blocksource: adjust `read_block()` to return `ssize_t`
reftable/blocksource: adjust type of the block length
reftable/block: adjust type of the restart length
reftable/block: adapt header and footer size to return a `size_t`
reftable/basics: adjust `hash_size()` to return `uint32_t`
reftable/basics: adjust `common_prefix_size()` to return `size_t`
reftable/record: handle overflows when decoding varints
reftable/record: drop unused `print` function pointer
meson: stop disabling -Wsign-compare
Junio C Hamano [Tue, 28 Jan 2025 21:02:22 +0000 (13:02 -0800)]
Merge branch 'sk/unit-tests'
Move a few more unit tests to the clar test framework.
* sk/unit-tests:
t/unit-tests: convert reftable tree test to use clar test framework
t/unit-tests: adapt priority queue test to use clar test framework
t/unit-tests: convert mem-pool test to use clar test framework
t/unit-tests: handle dashes in test suite filenames
Junio C Hamano [Tue, 28 Jan 2025 21:02:22 +0000 (13:02 -0800)]
Merge branch 'jc/show-usage-help'
The help text from "git $cmd -h" appear on the standard output for
some $cmd and the standard error for others. The built-in commands
have been fixed to show them on the standard output consistently.
* jc/show-usage-help:
builtin: send usage() help text to standard output
oddballs: send usage() help text to standard output
builtins: send usage_with_options() help text to standard output
usage: add show_usage_if_asked()
parse-options: add show_usage_with_options_if_asked()
t0012: optionally check that "-h" output goes to stdout
Karthik Nayak [Mon, 27 Jan 2025 09:44:08 +0000 (10:44 +0100)]
refs/reftable: fix uninitialized memory access of `max_index`
When migrating reflogs between reference backends, maintaining the
original order of the reflog entries is crucial. To achieve this, an
`index` field is stored within the `ref_update` struct that encodes the
relative order of reflog entries. This field is used by the reftable
backend as update index for the respective reflog entries to maintain
that ordering.
These update indices must be respected when writing table headers, which
encode the minimum and maximum update index of contained records in the
header and footer. This logic was added in commit bc67b4ab5f (reftable:
write correct max_update_index to header, 2025-01-15), which started to
use `reftable_writer_set_limits()` to propagate the minimum and maximum
update index of all records contained in a ref transaction.
However, we only set the maximum update index for the first transaction
argument, even though there can be multiple such arguments. This is the
case when we write to multiple stacks in a single transaction, e.g. when
updating references in two different worktrees at once. Consequently,
the update indices for all but the first argument remain uninitialized,
which may cause undefined behaviour.
Fix this by moving the assignment of the maximum update index in
`reftable_be_transaction_finish()` inside the loop, which ensures that
all elements of the array are correctly initialized.
Furthermore, initialize the `max_index` field to 0 when queueing a new
transaction argument. This is not strictly necessary, as all elements of
`write_transaction_table_arg.max_index` are now assigned correctly.
However, this initialization is added for consistency and to safeguard
against potential future changes that might inadvertently introduce
uninitialized memory access.
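A sketch of the shape of the fix with simplified types (the real code
assigns `max_index` on each `write_transaction_table_arg` inside the
loop of `reftable_be_transaction_finish()`):

    #include <stddef.h>
    #include <stdint.h>

    struct queued_update { uint64_t index; };

    struct write_arg {
        struct queued_update *updates;
        size_t nr_updates;
        uint64_t max_index; /* must be initialized for every argument */
    };

    static void assign_max_indices(struct write_arg *args, size_t nr_args)
    {
        for (size_t i = 0; i < nr_args; i++) {
            uint64_t max = 0;
            for (size_t j = 0; j < args[i].nr_updates; j++)
                if (args[i].updates[j].index > max)
                    max = args[i].updates[j].index;
            args[i].max_index = max; /* previously only args[0] was set */
        }
    }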
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Already when introduced in c7a8a16239 (Add bundle transport,
2007-09-10), the `bundle` transport had a bug where it would open a file
descriptor to the bundle file and then close it _twice_: First, the file
descriptor (`data->fd`) is passed to `unbundle()`, which would use it as
the `stdin` of the `index-pack` process, which as a consequence would
close it via `start_command()`. However, `data->fd` would still hold the
numerical value of the file descriptor, and `close_bundle()` would see
that and happily close it again.
This seems not to have caused too many problems in almost two decades,
but I encountered a situation today where it _does_ cause problems: In
i686 variants of Git for Windows, it seems that file descriptors are
reused quickly after they have been closed.
In the particular scenario I faced, `git fetch <bundle> <ref>` gets the
same file descriptor value when opening the bundle file and importing
its embedded packfile (which implicitly closes the file descriptor) and
then when opening a pack file in `fetch_and_consume_refs()` while
looking up an object's header.
Later on, after the bundle has been imported (and the `close_bundle()`
function erroneously closes the file descriptor that has _already_ been
closed when using it as `stdin` for `git index-pack`), the same file
descriptor value has now been reused via `use_pack()`. Now, when either
the recursive fetch (which defaults to "on", unfortunately) or a
commit-graph update needs to `mmap()` the packfile, it fails due to a
now-invalid file descriptor that _should_ point to the pack file but
doesn't anymore.
To fix that, let's invalidate `data->fd` after calling `unbundle()`.
That way, `close_bundle()` does not close a file descriptor that may
have been reused for something different. While at it, document that
`unbundle()` closes the file descriptor, and ensure that it also does
that when failing to verify the bundle.
Luckily, this bug does not affect the bundle URI feature, it only
affects the `git fetch <bundle>` code path.
Note that this patch does not _completely_ clarify who is responsible
for closing that file descriptor, as `run_command()` may fail _without_
closing `cmd->in`. Addressing this issue thoroughly, however, would
require a rather thorough re-design of the `start_command()` and
`finish_command()` functionality to make it a lot less murky who is
responsible for what file descriptors.
At least this here patch is relatively easy to reason about, and
addresses a hard failure (`fatal: mmap: could not determine filesize`)
at the expense of leaking a file descriptor under very rare
circumstances in which `git fetch` would error out anyway.
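A toy illustration of the fix, with stand-ins for the transport data
and for `unbundle()` (which hands the descriptor to git-index-pack and
thereby closes it):

    #include <unistd.h>

    struct bundle_data {
        int fd; /* bundle file descriptor, or -1 once consumed */
    };

    /* stand-in for unbundle(): the descriptor becomes the child's stdin
     * and is closed by start_command() */
    static int unbundle_stub(int fd)
    {
        close(fd);
        return 0;
    }

    static int fetch_from_bundle(struct bundle_data *data)
    {
        int ret = unbundle_stub(data->fd);
        data->fd = -1; /* the fix: forget the stale value right away */
        return ret;
    }

    static void close_bundle(struct bundle_data *data)
    {
        if (data->fd >= 0)
            close(data->fd); /* no longer closes a reused descriptor */
        data->fd = -1;
    }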
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote: announce removal of "branches/" and "remotes/"
Back when Git was in its infancy, remotes were configured via separate
files in "branches/" (back in 2005). This mechanism was replaced later
that year with the "remotes/" directory. Both mechanisms have eventually
been replaced by config-based remotes, and it is very unlikely that
anybody still uses these directories to configure their remotes.
Both of these directories have been marked as deprecated, one in 2005
and the other one in 2011. Follow through with the deprecation and
finally announce the removal of these features in Git 3.0.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: with a small tweak to the help message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "instaweb" bound only to local IP address without "--local" and
to all addresses with "--local", which was the other way around, when
using Python's http.server class, which has been corrected.
* ak/instaweb-python-port-binding-fix:
instaweb: fix ip binding for the python http.server
The extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") was involved.
* en/object-name-with-funny-refname-fix:
object-name: be more strict in parsing describe-like output
object-name: fix resolution of object names containing curly braces
Taylor Blau [Thu, 23 Jan 2025 17:34:39 +0000 (12:34 -0500)]
csum-file: introduce hashfile_checkpoint_init()
In 106140a99f (builtin/fast-import: fix segfault with unsafe SHA1
backend, 2024-12-30) and 9218c0bfe1 (bulk-checkin: fix segfault with
unsafe SHA1 backend, 2024-12-30), we observed the effects of failing to
initialize a hashfile_checkpoint with the same hash function
implementation as is used by the hashfile it is used to checkpoint.
While both 106140a99f and 9218c0bfe1 work around the immediate crash,
changing the hash function implementation within the hashfile API to,
for example, the non-unsafe variant would re-introduce the crash. This
is a result of the tight coupling between initializing hashfiles and
hashfile_checkpoints.
Introduce and use a new function which ensures that both parts of a
hashfile and hashfile_checkpoint pair use the same hash function
implementation to avoid such crashes.
A few things worth noting:
- In the change to builtin/fast-import.c::stream_blob(), we can see
that by removing the explicit reference to
'the_hash_algo->unsafe_init_fn()', we are hardened against the
hashfile API changing away from the_hash_algo (or its unsafe
variant) in the future.
- The bulk-checkin code no longer needs to explicitly zero-initialize
the hashfile_checkpoint, since it is now done as a result of calling
'hashfile_checkpoint_init()'.
- Also in the bulk-checkin code, we add an additional call to
prepare_to_stream() outside of the main loop in order to initialize
'state->f' so we know which hash function implementation to use when
calling 'hashfile_checkpoint_init()'.
This is OK, since subsequent 'prepare_to_stream()' calls are noops.
However, we only need to call 'prepare_to_stream()' when we have the
HASH_WRITE_OBJECT bit set in our flags. Without that bit, calling
'prepare_to_stream()' does not assign 'state->f', so we have nothing
to initialize.
- Other uses of the 'checkpoint' in 'deflate_blob_to_pack()' are
appropriately guarded.
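A usage sketch of the new initializer as described above; the
surrounding calls follow the existing csum-file API, but treat the
exact signatures as illustrative rather than authoritative:

    struct hashfile *f = hashfd(fd, "pack");
    struct hashfile_checkpoint checkpoint;

    hashfile_checkpoint_init(f, &checkpoint); /* adopts f's hash implementation */
    hashfile_checkpoint(f, &checkpoint);      /* record offset and hash state */
    /* ... stream object data into f ... */
    hashfile_truncate(f, &checkpoint);        /* roll back to the checkpoint on error */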
Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 Jan 2025 17:34:36 +0000 (12:34 -0500)]
t/helper/test-hash.c: use unsafe_hash_algo()
Remove a series of conditionals within the shared cmd_hash_impl() helper
that powers the 'sha1' and 'sha1-unsafe' helpers.
Instead, replace them with a single conditional that transforms the
specified hash algorithm into its unsafe variant. Then all subsequent
calls can directly use whatever function it wants to call without having
to decide between the safe and unsafe variants.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 Jan 2025 17:34:29 +0000 (12:34 -0500)]
hash.h: introduce `unsafe_hash_algo()`
In 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants,
2024-09-26), we introduced "unsafe" variants of the SHA-1 hashing
functions by introducing new functions like "unsafe_init_fn()" and so
on.
This approach has a major shortcoming that callers must remember to
consistently use one variant or the other. Failing to consistently use
(or not use) the unsafe variants can lead to crashes at best, or subtle
memory corruption issues at worst.
In the hashfile API, this isn't difficult to achieve, but verifying that
all callers consistently use the unsafe variants is somewhat of a chore
given how spread out all of the callers are. In the sha1 and sha1-unsafe
test helpers, all of the calls to various hash functions are guarded by
an "if (unsafe)" conditional, which is repetitive and cumbersome.
Address these issues by introducing a new pattern whereby one
'git_hash_algo' can return a pointer to another 'git_hash_algo' that
represents the unsafe version of itself. So instead of having something
like:
    if (unsafe) {
        the_hash_algo->unsafe_init_fn(...);
        the_hash_algo->unsafe_update_fn(...);
        the_hash_algo->unsafe_final_fn(...);
    } else {
        the_hash_algo->init_fn(...);
        the_hash_algo->update_fn(...);
        the_hash_algo->final_fn(...);
    }
we can instead write:
    struct git_hash_algo *algop = the_hash_algo;
    if (unsafe)
        algop = unsafe_hash_algo(algop);
This removes the existing shortcoming by no longer forcing the caller to
"remember" which variant of the hash functions it wants to call, only to
hold onto a 'struct git_hash_algo' pointer that is initialized once.
Similarly, while there currently is still a way to "mix" safe and unsafe
functions, this too will go away after subsequent commits remove all
direct calls to the unsafe_ variants.
Note that hash_algo_by_ptr() needs an adjustment to allow passing in the
unsafe variant of a hash function. All other query functions on the
hash_algos array will continue to return the safe variants of any
function.
Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 Jan 2025 17:34:26 +0000 (12:34 -0500)]
csum-file.c: extract algop from hashfile_checksum_valid()
Perform a similar transformation as in the previous commit, but focused
instead on hashfile_checksum_valid(). This function does not work with a
hashfile structure itself, and instead validates the raw contents of a
file written using the hashfile API.
We'll want to be prepared for a similar change to this function in the
future, so prepare ourselves for that by extracting 'the_hash_algo' into
its own field for use within this function.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 Jan 2025 17:34:23 +0000 (12:34 -0500)]
csum-file: store the hash algorithm as a struct field
Throughout the hashfile API, we rely on a reference to 'the_hash_algo',
and call its _unsafe function variants directly.
Prepare for a future change where we may use a different 'git_hash_algo'
pointer (instead of just relying on 'the_hash_algo' throughout) by
making the 'git_hash_algo' pointer a member of the 'hashfile' structure
itself.
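Roughly, the structural change amounts to the following (fields
abridged; not the verbatim csum-file.h definition):

    struct hashfile {
        int fd;
        const struct git_hash_algo *algop; /* new: hash implementation to use */
        git_hash_ctx ctx;
        /* ... buffering and naming fields elided ... */
    };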
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Thu, 23 Jan 2025 17:34:19 +0000 (12:34 -0500)]
t/helper/test-tool: implement sha1-unsafe helper
With the new "unsafe" SHA-1 build knob, it is convenient to have a
test-tool that can exercise Git's unsafe SHA-1 wrappers for testing,
similar to 't/helper/test-tool sha1'.
Implement that helper by altering the implementation of that test-tool
(in cmd_hash_impl(), which is generic and parameterized over different
hash functions) to conditionally run the unsafe variants of the chosen
hash function, and expose the new behavior via a new 'sha1-unsafe' test
helper.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adam Murray [Fri, 10 Jan 2025 07:28:20 +0000 (07:28 +0000)]
trace2: prevent segfault on config collection with valueless true
When TRACE2 analytics is enabled, a configuration variable set to
"valueless true" causes a segfault.
Steps to Reproduce
GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.*
git -c status.relativePaths version
Expected Result
git version 2.46.0
Actual Result
zsh: segmentation fault GIT_TRACE2=true
Add checks to prevent the segfault and instead show that the
variable was set without a value.
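A minimal sketch of the kind of guard that avoids the crash, using a
hypothetical emit helper (the actual fix is in trace2's config
collection code):

    #include <stdio.h>

    static void tr2_emit_config_param(const char *key, const char *value)
    {
        if (!value)
            value = "(true)"; /* valueless true: never pass NULL to "%s" */
        fprintf(stderr, "def_param: %s=%s\n", key, value);
    }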
Signed-off-by: Adam Murray <ad@canva.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Thu, 23 Jan 2025 11:29:44 +0000 (12:29 +0100)]
refs: fix creation of reflog entries for symrefs
The commit 297c09eabb (refs: allow multiple reflog entries for the
same refname, 2024-12-16) added logic to exit early in
`lock_ref_for_update()` after obtaining the required lock. This was
added as a performance optimization on a false assumption that no
further processing was required for reflog-only updates.
However the assumption was wrong. For a symref's reflog entry, the
update needs to be populated with the old_oid value, but the early
exit skipped this necessary step.
This caused a bug in Git 2.48 in the files backend where target
references of symrefs being updated would create a corrupted reflog
entry for the symref since the old_oid is not populated.
Everything the early exit skipped in the code path is necessary for
both regular and symbolic refs, so eliminate the mistaken
optimization, and also add a test to ensure that such an issue
doesn't arise in the future.
Reported-by: Nika Layzell <nika@thelayzells.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 23 Jan 2025 00:36:13 +0000 (19:36 -0500)]
path-walk: drop redundant parse_tree() call
This call to parse_tree() was flagged by Coverity for ignoring the
return value. But if we look a little further up the function, we can
see that there is already a call to parse_tree_gently(), and we'll
return early if that fails. So by this point the tree will always be
parsed, and the call is redundant.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 22 Jan 2025 21:39:42 +0000 (13:39 -0800)]
Merge branch 'ps/build-meson-fixes' into ps/zlib-ng
* ps/build-meson-fixes:
ci: wire up Visual Studio build with Meson
ci: raise error when Meson generates warnings
meson: fix compilation with Visual Studio
meson: make the CSPRNG backend configurable
meson: wire up fuzzers
meson: wire up generation of distribution archive
meson: wire up development environments
meson: fix dependencies for generated headers
meson: populate project version via GIT-VERSION-GEN
GIT-VERSION-GEN: allow running without input and output files
GIT-VERSION-GEN: simplify computing the dirty marker
Add a new job to GitHub Actions and GitLab CI that builds and tests
Meson-based builds with Visual Studio.
A couple notes:
- While the build job is mandatory, the test job is marked as "manual"
on GitLab so that it doesn't run by default. We already have a bunch
of Windows-based jobs, and the computational overhead of running the
test suite a second time would simply be out of proportion.
The same isn't true for GitHub as I could not find a way to make a
subset of jobs manually triggered.
- We disable Perl. This is because we pick up Perl from Git for
Windows, which outputs different paths ("/c/" instead of "C:\") than
what we expect in our tests.
- We don't use the Git for Windows SDK. Instead, the build only
depends on Visual Studio, Meson and Git for Windows. All the other
dependencies like curl, pcre2 and zlib get pulled in and compiled
automatically by Meson and thus do not have to be provided by the
system.
- We open-code "ci/run-test-slice.sh". This is because we only have
direct access to PowerShell, so we manually implement the logic.
There is an upstream pull request for the Meson build system [1] to
implement test slicing in Meson directly.
- We don't process test artifacts for failed CI jobs. This is done to
keep down prerequisites to a minimum.
Meson prints warnings in several cases, like for example when using a
feature supported by the current version of Meson, but not yet supported
by the minimum required version as declared by the project. These
warnings will not cause the setup to fail by default, which makes it
quite easy to miss them.
Improve this by passing `--fatal-meson-warnings` to `meson setup` so
that our CI jobs will fail on warnings.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Visual Studio compiler defaults to C89 unless explicitly asked to
use a different version of the C standard. We don't specify any C
standard at all though in our Meson build, and consequently compiling
Git fails:
...\git\git-compat-util.h(14): fatal error C1189: #error: "Required C99 support is in a test phase. Please see git-compat-util.h for more details."
Fix the issue by specifying the project's C standard. Funny enough,
specifying C99 does not work because apparently, `__STDC_VERSION__` is
not getting defined in that version at all. Instead, we have to specify
C11 as the project's C standard, which is also done in our CMake build
instructions.
We don't want to generally enforce C11 though, as our requirements only
state that a C99 compiler is required. In fact, we don't even require
plain C99, but rather the GNU variant thereof.
Meson allows us to handle this case rather easily by specifying
"gnu99,c11", which will cause it to fall back to C11 in case GNU C99 is
unsupported. This feature has only been introduced with Meson 1.3.0
though, and we support 0.61.0 and newer. In case we use such an older
version, we fall back to requiring GNU C99 unconditionally. This
means that Windows essentially requires Meson 1.3.0 and newer when using
Visual Studio, but I doubt that this is ever going to be a real problem.
Tested-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The CSPRNG backend is not configurable in Meson and isn't quite
discoverable, either. Make it configurable and add the actual backend
used to the summary.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meson does not yet know to build our fuzzers. Introduce a new build
option "fuzzers" and wire up the fuzzers in case it is enabled. Adapt
our CI jobs so that they build the fuzzers by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meson knows to generate distribution archives via `meson dist`. In
addition to generating the archive itself, this target also knows to
compile and execute tests from that archive, which helps to ensure that
the result is an adequate drop-in replacement for the versioned project.
While this already works as-is, one omission is that we don't propagate
the commit that this is built from into the resulting archive. This can
be fixed though by adding a distribution script that propagates the
version into the "version" file, which GIT-VERSION-GEN knows to read if
present.
Use GIT-VERSION-GEN to populate that file. As the script is executed in
the build directory, not in the directory where we generate the archive,
we have to use a shell to resolve the "MESON_DIST_ROOT" environment
variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Meson build system is able to wire up development environments. The
intent is to make build artifacts of the project available. This is
typically used to export e.g. paths to linkable libraries, which isn't
all that interesting in our context given that we don't have an official
library interface.
But what we can use this mechanism for is to expose the built Git
executables as well as the build directory. This allows users to play
around with the built Git version in the devenv, and allows them to
execute our test scripts directly with the built distribution.
Wire up this feature, which can then be used via `meson devenv` in the
build directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generate a couple of headers from our documentation. These headers
are added to the libgit sources, but two of them aren't used by the
library, but instead by our builtins. This can cause parallel builds to
fail because the builtin object may be compiled before the header was
generated.
Fix the issue by adding both "config-list.h" and "hook-list.h" to the
list of builtin sources. While "command-list.h" is generated similarly,
it is used by "help.c" and thus part of the libgit sources indeed.
Reported-by: Evan Martin <evan.martin@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
meson: populate project version via GIT-VERSION-GEN
The Git version for Meson is currently wired up manually. It can thus
grow (and already has grown) stale quite easily, as having multiple
sources of truth is never a good idea. This issue is mostly of cosmetic
nature as we don't use the project version anywhere, and instead use the
GIT-VERSION-GEN script to propagate the correct version into our build.
But it is somewhat puzzling when `meson setup` announces that it is
building an old Git release.
There are a couple of alternatives for how to solve this:
- We can keep the version undefined, but this makes Meson output
"undefined" for the version, as well.
- We can use GIT-VERSION-GEN to generate the version for us. At the
point of configuring the project we haven't yet figured out host
details though, and thus we didn't yet set up the shell environment.
While not an issue for Unix-based systems, this would be an issue in
Windows, where the shell typically gets provided via Git for Windows
and thus requires some special setup.
- We can pull the default version out of GIT-VERSION-GEN and move it
into its own file. This likely requires some adjustments for scripts
that bump the version, but allows Meson to read the version from
that file trivially.
Pick the second option and use GIT-VERSION-GEN as it gives us the most
accurate version. In order to fix the bootstrapping issue on Windows
systems we simply set the version to 'unknown' in case no shell was
found. As the version is only of cosmetic value this isn't really much
of an issue.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
GIT-VERSION-GEN: allow running without input and output files
The GIT-VERSION-GEN script requires an input file containing formatting
directives to be replaced as well as an output file that will get
overwritten in case the file contents have changed. When computing the
project version for Meson we don't want to have either though:
- We only want to compute the version without anything else, but don't
have an input file that would match that exact format. While we
could of course introduce a new file just for that usecase, it feels
suboptimal to add another file every time we want to have a slightly
different format for versioned data.
- The computed version needs to be read from stdout so that Meson can
wire it up for the project.
Extend the script to handle both usecases by recognizing `--format=` as
alternative to providing an input path and by writing to stdout in case
no output file was given.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
GIT-VERSION-GEN: simplify computing the dirty marker
The GIT-VERSION-GEN script computes the version that Git is being built
from. When building from a commit with an unclean worktree it knows to
append "-dirty" to that version to indicate that there were custom
changes applied and that it isn't the exact same as that commit.
The dirtiness check is done manually via git-diff-index(1), which is
somewhat puzzling though: we already use git-describe(1) to compute the
version, which also knows to compute dirtiness via the "--dirty" flag.
But digging back in history explains why: the "-dirty" suffix was added
in 31e0b2ca81 (GIT 1.5.4.3, 2008-02-23), and git-describe(1) didn't yet
have support for "--dirty" back then.
Refactor the script to use git-describe(1). Despite being simpler, it
also results in a small speedup:
Benchmark 1: git describe --dirty --match "v[0-9]*"
Time (mean ± σ): 12.5 ms ± 0.3 ms [User: 6.3 ms, System: 8.8 ms]
Range (min … max): 12.0 ms … 13.5 ms 200 runs
Benchmark 2: git describe --match "v[0-9]*" HEAD && git update-index -q --refresh && git diff-index --name-only HEAD --
Time (mean ± σ): 17.9 ms ± 1.1 ms [User: 8.8 ms, System: 14.4 ms]
Range (min … max): 17.0 ms … 30.6 ms 148 runs
Summary
git describe --dirty --match "v[0-9]*" ran
1.43 ± 0.09 times faster than git describe --match "v[0-9]*" && git update-index -q --refresh && git diff-index --name-only HEAD --
While the speedup doesn't really matter on Unix-based systems, where
filesystem operations are typically fast, they do matter on Windows
where the commands take a couple hundred milliseconds. A quick and dirty
check on that system shows a speedup from ~800ms to ~400ms.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/pack-redundant: remove subcommand with breaking changes
The git-pack-redundant(1) subcommand has required
the "--i-still-use-this" option to do anything since 4406522b
(pack-redundant: escalate deprecation warning to an error,
2023-03-23), which appeared in Git 2.41 and was announced for
removal with 53a92c9552 (Documentation/BreakingChanges: announce
removal of git-pack-redundant(1), 2024-09-02). Stop compiling the
subcommand in case the `WITH_BREAKING_CHANGES` build flag is set.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "linux-gcc" job isn't all that interesting by itself and can be
considered more or less the "standard" job: it is running with a
reasonably up-to-date image and uses GCC as a compiler, both of which we
already cover in other jobs.
There is one exception though: we change the default branch to be "main"
instead of "master", so it is forging ahead a bit into the future to
make sure that this change does not cause havoc. So let's expand on this
a bit and also add the new "WITH_BREAKING_CHANGES" flag to the mix.
Rename the job to "linux-breaking-changes" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "linux-gcc-default" job is mostly doing the same as the "linux-gcc"
job, except for a couple of minor differences:
- We use an explicit GCC version instead of the default version
provided by the distribution. We have other jobs that test with
"gcc-8", making this distinction pointless.
- We don't set up the Python version explicitly, and instead use the
default Python version. Python 2 has been end-of-life for quite a
while now though, making this distinction less interesting.
- We set up the default branch name to be "main" in "linux-gcc". We
have other testcases that don't and also some that explicitly use
"master".
- We use "ubuntu:20.04" in one job and "ubuntu:latest" in another. We
already have a couple other jobs testing these respectively.
So overall, the job does not add much to our test coverage.
Drop the "linux-gcc-default" job and adapt "linux-gcc" to start using
the default GCC compiler, effectively merging those two jobs into one.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: wire up build option for deprecated features
With 57ec9254eb (docs: introduce document to announce breaking changes,
2024-06-14), we have introduced a new document that tracks upcoming
breaking changes in the Git project. In 2454970930 (BreakingChanges:
early adopter option, 2024-10-11) we have amended the document a bit to
mention that any introduced breaking changes must be accompanied by
logic that allows us to enable the breaking change at compile-time.
While we already have two breaking changes lined up, neither of them has
such a switch because they predate those instructions.
Introduce the proposed `WITH_BREAKING_CHANGES` preprocessor macro and
wire it up with both our Makefiles and Meson. This does not yet wire up
the build flag for existing deprecations.
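Once wired up, code can be guarded roughly like this (an illustrative
sketch, not an excerpt of an actual deprecation):

    #ifdef WITH_BREAKING_CHANGES
        /* breaking-changes build: the deprecated feature is gone */
        die(_("this feature has been removed"));
    #else
        warning(_("this feature is deprecated and will be removed in Git 3.0"));
    #endif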
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: fix migration of reflogs respecting "core.logAllRefUpdates"
In 246cebe320 (refs: add support for migrating reflogs, 2024-12-16) we
have added support to git-refs(1) to migrate reflogs between reference
backends. It was reported [1] though that not we don't migrate reflogs
for a subset of references, most importantly "refs/stash".
This issue is caused by us still honoring "core.logAllRefUpdates" when
trying to migrate reflogs: we do queue the updates, but depending on the
value of that config we may decide to just skip writing the reflog entry
altogether. And given that:
- The default for "core.logAllRefUpdates" is to only create reflogs
for branches, remotes, note refs and "HEAD"
- "refs/stash" is neither of these ref types.
We end up skipping the reflog creation for that particular reference.
Fix the bug by setting `REF_FORCE_CREATE_REFLOG`, which instructs the
ref backends to create the reflog entry regardless of the config or any
preexisting state.
Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Address the last couple of trivial -Wsign-compare warnings in the
reftable library and remove the DISABLE_SIGN_COMPARE_WARNINGS macro that
we have in "reftable/system.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/blocksource: adjust `read_block()` to return `ssize_t`
The `block_source_read_block()` function and its implementations return
an integer as a result that reflects either the number of bytes read, or
an error. As such its return type, a signed integer, isn't wrong, but it
doesn't give the reader a good hint what it actually returns.
Refactor the function to return an `ssize_t` instead, which is typical
for functions similar to read(3p) and should thus give readers a better
signal what they can expect as a result.
Adjust callers to better handle the returned value to avoid warnings
with -Wsign-compare. One of these callers is `reader_get_block()`, whose
return value is only ever used by its callers to figure out whether or
not the read was successful. So instead of bubbling up the `ssize_t`
there, too, we adapt it to only indicate success or errors.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/blocksource: adjust type of the block length
The block length is used to track the number of bytes available in a
specific block. As such, it is never set to a negative value, but is
still represented by a signed integer.
Adjust the type of the variable to be `size_t`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The restart length is tracked with a signed integer even though it
cannot ever be negative. Furthermore, it is effectively capped via the
MAX_RESTARTS variable.
Adjust the type of the variable to be `uint32_t`. While this type is
excessive given that MAX_RESTARTS fits into a `uint16_t`, other places
already use 32-bit integers for restarts, so this type is more
consistent with them.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: adapt header and footer size to return a `size_t`
The functions `header_size()` and `footer_size()` return a positive
value representing the size of the header and footer, respectively,
depending on the version of the reftable format. Similar to the
preceding commit, these functions use a signed integer return type
though, which is nonsensical given that there is no way for them to
return a negative value.
Adapt the functions to return a `size_t` instead to fix a couple of sign
comparison warnings.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/basics: adjust `hash_size()` to return `uint32_t`
The `hash_size()` function returns the number of bytes used by the hash
function. Weirdly enough though, it returns a signed integer for its
size even though the size obviously cannot ever be negative. The only
case where it could be negative is if the function returned an error
when asked for an unknown hash, but we assert(3p) instead.
Adjust the type of `hash_size()` to be `uint32_t` and adapt all places
that use signed integers for the hash size to follow suit. This also
allows us to get rid of a couple asserts that we had which verified that
the size was indeed positive, which further stresses the point that this
refactoring makes sense.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/basics: adjust `common_prefix_size()` to return `size_t`
The `common_prefix_size()` function computes the length of the common
prefix between two buffers. As such its return value will always be an
unsigned integer, as the length cannot be negative. Regardless of that,
the function returns a signed integer, which is nonsensical and causes a
couple of -Wsign-compare warnings all over the place.
Adjust the function to return a `size_t` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: handle overflows when decoding varints
The logic to decode varints isn't able to detect integer overflows: as
long as the buffer still has more data available, and as long as the
current byte has its 0x80 bit set, we'll continue to add up these values
to the result. This will eventually cause the `uint64_t` to overflow, at
which point we'll return an invalid result.
Refactor the function so that it is able to detect such overflows. The
implementation is basically copied from Git's own `decode_varint()`,
which already knows to handle overflows. The only adjustment is that we
also take into account the string view's length in order to not overrun
it. The reftable documentation explicitly notes that those two encoding
schemas are supposed to be the same:
Varint encoding
^^^^^^^^^^^^^^^
Varint encoding is identical to the ofs-delta encoding method used
within pack files.
Decoder works as follows:
....
val = buf[ptr] & 0x7f
while (buf[ptr] & 0x80) {
ptr++
val = ((val + 1) << 7) | (buf[ptr] & 0x7f)
}
....
While at it, refactor `put_var_int()` in the same way by copying over
the implementation of `encode_varint()`. While `put_var_int()` doesn't
have an issue with overflows, it generates warnings with -Wsign-compare.
The implementation of `encode_varint()` doesn't, is battle-tested and at
the same time way simpler than what we currently have.
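A sketch of an overflow-aware decoder in the spirit of Git's
`decode_varint()`, adapted to also respect the remaining buffer length
(names and the exact bounds check are illustrative):

    #include <stddef.h>
    #include <stdint.h>

    /* Returns the number of bytes consumed, or -1 on overflow or
     * truncated input. */
    static int decode_varint_bounded(const uint8_t *buf, size_t len, uint64_t *out)
    {
        size_t i = 0;
        uint64_t val;
        uint8_t c;

        if (!len)
            return -1;
        c = buf[i++];
        val = c & 0x7f;
        while (c & 0x80) {
            val += 1;
            if (!val || (val & (UINT64_C(0x7f) << 57)))
                return -1; /* the next shift would overflow */
            if (i >= len)
                return -1; /* truncated input */
            c = buf[i++];
            val = (val << 7) + (c & 0x7f);
        }
        *out = val;
        return (int)i;
    }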
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: drop unused `print` function pointer
In 42c424d69d (t/helper: inline printing of reftable records,
2024-08-22) we stopped using the `print` function of the reftable record
vtable and instead moved its implementation into the single user of it.
We didn't remove the function itself from the vtable though. Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4f9264b0cd (config.mak.dev: drop `-Wno-sign-compare`, 2024-12-06) we
have started an effort to make our codebase compile with -Wsign-compare.
But while we removed the -Wno-sign-compare flag from "config.mak.dev",
we didn't adjust the Meson build instructions in the same way.
Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e7fb2ca945 (builtin/blame: fix out-of-bounds write with blank
boundary commits, 2025-01-10), we have introduced two new tests that
expect a certain amount of padding. This padding is generated via
printf using the "%0.s" conversion specification. That directive is
ambiguous because it might be interpreted as field width (most shells)
or 0-padding flag for numeric fields (coreutils).
Fix this issue by using "%${N}s" instead, which is already being
used in other tests (i.e. t5300, t0450) and is unambiguous.
Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jan Palus <jpalus@fastmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>