Ramsay Jones [Tue, 15 Jul 2025 23:32:39 +0000 (00:32 +0100)]
po/meson.build: add missing 'ga' language code
Commit bf5ce434db ("l10n: Add full Irish translation (ga.po)", 2025-05-16)
added a new translation to git. In a make build, new 'po' files (ga.po
in this case) are added to the build automatically using a wildcard
pattern. In a meson build you have to add the language code ('ga') to a
list explicitly to have it included in the build. In order to include the
new translation in the meson build, add the 'ga' language code to the
list of translations in the 'po/meson.build' file.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ramsay Jones [Tue, 15 Jul 2025 23:32:38 +0000 (00:32 +0100)]
meson: fix installation when -Dlibexexdir is set
commit 837f637cf5 ("meson.build: correct setting of GIT_EXEC_PATH",
2025-05-19) corrected the GIT_EXEC_PATH build setting, but then forgot
to update the installation path for the library executables. This causes
a regression when attempting to execute commands, after installing to a
non-standard location (reported here[1]):
$ meson -Dprefix=/tmp/git -Dlibexecdir=libexec-different build
$ meson install
$ /tmp/git/bin/git --exec-path
/tmp/git/libexec-different
$ /tmp/git/bin/git daemon
git: 'daemon' is not a git command. See 'git --help'
In order to fix the issue, use the 'git_exec_path' variable (calculated
while processing -Dlibexecdir) as the 'install_dir' field during the
installation of the library executables.
Junio C Hamano [Tue, 15 Jul 2025 22:18:17 +0000 (15:18 -0700)]
Merge branch 'ps/object-store'
Code clean-up around object access API.
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
bswap.h: provide a built-in based version of bswap32/64 if possible
The compiler is in general able to recognize the endian shift and
replace it with an optimized opcode if possible. On certain
architectures such as RiscV or MIPS the situation can get complicated.
They don't provide an optimized opcode and masking the "higher" bits may
required loading a constant which needs shifting. This causes the
compiler to emit a lot of instructions for the operation.
The provided builtin directive on these architecture calls a function
which does the operation instead of emitting the code for operation.
Bring back the change from commit 6547d1c9 (bswap.h: add support for
built-in bswap functions, 2025-04-23). The bswap32/64 macro can now be
defined unconditionally so it won't regress on big endian architectures.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bswap.h: remove optimized x86 version of bswap32/64
On x86 the bswap32/64 macro is implemented based on the x86 opcode which
performs the required shifting in just one opcode.
The other CPUs fallback to the generic shifting as implemented by
default_swab32() and default_bswap64() if needed.
I've been looking at how good a compiler is at recognizing the default
shift and emitting an optimized operation:
- x86, arm64 msvc v19.20
default_swab32() optimized
default_bswap64() shifts
_byteswap_uint64() optimized
The ntohl and htonl macros are redefined because the provided macros were
not always optimal. Sometimes it was a function call, sometimes it was a
macro which did the shifting. Using the 'bswap' opcode on x86 provides
probably better performance than performing the shifting.
These macros are only overwritten on x86 if the "optimized" version is
available.
The ntohll and htonll macros are not available on every platform (at
least glibc does not provide them) which means they need to be defined
once the endianness of the system is determined.
In order to get a more symmetrical setup, redfine the macros once the
endianness of the system has been determined.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bswap.h: define GIT_LITTLE_ENDIAN on msvc as little endian
The Microsoft Visual C++ (MSVC) compiler (as of Visual Studio 2022
version 17.13.6) does not define __BYTE_ORDER__ and its C-library does
not define __BYTE_ORDER. The compiler is supported only on arm64 and x86
which are all little endian.
Define GIT_BYTE_ORDER on msvc as little endian to avoid further checks.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-lib: respect GIT_TEST_INSTALLED when querying default hash
$GIT_TEST_INSTALLED can be set to use an "installed" git instead of the
one from $GIT_BUILD_DIR. This is used by my company's internal test
infrastructure, and not using $GIT_TEST_INSTALLED when querying the
default hash meant that the tests were failing because the hash was
effectively set to the empty string (since git didn't execute).
In the two places we attempt to detect/execute git itself prior to
overriding everything and putting it in $PATH, use identical logic for
identifying the git binary to execute. This also has the effect of
including the $X suffix when querying the default hash, but that's not
strictly necessary. You don't need to specify .exe when running a binary
on Windows, just when testing whether it exists or not.
Signed-off-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As well as receiving the config key and value, config callbacks
also receive a "struct key_value_info" containing information about
the source of the key-value pair. Accessing the "path" field of
this struct from a callback passed to repo_config() results in a
use-after-free. This happens because repo_config() first populates a
configset by calling config_with_options() and then iterates over the
configset with the callback passed by the caller. When the configset
is constructed it takes a shallow copy of the "struct key_value_info"
for each config setting. This leads to the use-after-free as the
"path" member is freed before config_with_options() returns.
We could fix this by interning the "path" field as we do
for the "filename" field but the "path" field is not actually
needed. It is populated with a copy of the "path" field from "struct
config_source". That field was added in d14d42440d8 (config: disallow
relative include paths from blobs, 2014-02-19) to distinguish between
relative include directives in files and those in blobs. However,
since 1b8132d99d8 (i18n: config: unfold error messages marked for
translation, 2016-07-28) we can differentiate these by looking at the
"origin_type" field in "struct key_value_info". So let's remove the
"path" members from "struct config_source" and "struct key_value_info"
and instead use a combination of the "filename" and "origin_type"
fields to determine the absolute path of relative includes.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
midx: remove now-unused linked list of multi-pack indices
In the preceding commits we have migrated all users of the linked list
of multi-pack indices to instead use those stored in the object database
sources. Remove those now-unused pointers.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: stop using linked MIDX list in `get_all_packs()`
Refactor `get_all_packs()` so that we stop using the linked list of
multi-pack indices. Note that there is no need to explicitly prepare
alternates, and neither do we have to use `get_multi_pack_index()`,
because `prepare_packed_git()` already takes care of populating all data
structures for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: stop using linked MIDX list in `find_pack_entry()`
Refactor `find_pack_entry()` so that we stop using the linked list of
multi-pack indices. Note that there is no need to explicitly prepare
alternates, and neither do we have to use `get_multi_pack_index()`,
because `prepare_packed_git()` already takes care of populating all data
structures for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: refactor `get_multi_pack_index()` to work on sources
The function `get_multi_pack_index()` loads multi-pack indices via
`prepare_packed_git()` and then returns the linked list of multi-pack
indices that is stored in `struct object_database`. That list is in the
process of being removed though in favor of storing the MIDX as part of
the object database source it belongs to.
Refactor `get_multi_pack_index()` so that it returns the multi-pack
index for a single object source. Callers are now expected to call this
function for each source they are interested in. This requires them to
iterate through alternates, so we have to prepare alternate object
sources before doing so.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `close_midx()` we not only close the multi-pack index for
one object source, but instead we iterate through the whole linked list
of MIDXs to close all of them. This linked list is about to go away in
favor of using the new per-source pointer to its respective MIDX.
Refactor the function to iterate through sources instead.
Note that after this patch, there's a couple of callsites left that
continue to use `close_midx()` without iterating through all sources.
These are all cases where we don't care about the MIDX from other
sources though, so it's fine to keep them as-is.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile: refactor `prepare_packed_git_one()` to work on sources
In the preceding commit we refactored how we load multi-pack indices to
take a corresponding "source" as input. As part of this refactoring we
started to store a pointer to the MIDX in `struct odb_source` itself.
Refactor loading of packfiles in the same way: instead of passing in the
object directory, we now pass in the source from which we want to load
packfiles. This allows us to simplify the code because we don't have to
search for a corresponding MIDX anymore, but we can instead directly use
the MIDX that we have already prepared beforehand.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multi-pack indices are tracked via `struct multi_pack_index`. This data
structure is stored as a linked list inside `struct object_database`,
which is the global database that spans across all of the object
sources.
This layout causes two problems:
- Object databases consist of multiple object sources (e.g. one source
per alternate object directory), where each multi-pack index is
specific to one of those sources. Regardless of that though, the
MIDX is not tracked per source, but tracked globally for the whole
object database. This creates a mismatch between the on-disk layout
and how things are organized in the object database subsystems and
makes some parts, like figuring out whether a source has an MIDX,
quite awkward.
- Multi-pack indices are an implementation detail of how efficient
access for packfiles work. As such, they are neither relevant in the
context of loose objects, nor in a potential future where we have
pluggable backends.
Refactor `prepare_multi_pack_index_one()` so that it works on a specific
source, which allows us to easily store a pointer to the multi-pack
index inside of it. For now, this pointer exists next to the existing
linked list we have in the object database. Users will be adjusted in
subsequent patches to instead use the per-source pointers.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 15 Jul 2025 19:06:57 +0000 (12:06 -0700)]
Merge branch 'tb/midx-avoid-cruft-packs' into ps/object-store-midx
* tb/midx-avoid-cruft-packs:
repack: exclude cruft pack(s) from the MIDX where possible
pack-objects: introduce '--stdin-packs=follow'
pack-objects: swap 'show_{object,commit}_pack_hint'
pack-objects: fix typo in 'show_object_pack_hint()'
pack-objects: perform name-hash traversal for unpacked objects
pack-objects: declare 'rev_info' for '--stdin-packs' earlier
pack-objects: factor out handling '--stdin-packs'
pack-objects: limit scope in 'add_object_entry_from_pack()'
pack-objects: use standard option incompatibility functions
The `git-for-each-ref(1)` command is used to iterate over references
present in a repository. In large repositories with millions of
references, it would be optimal to paginate this output such that we
can start iteration from a given reference. This would avoid having to
iterate over all references from the beginning each time when paginating
through results.
The previous commit added 'seek' functionality to the reference
backends. Utilize this and expose a '--start-after' option in
'git-for-each-ref(1)'. When used, the reference iteration seeks to the
lexicographically next reference and iterates from there onward.
This enables efficient pagination workflows, where the calling script
can remember the last provided reference and use that as the starting
point for the next set of references:
git for-each-ref --count=100
git for-each-ref --count=100 --start-after=refs/heads/branch-100
git for-each-ref --count=100 --start-after=refs/heads/branch-200
Since the reference iterators only allow seeking to a specified marker
via the `ref_iterator_seek()`, we introduce a helper function
`start_ref_iterator_after()`, which seeks to next reference by simply
adding (char) 1 to the marker.
We must note that pagination always continues from the provided marker,
as such any concurrent reference updates lexicographically behind the
marker will not be output. Document the same.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 'ref-filter.c', there is an 'else' clause within `do_filter_refs()`.
This is unnecessary since the 'if' clause calls `die()`, which would
exit the program. So let's remove the unnecessary 'else' clause. This
improves readability since the indentation is also reduced and flow is
simpler.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: selectively set prefix in the seek functions
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.
However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.
Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.
In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.
Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `ref_iterator` is an internal structure to the 'refs/'
sub-directory, which allows iteration over refs. All reference iteration
is built on top of these iterators.
External clients of the 'refs' subsystem use the various
'refs_for_each...()' functions to iterate over refs. However since these
are wrapper functions, each combination of functionality requires a new
wrapper function. This is not feasible as the functions pile up with the
increase in requirements. Expose the internal reference iterator, so
advanced users can mix and match options as needed.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Tue, 15 Jul 2025 02:56:22 +0000 (10:56 +0800)]
bloom: optimize multiple pathspec items in revision
To enable optimize multiple pathspec items in revision traversal,
return 0 if all pathspec item is literal in forbid_bloom_filters().
Add for loops to initialize and check each pathspec item's bloom_keyvec
when optimization is possible.
Add new test cases in t/t4216-log-bloom.sh to ensure
- consistent results between the optimization for multiple pathspec
items using bloom filter and the case without bloom filter
optimization.
- does not use bloom filter if any pathspec item is not literal.
With these optimizations, we get some improvements for multi-pathspec runs
of 'git log'. First, in the Git repository we see these modest results:
Benchmark 1: old
Time (mean ± σ): 73.1 ms ± 2.9 ms
Range (min … max): 69.9 ms … 84.5 ms 42 runs
Benchmark 2: new
Time (mean ± σ): 55.1 ms ± 2.9 ms
Range (min … max): 51.1 ms … 61.2 ms 52 runs
Summary
'new' ran
1.33 ± 0.09 times faster than 'old'
But in a larger repo, such as the LLVM project repo below, we get even
better results:
Benchmark 1: old
Time (mean ± σ): 1.974 s ± 0.006 s
Range (min … max): 1.960 s … 1.983 s 10 runs
Benchmark 2: new
Time (mean ± σ): 262.9 ms ± 2.4 ms
Range (min … max): 257.7 ms … 266.2 ms 11 runs
Summary
'new' ran
7.51 ± 0.07 times faster than 'old'
Signed-off-by: Derrick Stolee <stolee@gmail.com>
[ly: rename convert_pathspec_to_filter() to convert_pathspec_to_bloom_keyvec()] Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 14 Jul 2025 18:19:29 +0000 (11:19 -0700)]
Merge branch 'rp/apply-intent-to-add-fix'
"git apply -N" should start from the current index and register
only new files, but it instead started from an empty index, which
has been corrected.
* rp/apply-intent-to-add-fix:
apply docs: clarify wording for --intent-to-add
t4140: test apply --intent-to-add interactions
apply: only write intents to add for new files
apply: read in the index in --intent-to-add mode
Junio C Hamano [Mon, 14 Jul 2025 18:19:28 +0000 (11:19 -0700)]
Merge branch 'sj/string-list'
Code and test clean-up around string-list API.
* sj/string-list:
u-string-list: move "remove duplicates" test to "u-string-list.c"
u-string-list: move "filter string" test to "u-string-list.c"
u-string-list: move "test_split_in_place" to "u-string-list.c"
u-string-list: move "test_split" into "u-string-list.c"
string-list: enable sign compare warnings check
string-list: return index directly when inserting an existing element
string-list: remove unused "insert_at" parameter from add_entry
string-list: fix sign compare warnings for loop iterator
Junio C Hamano [Mon, 14 Jul 2025 18:19:26 +0000 (11:19 -0700)]
Merge branch 'kn/clang-format-updates'
Update ".clang-format" and ".editorconfig" to match our style guide
a bit better.
* kn/clang-format-updates:
meson: add rule to run 'git clang-format'
clang-format: add 'RemoveBracesLLVM' to the main config
clang-format: set 'ColumnLimit' to 0
Junio C Hamano [Mon, 14 Jul 2025 18:19:25 +0000 (11:19 -0700)]
Merge branch 'kh/doc-config-subcommands'
Documentation updates.
* kh/doc-config-subcommands:
config: mention --url in the synopsis
config: use --value instead of value-pattern
config: document --[no-]value
config: use --value=<pattern> consistently
config: document --[no-]show-names
Junio C Hamano [Mon, 14 Jul 2025 18:19:25 +0000 (11:19 -0700)]
Merge branch 'mc/netrc-service-names'
"netrc" credential helper has been improved to understand textual
service names (like smtp) in addition to the numeric port numbers
(like 25).
* mc/netrc-service-names:
contrib: better support symbolic port names in git-credential-netrc
contrib: warn for invalid netrc file ports in git-credential-netrc
contrib: use a more portable shebang for git-credential-netrc
Junio C Hamano [Mon, 14 Jul 2025 18:19:24 +0000 (11:19 -0700)]
Merge branch 'ps/use-reftable-as-default-in-3.0'
The reftable ref backend has matured enough; Git 3.0 will make it
the default format in a newly created repositories by default.
* ps/use-reftable-as-default-in-3.0:
setup: use "reftable" format when experimental features are enabled
BreakingChanges: announce switch to "reftable" format
Junio C Hamano [Mon, 14 Jul 2025 18:19:23 +0000 (11:19 -0700)]
Merge branch 'ac/prune-wo-the-repository'
Some code paths in the "git prune" used to ignore passed in
repository object and used the_repository singleton instance
instead, which has been corrected.
* ac/prune-wo-the-repository:
builtin/prune: stop depending on 'the_repository'
repository: move 'repository_format_precious_objects' to repo scope
Lidong Yan [Sat, 12 Jul 2025 09:35:16 +0000 (17:35 +0800)]
revision: make helper for pathspec to bloom keyvec
When preparing to use bloom filters in a revision walk, Git populates a
boom_keyvec with an array of bloom keys for the components of a path.
Before we create the ability to map multiple pathspecs to multiple
bloom_keyvecs, extract the conversion from a pathspec to a bloom_keyvec
into its own helper method. This simplifies the state that persists in
prepare_to_use_bloom_filter() as well as makes the future change much
simpler.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Sat, 12 Jul 2025 09:35:15 +0000 (17:35 +0800)]
bloom: replace struct bloom_key * with struct bloom_keyvec
Previously, we stored bloom keys in a flat array and marked a commit
as NOT TREESAME if any key reported "definitely not changed".
To support multiple pathspec items, we now require that for each
pathspec item, there exists a bloom key reporting "definitely not
changed".
This "for every" condition makes a flat array insufficient, so we
introduce a new structure to group keys by a single pathspec item.
`struct bloom_keyvec` is introduced to replace `struct bloom_key *`
and `bloom_key_nr`. And because we want to support multiple pathspec
items, we added a bloom_keyvec * and a bloom_keyvec_nr field to
`struct rev_info` to represent an array of bloom_keyvecs. This commit
still optimize only one pathspec item, thus bloom_keyvec_nr can only
be 0 or 1.
New bloom_keyvec_* functions are added to create and destroy a keyvec.
bloom_filter_contains_vec() is added to check if all key in keyvec is
contained in a bloom filter.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Sat, 12 Jul 2025 09:35:14 +0000 (17:35 +0800)]
bloom: rename function operates on bloom_key
git code style requires that functions operating on a struct S
should be named in the form S_verb. However, the functions operating
on struct bloom_key do not follow this convention. Therefore,
fill_bloom_key() and clear_bloom_key() are renamed to bloom_key_fill()
and bloom_key_clear(), respectively.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Sat, 12 Jul 2025 09:35:13 +0000 (17:35 +0800)]
bloom: add test helper to return murmur3 hash
In bloom.h, murmur3_seeded_v2() is exported for the use of test murmur3
hash. To clarify that murmur3_seeded_v2() is exported solely for testing
purposes, a new helper function test_murmur3_seeded() was added instead
of exporting murmur3_seeded_v2() directly.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Takashi Iwai [Tue, 17 Jun 2025 05:59:54 +0000 (07:59 +0200)]
gitk: Add support of SHA256 repositories
This patch adds a basic support of SHA256 Git repository to Gitk, so
that Gitk can show and operate on both SHA1 and SHA256 repos
gracefully. Since SHA256 has a longer ID length (64 char) than SHA1
(40 char), many field widths are adjusted to fit with it.
A caveat is that the configuration of auto selection length is shared
between SHA1 and SHA256 repos. That is, once when this value is saved
and read, it's applied to both repo types, which may result in shorter
selection than the full SHA256 ID. We may introduce another
individual config for sha256 (actually I did write in the first
version), but for simplicity, the common config is used as of writing
this.
Many lines still refer "sha1" although they may point to both SHA1 and
SHA256. They are left untouched for making the changes simpler.
This patch is based on the early work by Rostislav Krasny:
https://patchwork.kernel.org/project/git/patch/pull.979.git.1623687519832.gitgitgadget@gmail.com
I refreshed, revised and extended to the latest state.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
As executing our test suite is notoriously slow on Windows we use matrix
jobs in our CI systems to slice up tests and run them via multiple jobs.
On Meson this is done with a comparatively complex PowerShell invocation
as Meson didn't yet have a native way to slice tests like this.
I have upstreamed a new `--slice` option [1] that addresses this use
case though, which has been merged and released with Meson 1.8. Both
GitLab and GitHub CI have Meson 1.8.2 available by now, so let's update
the jobs to use that new option.
gitglossary documents Git pathspecs. One type of pathspec is the "glob"
pathspec, prefixed with the magic word "glob".
Regarding glob pathspecs, gitglossary says, '"**/foo" matches file or
directory "foo" anywhere, the same as pattern "foo".' That last phrase
('the same as pattern "foo") is incorrect. "**/foo" and "foo" are not
equivalent. "**/foo" matches foo anywhere, but "foo" does not.
This change removes the incorrect phrase from the glob pathspec doc.
Signed-off-by: Russell Hanneken <rhanneken@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
daemon: use sigaction() to install child_handler()
Replace signal() with an equivalent invocation of sigaction(), but
make sure to NOT set SA_RESTART so the original code that expects
to be interrupted when children complete still works as designed.
This change has the added benefit of using BSD signal semantics reliably
and therefore not needing the rearming call in the signal handler.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future change will start using sigaction to setup a SIGCHLD signal
handler.
The current code uses signal(), which returns SIG_ERR (but doesn't
seem to set errno) so instruct sigaction() to do the same.
A new SA flag will be needed, so copy the one from Cygwin; note that
the sigaction() implementation that is provided won't use it, so
its value is otherwise irrelevant.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compiling Git fails on Amazon Linux 2 when using GCC 7.3.1 with the
following compiler error:
In file included from compat/posix.h:449:0,
from git-compat-util.h:26,
from daemon.c:3:
compat/../sane-ctype.h:29:60: error: expected expression before ']' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
compat/../sane-ctype.h:29:72: error: expected ')' before '!=' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
compat/../sane-ctype.h:29:60: error: expected expression before ']' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
... lots of similar lines ...
compat/../sane-ctype.h:45:50: error: expected declaration specifiers or '...' before numeric constant
#define toupper(x) sane_case((unsigned char)(x), 0)
^
/usr/include/ctype.h:142:12: error: expected identifier or '(' before 'int'
extern int isascii (int __c) __THROW;
^
compat/../sane-ctype.h:30:26: error: expected ')' before '&' token
#define isascii(x) (((x) & ~0x7f) == 0)
^
compat/../sane-ctype.h:30:35: error: expected ')' before '==' token
#define isascii(x) (((x) & ~0x7f) == 0)
^
In file included from /usr/include/features.h:423:0,
from /usr/include/unistd.h:25,
from compat/posix.h:90,
from git-compat-util.h:26,
from daemon.c:3:
compat/../sane-ctype.h:44:30: error: expected declaration specifiers or '...' before '(' token
#define tolower(x) sane_case((unsigned char)(x), 0x20)
^
compat/../sane-ctype.h:44:50: error: expected declaration specifiers or '...' before numeric constant
#define tolower(x) sane_case((unsigned char)(x), 0x20)
^
compat/../sane-ctype.h:45:30: error: expected declaration specifiers or '...' before '(' token
#define toupper(x) sane_case((unsigned char)(x), 0)
^
compat/../sane-ctype.h:45:50: error: expected declaration specifiers or '...' before numeric constant
#define toupper(x) sane_case((unsigned char)(x), 0)
^
This error bisect back to 75a044f748 (git-compat-util.h: split out
POSIX-emulating bits, 2025-02-18), where lots of bits got split out of
"git-compat-util.h" into a new "compat/posix.h" header.
The compiler error isn't immediately obvious, doubly so because the
actual errors are ~3x as long as the above snippet. But what happens
here is that we transitively include <ctype.h> after we have included
our own "sane-ctype.h" header. Consequently, the function declarations
that exist in <ctype.h> for isascii(3p) et al will be mangled by our
macros of the same type. The result is of course completely broken.
It's unclear why this issue only happens on Amazon Linux 2. My guess is
that it's either specific to the compiler version or specific to the
glibc version. We don't explicitly include <ctypes.h> anywhere, but it's
being transitively included. So chances are that later versions of the
toolchain reorganized their headers so that <ctypes.h> is not included
transitively anymore.
Fix the issue by explicitly including <ctype.h> in "sane-ctype.h". This
ensures that the header guards will be activated and that any subsequent
include of the same header will become a no-op. With this we can then
safely override the function declarations with our own macros.
Reported-by: Stan Hu <stanhu@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 9 Jul 2025 23:29:52 +0000 (16:29 -0700)]
Merge branch 'ps/object-store' into ps/object-file-wo-the-repository
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
fast-(import|export): improve on commit signature output format
A recent commit, d9cb0e6ff8 (fast-export, fast-import: add support for
signed-commits, 2025-03-10), added support for signed commits to
fast-export and fast-import.
When a signed commit is processed, fast-export can output either
"gpgsig sha1" or "gpgsig sha256" depending on whether the signed
commit uses the SHA-1 or SHA-256 Git object format.
However, this implementation has a number of limitations:
- the output format was not properly described in the documentation,
- the output format is not very informative as it doesn't even say
if the signature is an OpenPGP, an SSH, or an X509 signature,
- the implementation doesn't support having both one signature on
the SHA-1 object and one on the SHA-256 object.
Let's improve on these limitations by improving fast-export and
fast-import so that:
- all the signatures are exported,
- at most one signature on the SHA-1 object and one on the SHA-256
are imported,
- if there is more than one signature on the SHA-1 object or on
the SHA-256 object, fast-import emits a warning for each
additional signature,
- the output format is "gpgsig <git-hash-algo> <signature-format>",
where <git-hash-algo> is the Git object format as before, and
<signature-format> is the signature type ("openpgp", "x509",
"ssh" or "unknown"),
- the output is properly documented.
About the output format:
- <git-hash-algo> allows to know which representation of the commit
was signed (the SHA-1 or the SHA-256 version) which helps with
both signature verification and interoperability between repos
with different hash functions,
- <signature-format> helps tools that process the fast-export
stream, so they don't have to parse the ASCII armor to identify
the signature type.
It could be even better to be able to import more than one signature
on the SHA-1 object and on the SHA-256 object, but other parts of
Git don't handle that well for now, so this is left for future
improvements.
Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options: add precision handling for OPTION_COUNTUP
Similar to 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) support value variables of different sizes
for OPTION_COUNTUP. Do that by requiring their "precision" to be set,
casting their "value" pointer accordingly and checking whether the value
fits.
parse-options: add precision handling for OPTION_BITOP
Similar to 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) support value variables of different sizes
for OPTION_BITOP. Do that by requiring their "precision" to be set,
casting their "value" pointer accordingly and checking whether the value
fits.
Check if "devfal" fits into an integer variable with the given
"precision", but don't check "extra", as its value is only used to clear
bits, so cannot lead to an overflow. Not checking continues to allow
e.g., using -1 to clear all bits even if the value variable has a
narrower type than intptr_t.
parse-options: add precision handling for OPTION_NEGBIT
Similar to 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) support value variables of different sizes
for OPTION_NEGBIT. Do that by requiring their "precision" to be set,
casting their "value" pointer accordingly and checking whether the value
fits.
parse-options: add precision handling for OPTION_BIT
Similar to 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) support value variables of different sizes
for OPTION_BIT. Do that by requiring their "precision" to be set,
casting their "value" pointer accordingly and checking whether the value
fits.
parse-options: add precision handling for OPTION_SET_INT
Similar to 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) support value variables of different sizes
for OPTION_SET_INT. Do that by requiring their "precision" to be set,
casting their "value" pointer accordingly and checking whether the value
fits.
Factor out the casting code from the part of do_get_value() that handles
OPTION_INTEGER to avoid code duplication. We're going to use it in the
next patches as well.
parse-options: add precision handling for PARSE_OPT_CMDMODE
Build on 09705696f7 (parse-options: introduce precision handling for
`OPTION_INTEGER`, 2025-04-17) to support value variables of different
sizes for PARSE_OPT_CMDMODE options. Do that by requiring their
"precision" to be set and casting their "value" pointer accordingly.
Call the function that does the raw casting do_get_int_value() to
reserve the name get_int_value() for a more friendly wrapper we're
going to introduce in one of the next patches.
Junio C Hamano [Wed, 9 Jul 2025 15:29:08 +0000 (08:29 -0700)]
Merge branch 'ps/object-store' into ps/object-store-midx
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
In 4cba20fbdc6 (meson: prefer shell at "/bin/sh", 2025-04-25) we have
addressed an issue where the shell path embedded into Git was looked up
via PATH, which easily led to unportable shell paths other than the
usual "/bin/sh" location. The fix was to simply add '/bin' to the search
path explicitly, which made us prefer that directory over the PATH-based
lookup.
This fix causes issues on MINGW64 though, which uses Windows-style
paths. "/bin" is not an absolute Windows-style path, but Meson expects
the directories to be absolute. This leads to the following error:
meson.build:248:15: ERROR: Search directory /bin is not an absolute path.
Fix this by instead searching for both '/bin/sh' and 'sh', which also
causes us to prefer '/bin/sh' over a PATH-based lookup. Meson does
accept that path alright on MINGW64, even though it's not an absolute
Windows-style path, either.
Furthermore, this continues to work alright with cross-files, as well,
in case one wants to explicitly override the shell path:
The `manpage_target` variable isn't used at all, and the `manpage_path`
variable is only used in a single location. Remove the former variable
and inline the latter.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The summary of auto-detected features prints a boolean for every option
to tell the user whether or not the feature has been auto-enabled or
not. This summary can be improved though, as in some cases this boolean
is derived from a dependency. So if we pass in the dependency directly,
then Meson knows to both print a boolean and, if the dependency was
found, it also prints a version number.
Adapt the code accordingly and enable `bool_yn` so that actual booleans
are formatted similarly to dependencies. Before this change:
meson: stop printing 'https' option twice in our summaries
The value for the 'https' backend option is printed twice: once via the
summary of auto-detected features and once via our summary of backends.
Drop it from the former summary.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Python features are enabled we search both for a native and
non-native version of Python. This is wrong though: we don't use Python
in our build process, so there is no need to search for it in the first
place.
There is one location where we use the native version of Python, namely
when deciding whether or not we want to wire up git-p4(1). This check is
invalid though, as we shouldn't check for the build host to have Python,
but for the target host.
Fix this invalid check to use the non-native version of Python and stop
searching for a native version of Python altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
we may try to write to the same ref twice (once for each remote we're
fetching). There's also a more subtle version of this. If you have
remotes "outer/inner" and "outer", then the ref "inner/branch" on the
second remote will conflict with just "branch" on the former (they both
want to write to "refs/remotes/outer/inner/branch").
We probably don't want to forbid this kind of overlap completely. While
the results can be confusing, there are legitimate reasons to have
multiple refs write into the same namespace (e.g., if one is a "backup"
of the other that is rarely fetched from).
But it may be worth limiting the porcelain "git remote" command to avoid
this confusion. The example above cannot be done with "git remote",
because it always[1] matches the refspecs to the remote name, and you
can only have one instance of each remote name. But you can still
trigger the more subtle variant like this:
So let's detect that kind of name collision (in both directions) and
forbid it. You can still do whatever you like by manipulating the config
directly, but this should prevent the most obvious foot-gun.
[1] Almost always. With the --mirror option, the resulting refspec will
just write into "refs/*"; the remote name does not appear in the ref
namespace at all.
Our new "names must not overlap" rule is not necessary for that
case, but it seems reasonable to enforce it consistently. We already
require all remote names to be valid in the ref namespace, even
though we won't ever use them in that context for --mirror remotes.
Likewise, our new rule doesn't help with overlap here. Any two
mirror remotes will always overlap (in fact, any mirror remote along
with any other single one, since refs/remotes/ is a subset of the
mirrored refs). I'm not sure this is worth worrying about, but if it
is, we'd want an additional rule like "mirror remotes must be the
only remote".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 8 Jul 2025 22:49:19 +0000 (15:49 -0700)]
Merge branch 'kn/fetch-push-bulk-ref-update'
"git push" and "git fetch" are taught to update refs in batches to
gain performance.
* kn/fetch-push-bulk-ref-update:
receive-pack: handle reference deletions separately
refs/files: skip updates with errors in batched updates
receive-pack: use batched reference updates
send-pack: fix memory leak around duplicate refs
fetch: use batched reference updates
refs: add function to translate errors to strings
In a recent security release, 05e9cd64ee (config: quote values
containing CR character, 2025-05-19) added calls to `git config get`,
`git config set`, and `git config unset` which are not present on the
maint-2.43 branch.
These subcommands were added in the following commits, released in
git-2.46.0:
While Meson copes with it alright, it's still annoying to see these
errors on every test run.
The root cause of the broken format is a call to grep(1) that gets
executed outside of a test case, which has been added recently via 9fd38038b9c (t1006: update 'run_tests' to test generic object
specifiers, 2025-06-02). This call is done to determine whether a
subsequent test case is expected to succeed or fail, so it makes sense
to have it execute outside of a test case. But whenever we do that, we
must be extra careful to not generate any output that breaks the TAP
format.
Fix the issue by adding '-q' to the command so that it doesn't print
any matching lines.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/files: remove empty parent dirs when ref creation fails
When creating a new reference in the "files" backend we first create the
directory hierarchy for that reference, then create the lockfile for
that reference, and finally rename the lockfile into place. When the
transaction gets aborted we prune the lockfile, but we don't clean up
the directory hierarchy that we may have created for the lockfile.
In some egde cases this can lead to lots of empty directories being
cluttered in the ".git/refs" directory that really serve no purpose at
all. We know to prune such empty directories when packing refs, but that
only patches over the issue.
Improve this by removing empty parents when cleaning up still-locked
references in `files_transaction_cleanup()`. This function is also
called when preparing or committing the transaction, so this change also
helps when not explicitly aborting the transaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
docs/git-pack-refs: document heuristic used for packing loose refs
The `git pack-refs --auto` flag asks the ref backend to decide for
itself whether or not references need to be repacked. This is done to
ensure that we don't repack in cases where the backend is already in a
good-enough state, which is typically the case for the "reftable"
backend that performs auto-compaction on writes.
As such, we initially only had heuristics in place for the "reftable"
backend. The "files" backend didn't have any heuristics, so we'd repack
loose references every time `git pack-refs --auto` was executed. This
caused excessive repacking with that backend though, which is why we
eventually implemented a heuristic via c3459ae9ef2 (refs/files: use
heuristic to decide whether to repack with `--auto`, 2024-09-04).
The documentation for the `--auto` flag hasn't been updated accordingly
and still claims that we don't have any metrics for the "files" backend.
Update it to reflect the new reality.
Reported-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Tue, 8 Jul 2025 18:47:50 +0000 (14:47 -0400)]
Documentation/RelNotes: use .adoc extension for new security releases
When preparing the latest round of security fixes, we wrote release
notes in v2.43.7, and then successively merged those up through to the
various 'maint' branches.
However, the 2.49 release series is the first to have commit 1f010d6bdf
(doc: use .adoc extension for AsciiDoc files, 2025-01-20). This means
that we should have renamed the new-but-historical release notes from
*.txt to *.adoc during the merge into the 'maint-2.49' branch, but
neglected to do so.
Rename them accordingly to match the convention introduced by 1f010d6bdf. Since the release materials in question here were prepared
before v2.50.0 was tagged, the 'maint' track for that release series is
OK as is.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Tue, 8 Jul 2025 19:22:00 +0000 (21:22 +0200)]
Merge branch 'js/fix-open-exec-git'
This addresses CVE-2025-46835, Git GUI can create and overwrite a
user's files:
When a user clones an untrusted repository and is tricked into editing
a file located in a maliciously named directory in the repository, then
Git GUI can create and overwrite files for which the user has write
permission.
* js/fix-open-exec-git:
git-gui: sanitize 'exec' arguments: convert new 'cygpath' calls
git-gui: do not mistake command arguments as redirection operators
git-gui: introduce function git_redir for git calls with redirections
git-gui: pass redirections as separate argument to git_read
git-gui: pass redirections as separate argument to _open_stdout_stderr
git-gui: convert git_read*, git_write to be non-variadic
git-gui: use git_read in githook_read
git-gui: break out a separate function git_read_nice
git-gui: remove option --stderr from git_read
git-gui: sanitize 'exec' arguments: background
git-gui: sanitize 'exec' arguments: simple cases
git-gui: treat file names beginning with "|" as relative paths
git-gui: remove git config --list handling for git < 1.5.3
git-gui: remove HEAD detachment implementation for git < 1.5.3
git-gui: remove Tcl 8.4 workaround on 2>@1 redirection
Johannes Sixt [Tue, 8 Jul 2025 19:19:28 +0000 (21:19 +0200)]
Merge branch 'ml/replace-auto-execok'
This addresses CVE-2025-46334, Git GUI malicious command injection on
Windows.
A malicious repository can ship versions of sh.exe or typical textconv
filter programs such as astextplain. Due to the unfortunate design of
Tcl on Windows, the search path when looking for an executable always
includes the current directory. The mentioned programs are invoked when
the user selects "Git Bash" or "Browse Files" from the menu.
* ml/replace-auto-execok:
git-gui: override exec and open only on Windows
git-gui: sanitize $PATH on all platforms
git-gui: assure PATH has only absolute elements.
git-gui: cleanup git-bash menu item
git-gui: avoid auto_execok in do_windows_shortcut
git-gui: avoid auto_execok for git-bash menu item
git-gui: remove unused proc is_shellscript
git-gui: remove special treatment of Windows from open_cmd_pipe
git-gui: use only the configured shell
git-gui: make _shellpath usable on startup
git-gui: use [is_Windows], not bad _shellpath
git-gui: _which, only add .exe suffix if not present
Johannes Sixt [Tue, 8 Jul 2025 19:00:34 +0000 (21:00 +0200)]
Merge branch 'js/fix-open-exec'
This addresses CVE-2025-27613, Gitk can create and truncate a user's
files:
When a user clones an untrusted repository and runs gitk without
additional command arguments, files for which the user has write
permission can be created and truncated. The option "Support per-file
encoding" must have been enabled before in Gitk's Preferences. This
option is disabled by default.
The same happens when "Show origin of this line" is used in the main
window (regardless of whether "Support per-file encoding" is enabled or
not).
* js/fix-open-exec:
gitk: sanitize 'open' arguments: revisit recently updated 'open' calls
gitk: sanitize 'open' arguments: command pipeline
gitk: collect construction of blameargs into a single conditional
gitk: sanitize 'open' arguments: simple commands, readable and writable
gitk: sanitize 'open' arguments: simple commands with redirections
gitk: sanitize 'open' arguments: simple commands
gitk: sanitize 'exec' arguments: redirect to process
gitk: sanitize 'exec' arguments: redirections and background
gitk: sanitize 'exec' arguments: redirections
gitk: sanitize 'exec' arguments: 'eval exec'
gitk: sanitize 'exec' arguments: simple cases
gitk: have callers of diffcmd supply pipe symbol when necessary
gitk: treat file names beginning with "|" as relative paths
Johannes Sixt [Tue, 8 Jul 2025 18:46:24 +0000 (20:46 +0200)]
Merge branch 'ah/fix-open-with-stdin'
This addresses CVE-2025-27614, Arbitrary command execution with Gitk:
A Git repository can be crafted in such a way that with some social
engineering a user who has cloned the repository can be tricked into
running any script (e.g., Bourne shell, Perl, Python, ...) supplied by
the attacker by invoking `gitk filename`, where `filename` has a
particular structure. The script is run with the privileges of the user.
* ah/fix-open-with-stdin:
gitk: encode arguments correctly with "open"