shejialuo [Sun, 29 Jun 2025 04:28:41 +0000 (12:28 +0800)]
u-string-list: move "remove duplicates" test to "u-string-list.c"
We use "test-tool string-list remove_duplicates" to test the
"string_list_remove_duplicates" function. Now that we have introduced
the unit test, move the logic from the shell script to the C program to
improve test speed and readability.
As all the tests in the shell script have now been removed, delete
"t0063-string-list.sh" and update the "meson.build" file to align with
this change.
We can also remove "DISABLE_SIGN_COMPARE_WARNINGS", because the code
that needed it has already been deleted.
Unfortunately, we cannot remove "test-string-list.c" entirely, because
"p0071-sort.sh" still measures the performance of sorting a string list
by executing "test-tool string-list sort".
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:28:32 +0000 (12:28 +0800)]
u-string-list: move "filter string" test to "u-string-list.c"
We use "test-tool string-list filter" to test the "filter_string_list"
function. Now that we have introduced the unit test, move the logic
from the shell script to the C program to improve test speed and
readability.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:28:23 +0000 (12:28 +0800)]
u-string-list: move "test_split_in_place" to "u-string-list.c"
We use "test-tool string-list split_in_place" to test the
"string_list_split_in_place" function. Now that we have introduced the
unit test, move the logic from the shell script to the C program to
improve test speed and readability.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:28:14 +0000 (12:28 +0800)]
u-string-list: move "test_split" into "u-string-list.c"
We rely on the "test-tool string-list" command to test the
functionality of "string-list". However, now that we have introduced
the clar test framework, we should move the shell script tests into a C
program to improve speed and readability.
Create a new file "u-string-list.c" under "t/unit-tests", then update
the Makefile and "meson.build" to build the file. Start by moving
"test_split" into the unit test; the rest of the shell script will be
converted gradually.
To make it easy to create a `string_list` by simply specifying strings
in the function call, add the "t_vcreate_string_list_dup" function.
Then port the shell script tests to the C program and remove the unused
"test-tool" code and tests.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:28:06 +0000 (12:28 +0800)]
string-list: enable sign compare warnings check
In "add_entry", we call the "get_entry_index" function to get the
insertion position. However, as the return type of "get_entry_index" is
`int`, there is a sign compare warning when comparing the `index` with
`list->nr`, which is unsigned.
"get_entry_index" always returns a non-negative index. However, the
current binary search algorithm initializes "left" to "-1", which
necessitates the signed `int` return type.
The reason why we need to assign "left" to be "-1" is that in the
`while` loop, we increment "left" by 1 to determine whether the loop
should end. This design choice, while functional, forces us to use
signed arithmetic throughout the function.
To resolve this sign comparison issue, let's modify the binary search
algorithm with the following approach:
1. Initialize "left" to 0 instead of -1
2. Use `left < right` as the loop termination condition instead of
`left + 1 < right`
3. When searching the right part, set `left = middle + 1` instead of
`middle`
Then, we can delete "#define DISABLE_SIGN_COMPARE_WARNINGS" to enable
the sign compare warnings check for "string-list".
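A minimal sketch of the adjusted search, assuming a plain sorted array
of C strings. "find_entry_index" and its parameters are hypothetical
names used only for illustration; this is not the actual
"get_entry_index" code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the search described above: return the index
 * at which `key` is found in the sorted array `items`, or the index at
 * which it would have to be inserted. `*exact_match` reports which case
 * applies. Only unsigned arithmetic is used.
 */
static size_t find_entry_index(const char **items, size_t nr,
                               const char *key, int *exact_match)
{
    size_t left = 0, right = nr;    /* step 1: "left" starts at 0 */

    assert(exact_match);
    while (left < right) {          /* step 2: loop while left < right */
        size_t middle = left + (right - left) / 2;
        int cmp = strcmp(key, items[middle]);

        if (cmp < 0) {
            right = middle;
        } else if (cmp > 0) {
            left = middle + 1;      /* step 3: skip past "middle" */
        } else {
            *exact_match = 1;
            return middle;
        }
    }
    *exact_match = 0;
    return right;
}
```

Because "left" never has to hold -1, both the indices and the return
value can stay unsigned and be compared against an unsigned element
count without a warning.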
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:27:57 +0000 (12:27 +0800)]
string-list: return index directly when inserting an existing element
When inserting an existing element, "add_entry" converts the "index"
value to "-1-index" to tell the caller that this element is already in
the list. However, "string_list_insert" simply converts this back to
the original non-negative index without any further action.
This code path dates back to 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25), which created "path-list.c".
Let's return the index directly, as no caller of "add_entry" cares
whether the element was already in the list. In the future, if we want
"add_entry" to tell the caller, we can add an "int *exact_match"
parameter instead of encoding this in a negative index.
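To illustrate the old convention, here is a small sketch of the
"-1-index" encoding and the caller-side conversion back. The helper
names are hypothetical and do not exist in "string-list.c":

```c
#include <assert.h>

/* Hypothetical: encode "already present at index" as -1 - index. */
static int ex_encode_existing(int index)
{
    assert(index >= 0);
    return -1 - index;
}

/* Hypothetical: what a caller had to do to recover the real index. */
static int ex_decode_index(int value)
{
    return value < 0 ? -1 - value : value;
}
```

Since the only caller immediately undoes the encoding, returning the
plain index loses no information that is actually used.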
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Sun, 29 Jun 2025 04:27:49 +0000 (12:27 +0800)]
string-list: remove unused "insert_at" parameter from add_entry
In "add_entry", we accept an "insert_at" parameter which must be either
-1 (auto) or between 0 and `list->nr` inclusive. Any other value is
invalid. When a caller passes an invalid "insert_at" value, we do not
check the range before moving elements, which is bound to cause
trouble.
However, we only use "add_entry" in the "string_list_insert" function,
and we always pass "-1" for the "insert_at" parameter. So, we never use
this parameter to insert an element at a user-specified position.
This code path exists for historical reasons. We used to have another
function, "string_list_insert_at_index()", which used the extra
"insert_at" parameter. In f8c4ab611a (string_list: remove
string_list_insert_at_index() from its API, 2014-11-24), we removed
this function but did not clean up the whole code path.
Let's simply delete this parameter, as "strmap" is a better fit for
such functionality.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Han Young [Thu, 3 Jul 2025 07:45:02 +0000 (15:45 +0800)]
read-cache: report lock error when refreshing index
In repo_refresh_and_write_index() in read-cache.c, we return -1 to
indicate that writing the index to disk failed.
However, callers do not use this information. Commands such as stash
print "could not write index" and then exit, which does not help to
discover the exact problem.
We can let repo_hold_locked_index print the error message if locking
fails.
Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test that applying a new file creation patch with --intent-to-add to
an existing index does not modify the index outside adding the correct
intents-to-add, and that applying a patch with both modifications
and new file creations with --intent-to-add correctly only adds
intents-to-add to the index.
Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "apply only to files" mode (i.e., neither --index nor --cached
mode), the index should not be touched except to record intents to
add when --intent-to-add is on. Because having --intent-to-add on sets
update_index, to indicate that we may touch the index, we can't rely
only on that flag in create_file() (which is called to write both new
files and updated files) to decide whether to write an index entry;
if we did, we would write an index entry for every file being patched
(which would moreover be an intent-to-add entry despite not being a
new file, because we are going to turn on the CE_INTENT_TO_ADD flag
in add_index_entry() if we enter it here and ita_only is true).
To decide whether to touch the index, we need to check the
specific reason the index would be updated, rather than merely
their aggregate in the update_index flag. Because we have already
entered write_out_results() and are performing writes, we know that
state->apply is true. If state->check_index is additionally true, we
are in --index or --cached mode, which updates the index and should
always write, whereas if we are merely in ita_only mode we must only
write if the patch is a new file creation patch.
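The decision described above can be sketched as a small predicate. The
struct and function names are hypothetical; only the "check_index" and
"ita_only" flags are taken from the description:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the apply state relevant to this decision. */
struct ex_apply_state {
    bool check_index;   /* --index or --cached mode */
    bool ita_only;      /* --intent-to-add in "apply only to files" mode */
};

/* Should writing this file also write an index entry? */
static bool ex_should_write_index_entry(const struct ex_apply_state *state,
                                        bool is_new_file)
{
    assert(state);
    if (state->check_index)
        return true;    /* --index/--cached always update the index */
    /* intent-to-add entries are recorded only for newly created files */
    return state->ita_only && is_new_file;
}
```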
Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three main modes of operation for apply: applying only to the
worktree, applying to the worktree and index (--index), and applying
only to the index (--cached).
The --intent-to-add flag modifies the first of these modes, applying
only to the worktree, in a way which touches the index, because intents
to add are special index entries. However, since its introduction
in cff5dc09ed (apply: add --intent-to-add, 2018-05-26), it has not
worked correctly in any but the most trivial (empty repository)
cases, because the index is never read in (in apply, this is done in
read_apply_cache()) before writing to it.
This causes the operation to clobber the old, correct index with a
new empty-tree index before writing intent-to-add entries to this
empty index; the final result is that the index now records every
existing file in the repository as deleted, which is incorrect.
This error can be corrected by first reading the index. The
update_index flag is correctly set if ita_only is true, because this
flag causes the index to be updated. However, if we merely gate the
call to read_apply_cache() behind update_index, then it will not be
read when state->apply is false, even if it must be checked due to
being in --index or --cached mode. Therefore, we instead read the
index if it will be either checked or updated, because reading the
index is a prerequisite to either.
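A sketch of the gating logic, under the assumption (not spelled out in
this message) that the update_index flag is derived from the other
flags as shown. All names prefixed with "ex_" are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the apply state. */
struct ex_apply_mode {
    bool apply;         /* we are actually applying, not just checking */
    bool check_index;   /* --index or --cached */
    bool ita_only;      /* --intent-to-add without --index/--cached */
};

/* Assumed derivation: the index is updated only when really applying. */
static bool ex_update_index(const struct ex_apply_mode *m)
{
    return (m->check_index || m->ita_only) && m->apply;
}

/* Read the index if it will be either checked or updated. */
static bool ex_must_read_index(const struct ex_apply_mode *m)
{
    assert(m);
    return m->check_index || ex_update_index(m);
}
```

Gating on ex_update_index() alone would skip the read for a check-only
run in --index or --cached mode, which is the case the last paragraph
above warns about.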
Reported-by: Ryan Hodges <rhodges@cisco.com> Original-patch-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 3 Jul 2025 22:44:28 +0000 (18:44 -0400)]
setup_revisions(): turn on diffs for all-negative diff filter
When the user gives us a diff filter like --diff-filter=D, we need to do
a tree diff even if we're not planning to show the diff result itself,
in order to decide whether to show the commit at all. So there's an
explicit check of revs->diffopt.filter in setup_revisions(), and we set
revs->diff if any bits are set.
Originally that "filter" field covered both positive capital-letter
filters (like "D") and also negative lowercase filters (like "d"), so it
was sufficient for both cases. But later, 75408ca949 (diff-filter: be
more careful when looking for negative bits, 2022-01-28) split the
negative bits out into a "filter_not" field.
We eventually fold those into "filter", but not until diff_setup_done()
is called, which happens after our explicit check. As a result, a purely
negative filter like:
    git log --diff-filter=d
failed to turn on diffs at all. But rather than fail to filter by diff,
because the filter variable is eventually set, we mistakenly show no
commits at all, thinking that the empty diffs were cases where nothing
passed through the filter.
The smallest fix here is to just have our check look for any bits in
either "filter" or "filter_not". I suspect it would also be OK to
reorder the function a bit to call diff_setup_done() earlier, but that
risks violating some other subtle ordering dependency. So I went with
the simple and safe solution here.
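The shape of the fix can be sketched as follows. The struct is a
hypothetical stand-in holding only the two fields named above; it is
not the real diff options structure:

```c
#include <assert.h>

/* Hypothetical stand-in for the two filter bit fields. */
struct ex_diff_filter {
    unsigned filter;        /* positive bits, e.g. from "D" */
    unsigned filter_not;    /* negative bits, e.g. from "d" */
};

/* A tree diff is needed if any filter bit, positive or negative, is set. */
static int ex_needs_tree_diff(const struct ex_diff_filter *opt)
{
    assert(opt);
    return opt->filter || opt->filter_not;
}
```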
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup: use "reftable" format when experimental features are enabled
With the preceding commit we have announced the switch to the "reftable"
format in Git 3.0 for newly created repositories. The format is being
battle tested by GitLab and a couple of other developers, and except for
a small handful of issues exposed soon after it was merged, it has been
rock solid. Regardless of that, though, the test user base is still
comparatively small, which increases the risk that we miss critical
bugs.
Address this by enabling the reftable format when experimental features
are enabled. This should increase the test user base by some margin and
thus give us more input before making the format the default.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
BreakingChanges: announce switch to "reftable" format
The "reftable" format has come a long way and has matured nicely since
it was merged into Git via 57db2a094d5 (refs: introduce reftable
backend, 2024-02-07). It fixes longstanding issues that cannot be fixed
with the "files" format in a backwards-compatible way and performs
significantly better in many use cases.
Announce that we will switch to the "reftable" format in Git 3.0 for
newly created repositories and wire up the change, hidden behind the
WITH_BREAKING_CHANGES preprocessor define.
This switch is dependent on support in the larger Git ecosystem. Most
importantly, libraries like JGit, libgit2 and Gitoxide should support
the reftable backend so that we don't break all applications and tools
built on top of those libraries.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 2 Jul 2025 19:08:04 +0000 (12:08 -0700)]
Merge branch 'ag/imap-send-resurrection'
"git imap-send" has been broken for a long time, which has been
resurrected and then taught to talk OAuth2.0 etc.
* ag/imap-send-resurrection:
imap-send: fix minor mistakes in the logs
imap-send: display the destination mailbox when sending a message
imap-send: display port alongwith host when git credential is invoked
imap-send: add ability to list the available folders
imap-send: enable specifying the folder using the command line
imap-send: add PLAIN authentication method to OpenSSL
imap-send: add support for OAuth2.0 authentication
imap-send: gracefully fail if CRAM-MD5 authentication is requested without OpenSSL
imap-send: fix memory leak in case auth_cram_md5 fails
imap-send: fix bug causing cfg->folder being set to NULL
config.mak.uname: set NO_MEMMEM only for functional version
FreeBSD 6 introduced memmem(), but the implementation diverged
from what was standard everywhere else (including our "compat"
fallback).
FreeBSD 10.4 (which went EOL in 2018) corrected the functionality bugs
but kept a suboptimal implementation until FreeBSD 11.4 (the last
version of FreeBSD 11, which went EOL in September 2021).
Let's draw the line to require FreeBSD 12 or newer, which allows us
to drop the special casing of FreeBSD 4.x and rely on the platform
implementation of memmem() unconditionally for all versions that are
still being supported.
Suggested-by: Brad Smith <brad@comstyle.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Makefile has a 'style' rule to run 'git clang-format'. While Meson
intrinsically supports a 'clang-format' target, which can be run when
using the ninja backend by running 'ninja clang-format', this runs the
formatting on all existing files.
Our Meson build doesn't yet support a way to run 'git clang-format',
which runs the formatter between the working directory and commit
provided. Add a new 'style' target to Meson to mimic the target in the
Makefile.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clang-format: add 'RemoveBracesLLVM' to the main config
In 1b8f306612 (ci/style-check: add `RemoveBracesLLVM` in CI job,
2024-07-23) we added 'RemoveBracesLLVM' to the CI job of running the
clang formatter.
This rule checks and warns against using braces on simple
single-statement bodies of statements. Since we haven't had any issues
regarding this rule, we can now move it into the main clang-format
config and remove it from being CI exclusive.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When clang-format was introduced to the Git project in 6134de6ac1
(clang-format: outline the git project's coding style, 2017-08-14), the
'ColumnLimit' was set to 80. This is in line with our recommendation in
'Documentation/CodingGuidelines', which states:
    We try to keep to at most 80 characters per line.
However, while this is the recommended limit, it is not an enforced
limit. In some cases we do overflow this limit to prioritize
readability. Setting the 'ColumnLimit' also means that shorter lines are
joined simply because the result would still fit within 80 characters,
which is undesirable.
In the past, we tried to adjust the penalties around line wrapping, once
in 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29)
and another time in 5e9fa0f9fa (clang-format: re-adjust line break
penalties, 2024-10-18). While these settings help tweak the line break
penalties to be more in line with the requirements of the Git project,
using 'clang-format' still produces a lot of false positives.
So to make 'clang-format' more usable, set the 'ColumnLimit' to 0. This
means that line wrapping is no longer a concern of the formatter and
becomes something that the user needs to take care of. The previous
commit also added a more flexible guideline to the '.editorconfig',
setting a 'max_line_length' of 120 characters. This should provide some
guidance to users.
In the future, it would be nice to reinstate this limit with adequate
penalties which would follow our guidelines, but currently, it makes
more sense to have a working formatter which we can rely on and which
doesn't create too many false positives.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phil Hord [Wed, 2 Jul 2025 01:12:15 +0000 (18:12 -0700)]
clean up interface for refs_warn_dangling_symrefs
The refs_warn_dangling_symrefs interface is a bit fragile as it passes
in printf-formatting strings with expectations about the number of
arguments. This patch series made it worse by adding a 2nd positional
argument. But there are only two call sites, and they both use almost
identical display options.
Make this safer by moving the format strings into the function that uses
them to make it easier to see when the arguments don't match. Pass a
prefix string and a dry_run flag so the decision logic can be handled
where needed.
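A rough sketch of the safer shape, with the format strings living next
to the arguments they consume. The function name and the message
wording are illustrative assumptions, not the real interface or output:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical formatter: the caller passes a prefix and a dry_run flag
 * instead of a printf-style format string, so a mismatch between the
 * format and its arguments is visible at a glance.
 */
static int ex_format_dangling(char *buf, size_t len, const char *prefix,
                              int dry_run, const char *symref,
                              const char *target)
{
    assert(buf && prefix && symref && target);
    if (dry_run)
        return snprintf(buf, len,
                        "%s%s would become dangling after %s is deleted\n",
                        prefix, symref, target);
    return snprintf(buf, len,
                    "%s%s has become dangling after %s was deleted\n",
                    prefix, symref, target);
}
```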
Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phil Hord [Wed, 2 Jul 2025 01:12:13 +0000 (18:12 -0700)]
fetch-prune: optimize dangling-ref reporting
When pruning during `git fetch` we check each pruned ref against the
ref_store one at a time to decide whether to report it as dangling.
This causes every local ref to be scanned for each ref being pruned.
If there are N refs in the repo and M refs being pruned, this code is
O(M*N). However, `git remote prune` uses a very similar function that
is only O(N*log(M)).
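The faster approach can be sketched like this: sort the M pruned names
once, then do one binary search per ref. This is a simplified
illustration on plain string arrays with hypothetical names, not the
code in refs_warn_dangling_symrefs:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements of type "const char *". */
static int ex_cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Count how many of the N symref targets name a pruned ref.
 * `pruned` is sorted in place once; each target is then found with a
 * binary search, so the lookups cost O(N*log(M)) rather than O(M*N).
 */
static size_t ex_count_dangling(const char **targets, size_t n,
                                const char **pruned, size_t m)
{
    size_t i, dangling = 0;

    assert(pruned || !m);
    qsort(pruned, m, sizeof(*pruned), ex_cmp_str);
    for (i = 0; i < n; i++)
        if (bsearch(&targets[i], pruned, m, sizeof(*pruned), ex_cmp_str))
            dangling++;
    return dangling;
}
```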
Remove the wasteful ref scanning for each pruned ref and use the faster
version already available in refs_warn_dangling_symrefs. Change the
message to include the original refname since the message is no longer
printed immediately after the line that did just print the refname.
In a repo with 126,000 refs, where I was pruning 28,000 refs, this
code made about 3.6 billion calls to strcmp and consumed 410 seconds
of CPU. (Invariably in that time, my remote would time out and the
fetch would fail anyway.)
After this change, the same operation completes in under a second.
Signed-off-by: Phil Hord <phil.hord@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable SHA-256 by default in breaking changes mode
Our document on breaking changes indicates that we intend to default to
SHA-256 in Git 3.0. Since most people choose the default option, this
is an important security upgrade to our defaults.
To allow people to test this case, when WITH_BREAKING_CHANGES is set in
the configuration, build Git with SHA-256 as the default hash. Update
the testsuite to use the build options information to automatically
choose the right value.
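As a rough sketch of how a compile-time default could be selected, with
hypothetical "EX_" names standing in for the real constants (only
WITH_BREAKING_CHANGES is taken from the text above):

```c
#include <assert.h>

/* Hypothetical hash identifiers, for illustration only. */
enum ex_hash_algo { EX_HASH_SHA1 = 1, EX_HASH_SHA256 = 2 };

/* Pick the built-in default at compile time. */
#ifdef WITH_BREAKING_CHANGES
#define EX_HASH_DEFAULT EX_HASH_SHA256
#else
#define EX_HASH_DEFAULT EX_HASH_SHA1
#endif

/* Name of an algorithm, e.g. for a line of build-options output. */
static const char *ex_hash_name(enum ex_hash_algo algo)
{
    assert(algo == EX_HASH_SHA1 || algo == EX_HASH_SHA256);
    return algo == EX_HASH_SHA256 ? "sha256" : "sha1";
}
```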
Note that if the command substitution for GIT_TEST_BUILTIN_HASH fails,
so does the testsuite—and quite spectacularly at that. Thus, the case
where the Git binary is somehow subtly broken will not go undetected.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'd like users to be able to determine the hash algorithm that is the
builtin default in their version of Git. This is useful for
troubleshooting, especially when we decide to change the default. Add
an entry for the default hash in the output of git version
--build-options so that users can easily access that information and
include it in bug reports.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, the built-in default hash is always SHA-1, but that will
change in a future commit. Instead of assuming that operating outside
of a repository will always use SHA-1, look up the default hash
algorithm for operating outside of a repository using an appropriate
environment variable, which will always be correct.
Additionally, for operations outside of a repository, use the
DEFAULT_HASH_ALGORITHM prerequisite rather than SHA1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, the built-in default hash is always SHA-1, but that will
change in a future commit. Instead of assuming that operating outside
of a repository will always use SHA-1, provide constants for both
algorithms and then simply ask test_oid for the built-in hash instead,
which will always be correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, the built-in default hash is always SHA-1, but that will
change in a future commit. Instead of assuming that operating outside
of a repository will always use SHA-1, simply ask test_oid for the
built-in hash instead, which will always be correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: default to compile-time default hash if not set
Right now, the default compile-time hash is SHA-1. However, in the
future, this might change and it would be helpful to gracefully handle
this case in our testsuite.
To avoid making these assumptions, let's introduce a variable that
contains the built-in default hash and use it in our setup code as the
fallback value if no hash was explicitly set. For now, this is always
SHA-1, but in a future commit, we'll allow adjusting this and the
variable will be more useful.
To allow us to make our tests more robust, allow test_oid to take the
--hash=builtin option to specify this hash, whatever it is.
Additionally, add a DEFAULT_HASH_ALGORITHM prerequisite to check for the
compile-time hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup: use the default algorithm to initialize repo format
When we define a new repository format with REPOSITORY_FORMAT_INIT, we
always use GIT_HASH_SHA1, and this value ends up getting used as the
default value to initialize a repository if none of the command line,
environment, or config tell us to do otherwise.
Because we might not always want to use SHA-1 as the default, let's
instead specify the default hash algorithm constant so that we will use
whatever the specified default is.
However, we also need to continue to read older repositories. If we're
in a v0 repository or extensions.objectformat is not set, then we must
continue to default to the original hash algorithm: SHA-1. If an
algorithm is set explicitly, however, it will override the hash_algo
member of the repository_format struct and we'll get the right value.
Similarly, if the repository was initialized before Git 0.99.3, then it
may lack a core.repositoryformatversion key, and some repositories lack
a config file altogether. In both cases, format->version is -1 and we
need to assume that SHA-1 is in use.
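The fallback rules for reading an existing repository can be sketched
as below. The types and names are hypothetical; the logic only mirrors
the cases listed above:

```c
#include <assert.h>

enum ex_hash { EX_UNKNOWN, EX_SHA1, EX_SHA256 };

/* The legacy algorithm implied by old formats is always SHA-1. */
#define EX_LEGACY EX_SHA1

/* Hypothetical subset of what was read from an existing repository. */
struct ex_repo_format {
    int version;            /* -1 if core.repositoryformatversion is missing */
    enum ex_hash explicit;  /* EX_UNKNOWN if extensions.objectformat is unset */
};

/* Which hash does an existing repository use? */
static enum ex_hash ex_effective_hash(const struct ex_repo_format *fmt)
{
    assert(fmt);
    if (fmt->version >= 1 && fmt->explicit != EX_UNKNOWN)
        return fmt->explicit;   /* explicitly configured algorithm wins */
    return EX_LEGACY;           /* v0, missing version, or unset extension */
}
```

Note that the compile-time default plays no role here; it only matters
when a new repository is initialized or when operating outside of one.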
Because clear_repository_format reinitializes the struct
repository_format and therefore sets the hash_algo member to the default
(which could in the future not be SHA-1), we need to reset this member
explicitly. We know, however, that at the point we call
read_repository_format, we are actually reading an existing repository
and not initializing a new one or operating outside of a repository, so
we are not changing the default behavior back to SHA-1 if the default
algorithm is different.
It is potentially questionable that we ignore all repository
configuration if there is a config file but it doesn't have
core.repositoryformatversion set, in which case we reset all of the
configuration to the default. However, it is unclear what the right
thing to do instead with such an old repository is and a simple git init
will add the missing entry, so for now, we simply honor what the
existing code does and reset the value to the default, simply adding our
initialization to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a large variety of data formats and protocols where no hash
algorithm was defined and the default was assumed to always be SHA-1.
Instead of explicitly stating SHA-1, let's use the constant to represent
the legacy hash algorithm (which is still SHA-1) so that it's clear
for documentary purposes that it's a legacy fallback option and not an
intentional choice to use SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin: use default hash when outside a repository
We have some commands that can operate inside or outside a repository.
If we're operating outside a repository, we clearly cannot use the
repository's hash algorithm as a default since it doesn't exist, so
instead, let's pick the default instead of specifically SHA-1. Right
now this results in no functional change since the default is SHA-1, but
that may change in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hash: add a constant for the legacy hash algorithm
We have a variety of uses of GIT_HASH_SHA1 littered throughout our
code. Some of these really mean to represent specifically SHA-1, but
some actually represent the original hash algorithm used in Git which is
implied by older, legacy formats and protocols which do not contain hash
information. For instance, the bundle v1 and v2 formats do not contain
hash algorithm information, and thus SHA-1 is implied by the use of
these formats.
Add a constant for documentary purposes which indicates this value. It
will always be the same as SHA-1, since this is an essential part of
these formats, but its use indicates this particular reason and not any
other reason why SHA-1 might be used.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hash: add a constant for the default hash algorithm
Right now, SHA-1 is the default hash algorithm in Git. However, this
may change in the future.
We have many places in our code that use the SHA-1 constant to indicate
the default hash if none is specified, but it will end up being more
practical to specify this explicitly and clearly using a constant for
whatever the default hash algorithm is. Then, if we decide to change it
in the future, we can simply replace the constant representing the
default with a new value.
For these reasons, introduce GIT_HASH_DEFAULT to represent the default
hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to
match other functions related to the object database and our modern
coding guidelines. Furthermore though, the old name didn't really
describe very well what this function actually does, which is to walk
down any commit and tag objects until an object of the required type has
been found. This is generally referred to as "peeling", so the new name
should be way more descriptive.
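For readers unfamiliar with the term, "peeling" can be sketched on a toy
object model as follows. The types and the function are hypothetical
and far simpler than the real object database:

```c
#include <assert.h>
#include <stddef.h>

enum ex_type { EX_COMMIT, EX_TREE, EX_BLOB, EX_TAG };

/* Toy object: a tag points at the tagged object, a commit at its tree. */
struct ex_object {
    enum ex_type type;
    const struct ex_object *target;
};

/*
 * Walk down tag and commit objects until an object of type `want` is
 * found. Returns NULL if the chain ends without reaching that type.
 */
static const struct ex_object *ex_peel_to(const struct ex_object *obj,
                                          enum ex_type want)
{
    while (obj && obj->type != want) {
        if (obj->type != EX_TAG && obj->type != EX_COMMIT)
            return NULL;    /* trees and blobs cannot be peeled further */
        obj = obj->target;
    }
    return obj;
}
```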
No compatibility wrapper is introduced as the function is not used a lot
throughout our codebase.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename `oid_object_info()` to `odb_read_object_info()`, along with its
`_extended()` variant, to match other functions related to the object
database and our modern coding guidelines.
Introduce compatibility wrappers so that any in-flight topics will
continue to compile.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: trivial refactorings to get rid of `the_repository`
All of the external functions provided by the object database subsystem
don't depend on `the_repository` anymore, but some internal functions
still do. Refactor those cases by plumbing through the repository that
owns the object database.
This change allows us to get rid of the `USE_THE_REPOSITORY_VARIABLE`
preprocessor define.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling submodule sources
The "--recurse-submodules" flag for git-grep(1) allows users to grep for
a string across submodule boundaries. To make this work we add each
submodule's object sources to our own object database so that the
objects can be accessed directly.
The infrastructure for this depends on a global string list of submodule
paths. The caller is expected to call `add_submodule_odb_by_path()` for
each source and the object database will then eventually register all
submodule sources via `do_oid_object_info_extended()` in case it isn't
able to look up a specific object.
This reliance on global state is of course suboptimal with regard to
our libification efforts.
Refactor the logic so that the list of submodule sources is instead
tracked in the object database itself. This allows us to lose the
condition of `r == the_repository` before registering submodule sources
as we only ever add submodule sources to `the_repository` anyway. As
such, behaviour before and after this refactoring should always be the
same.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling the primary source
The functions `set_temporary_primary_odb()` and `restore_primary_odb()`
are responsible for managing a temporary primary source for the
database. Both of these functions implicitly rely on `the_repository`.
Refactor them to instead take an explicit object database parameter as
argument and adjust callers. Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` in `for_each()` functions
There are a couple of iterator-style functions that execute a callback
for each instance of a given set, all of which currently depend on
`the_repository`. Refactor them to instead take an object database as
parameter so that we can get rid of this dependency.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling alternates
The functions to manage alternates all depend on `the_repository`.
Refactor them to accept an object database as a parameter and adjust all
callers. The functions are renamed accordingly.
Note that right now the situation is still somewhat weird because we end
up using the object store path provided by the object store's repository
anyway. Consequently, we could have instead passed in a pointer to the
repository instead of passing in the pointer to the object store. This
will be addressed in subsequent commits though, where we will start to
use the path owned by the object store itself.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of our dependency on `the_repository` in `find_odb()` by passing
in the object database in which we want to search for the source and
adjusting all callers.
Rename the function to `odb_find_source()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent commits we'll get rid of our use of `the_repository` in
"odb.c" in favor of explicitly passing in a `struct object_database` or
a `struct odb_source`. In some cases though we'll need access to the
repository, for example to read a config value from it, but we don't
have a way to access the repository owning a specific object database.
Introduce parent pointers for `struct object_database` to its owning
repository as well as for `struct odb_source` to its owning object
database, which will allow us to adapt those use cases.
Note that this change requires us to pass through the object database to
`link_alt_odb_entry()` so that we can set up the parent pointers for any
source there. The callchain is adapted to pass through the object
database accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
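The parent pointers introduced by this commit can be pictured roughly as follows. This is a sketch under stated assumptions: the field names `odb` and `repo`, the `example_config_value` member and both helper functions are hypothetical and only illustrate how a source can reach its owning repository.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for Git's real structures. */
struct repository;
struct object_database;

struct odb_source {
	struct object_database *odb; /* parent pointer: owning object database */
	struct odb_source *next;
	const char *path;
};

struct object_database {
	struct repository *repo;     /* parent pointer: owning repository */
	struct odb_source *sources;
};

struct repository {
	struct object_database *objects;
	int example_config_value;    /* hypothetical per-repository setting */
};

/* Link a source into a database and record which database owns it. */
static void odb_add_source_sketch(struct object_database *odb,
				  struct odb_source *source)
{
	assert(odb && source);
	source->odb = odb;
	source->next = odb->sources;
	odb->sources = source;
}

/* Reach repository-level state starting from just a source. */
static int source_config_value_sketch(const struct odb_source *source)
{
	assert(source && source->odb && source->odb->repo);
	return source->odb->repo->example_config_value;
}
```

With both pointers in place, code that is handed only an `odb_source` can still read a repository setting, which is the use case the commit message mentions.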
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_source`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-store: rename `object_directory` to `odb_source`
The `object_directory` structure is used as an access point for a single
object directory like ".git/objects". While the structure isn't yet
fully self-contained, the intent is for it to eventually contain all
information required to access objects in one specific location.
While the name "object directory" is a good fit for now, this will
change over time as we continue with the agenda to make pluggable object
databases a thing. Eventually, objects may not be accessed via any kind
of directory at all anymore, but they could instead be backed by any
kind of durable storage mechanism. While it seems quite far-fetched for
now, it is thinkable that eventually this might even be some form of a
database, for example.
As such, the current name of this structure will become worse over time
as we evolve into the direction of pluggable ODBs. Immediate next steps
will start to carve out proper self-contained object directories, which
requires us to pass in these object directories as parameters. Based on
our modern naming schema this means that those functions should then be
named after their subsystem, which means that we would start to bake the
current name into the codebase more and more.
Let's preempt this by renaming the structure. There have been a couple
of alternatives that were discussed:
- `odb_backend` was discarded because it led to the association that
one object database has a single backend, but the model is that one
alternate has one backend. Furthermore, "backend" is more about the
actual backing implementation and less about the high-level concept.
- `odb_alternate` was discarded because it is a bit of a stretch to
also call the main object directory an "alternate".
Instead, pick `odb_source` as the new name. It makes it sufficiently
clear that there can be multiple sources and does not cause confusion
when mixed with the already-existing "alternate" terminology.
In the future, this change allows us to easily introduce for example a
`odb_files_source` and other format-specific implementations.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-store: rename `raw_object_store` to `object_database`
The `raw_object_store` structure is the central entry point for reading
and writing objects in a repository. The main purpose of this structure
is to manage object directories and provide an interface to access and
write objects in those object directories.
Right now, many of the functions associated with the raw object store
implicitly rely on `the_repository` to get access to its `objects`
pointer, which is the `raw_object_store`. As we want to generally get
rid of using `the_repository` across our codebase we will have to
convert this implicit dependency on this global variable into an
explicit parameter.
This conversion can be done by simply passing in an explicit pointer to
a repository and then using its `->objects` pointer. But there is a
second effort underway, which is to make the object subsystem more
self-contained so that we can eventually have pluggable object backends.
As such, passing in a repository wouldn't make a ton of sense, and the
goal is to convert the object store interfaces such that we always pass
in a reference to the `raw_object_store` instead.
This will expose the `raw_object_store` type to a lot more callers
though, which surfaces that this type is named somewhat awkwardly. The
"raw_" prefix makes readers wonder whether there is a non-raw variant of
the object store, but there isn't. Furthermore, we nowadays want to name
functions in a way that they can be clearly attributed to a specific
subsystem, but calling them e.g. `raw_object_store_has_object()` is just
too unwieldy, even when dropping the "raw_" prefix.
Instead, rename the structure to `object_database`. This term is already
used a lot throughout our codebase, and it cannot easily be mistaken for
"object directories", either. Furthermore, its acronym ODB is already
well-known and works well as part of a function's name, like for example
`odb_has_object()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Tue, 1 Jul 2025 05:32:09 +0000 (05:32 +0000)]
pack-bitmap: add load corrupt bitmap test
t5310 lacks a test to ensure git works correctly when commit bitmap
data is corrupted. So this patch adds a test helper in pack-bitmap.c to
list each commit bitmap position in the bitmap file, and a `load corrupt
bitmap` test case in t/t5310 to corrupt a commit bitmap before loading it.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Tue, 1 Jul 2025 05:32:08 +0000 (05:32 +0000)]
pack-bitmap: reword comments in test_bitmap_commits()
The comment in pack-bitmap.c:test_bitmap_commits() suggests that
we can avoid reading the commit table altogether. However, this
comment is misleading. The reason we load bitmap entries here is
because test_bitmap_commits() needs to print the commit IDs from the
bitmap, and we must read the bitmap entries to obtain those commit IDs.
So reword this comment.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Tue, 1 Jul 2025 05:32:07 +0000 (05:32 +0000)]
pack-bitmap: fix memory leak if load_bitmap() failed
After going through the "failed" label, load_bitmap() will return -1,
and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
will then call free_bitmap_index(). That function would normally free
each of the stored bitmap entries, but won't since load_bitmap() already
called kh_destroy_oid_map() and NULL'd the "bitmaps" pointer from within
its "failed" label. Thus if you got part of the way through loading
bitmap entries and then failed, you would leak all of the previous
entries that you were able to load successfully.
The solution is to remove the error handling code in load_bitmap(), because
its caller will always call free_bitmap_index() in case of an error.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
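The ownership rule behind this fix can be sketched as follows: the loader never tears down partial state on failure, and exactly one function releases it. All names here (`index_sketch`, `load_index_sketch()`, `free_index_sketch()`) are hypothetical, and a plain array stands in for the khash map used by the real code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified index; the real one uses a khash map. */
struct index_sketch {
	int **entries;
	size_t nr;
};

/* The single place that releases entries; safe on partial state. */
static void free_index_sketch(struct index_sketch *idx)
{
	size_t i;

	assert(idx);
	for (i = 0; i < idx->nr; i++)
		free(idx->entries[i]);
	free(idx->entries);
	idx->entries = NULL;
	idx->nr = 0;
}

/*
 * Load `want` entries into a zero-initialized index, simulating a
 * failure once `fail_at` entries have been loaded. On failure this
 * returns -1 WITHOUT tearing anything down, so that the caller's
 * free_index_sketch() can still see and free what was loaded so far.
 */
static int load_index_sketch(struct index_sketch *idx, size_t want,
			     size_t fail_at)
{
	assert(idx);
	idx->entries = calloc(want ? want : 1, sizeof(*idx->entries));
	if (!idx->entries)
		return -1;
	while (idx->nr < want) {
		if (idx->nr == fail_at)
			return -1; /* leave partial state for the caller */
		idx->entries[idx->nr] = malloc(sizeof(int));
		if (!idx->entries[idx->nr])
			return -1;
		idx->nr++;
	}
	return 0;
}
```

Had the loader freed only the container and cleared the pointer on failure, the later cleanup would have had nothing left to walk, which is the kind of leak the commit describes.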
Junio C Hamano [Tue, 1 Jul 2025 21:17:25 +0000 (14:17 -0700)]
send-pack: clean-up even when taking an early exit
Previous commit has plugged one leak in the normal code path, but
there is an early exit that leaves without releasing any resources
acquired in the function.
This option was introduced in a series of commits from fe3ccc7aab (Merge
branch 'ps/config-subcommands', 2024-05-15) and deprecated
`value-pattern`. But `value-pattern` is still used throughout the doc.
The deprecated modes have been quarantined in the “Deprecated Modes”
section. So let’s only use `--value=<pattern>` in the rest of the doc.
Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These options were introduced in a series of commits from fe3ccc7aab (Merge branch 'ps/config-subcommands', 2024-05-15).[1]
But they were not documented here.
Document this option and the negated form according to the current
convention.[2]
[1]: `--value` is a replacement for the `value-pattern`
positional argument
[2]: https://lore.kernel.org/git/xmqqcyct1mtq.fsf@gitster.g/
Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option was introduced in a series of commits from fe3ccc7aab (Merge
branch 'ps/config-subcommands', 2024-05-15). But two styles were used
for the value provided to the option:
These options were introduced in 4e513890008 (builtin/config:
introduce "get" subcommand, 2024-05-06) but not documented here.
Use the description from the source code.
Document this option and the negated form according to the current
convention.[1]
`--show-names` is also the default when `--get-regexp` is given. But
don’t mention it here since all the deprecated modes are quarantined in
the “Deprecated Modes” section.
FreeBSD 13.4 is no longer supported, and 13.5 will be the last
release from that series, so jump instead to 14.3 which should
be supported for another 10 months and will be at that point
the oldest supported release with the interim release of 15.
While at it, move some variables to the environment and make
sure to skip a git grep test that assumes glibc regex.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 30 Jun 2025 21:30:31 +0000 (14:30 -0700)]
Merge branch 'jc/merge-compact-summary'
"git merge/pull" has been taught the "--compact-summary" option to
use the compact-summary format, instead of diffstat, when showing
the summary of the incoming changes.
* jc/merge-compact-summary:
merge/pull: extend merge.stat configuration variable to cover --compact-summary
merge/pull: add the "--compact-summary" option
Junio C Hamano [Mon, 30 Jun 2025 21:30:30 +0000 (14:30 -0700)]
Merge branch 'bc/stash-export-import'
An interchange format for stash entries is defined, and subcommands
of "git stash" to import/export have been added.
* bc/stash-export-import:
builtin/stash: provide a way to import stashes from a ref
builtin/stash: provide a way to export stashes to a ref
builtin/stash: factor out revision parsing into a function
object-name: make get_oid quietly return an error
Junio C Hamano [Mon, 30 Jun 2025 21:30:30 +0000 (14:30 -0700)]
Merge branch 'jc/cocci-avoid-regexp-constraint'
Avoid regexp_constraint and instead use comparison_constraint when
listing functions to exclude from application of coccinelle rules,
as spatch can be built with different regexp engine X-<.
Aditya Garg [Mon, 30 Jun 2025 18:06:56 +0000 (18:06 +0000)]
docs: mention possible options for Proton Mail users
Proton Mail is a privacy-focused email service gaining popularity.
Unfortunately, it does not provide an SMTP server to send emails.
Proton Mail Bridge is an official solution for paid users, and for free
users, a client named git-protonmail is available. Mention the same in the
docs.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Mon, 30 Jun 2025 18:06:46 +0000 (18:06 +0000)]
docs: add a paragraph explaining the `sendmailCmd` option of sendemail
`sendmailCmd` is a configuration option in `git-send-email` that allows
users to send emails using an external application that supports
sendmail-like commands. This ability has been very useful to support
proprietary email APIs without modifying the `git-send-email` codebase.
It is also useful for users who prefer to use another SMTP client
instead of the SMTP perl library used by `git-send-email`.
This commit adds a paragraph to the documentation explaining this
option.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Mon, 30 Jun 2025 18:06:40 +0000 (18:06 +0000)]
docs: add an OAuth2.0 credential helper for AOL accounts
Yahoo and AOL both advertise that they support app passwords for third-party
applications. But generating app passwords for them has been broken and
unreliable for quite some time now. Yahoo already had an OAuth2.0 credential helper
added in the documentation, so I thought it would be a good idea to add one
for AOL accounts as well, which is more reliable and secure.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Mon, 30 Jun 2025 18:06:34 +0000 (18:06 +0000)]
docs: add outlookidfix config option to sendemail documentation
The documentation for command line option `--outlook-id-fix` is there in
the sendemail documentation, but the config option `sendemail.outlookidfix`
was missing. Add the same to the documentation.
While at it, also enclose the values `true` and `false` in backticks in
the documentation for `sendemail.mailmap`.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Mon, 30 Jun 2025 18:06:28 +0000 (18:06 +0000)]
docs: link OpenSSL's verify(1) manual page to know about -CAfile and -CApath options
The description of `--smtp-ssl-cert-path` in the git-send-email documentation
mentions consulting OpenSSL's verify(1) manual page for details about the
`-CAfile` and `-CApath` options. However, the way it was written was quite
confusing, and it didn't mention that OpenSSL's verify(1) is the manual page
to refer to.
Fix this by slightly rewording the description and also add a link to the
OpenSSL verify(1) manual page.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
daemon: correctly handle soft accept() errors in service_loop
Since df076bdbcc ([PATCH] GIT: Listen on IPv6 as well, if available.,
2005-07-23), the original error checking was included in an inner loop
unchanged, where its effect was different.
Instead of retrying, after an EINTR during accept() in the listening
socket, it will advance to the next one and try with that instead,
leaving the client waiting for another round.
Make sure to retry with the same listener socket that failed originally.
To avoid an unlikely busy loop, fall back to the old behaviour after a
couple of attempts.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
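The retry behaviour described above can be sketched as a bounded loop on a single listener. This is an illustration only: `MAX_SOFT_RETRIES`, `accept_with_retry_sketch()` and the set of errors treated as soft are assumptions of this sketch, and the accept call is passed in as a function pointer purely so the logic can be exercised without real sockets.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical limit; the commit only says "a couple of attempts". */
#define MAX_SOFT_RETRIES 3

typedef int (*accept_fn)(int listen_fd, void *data);

/* Errors treated as transient in this sketch. */
static int is_soft_error(int err)
{
	return err == EINTR || err == EAGAIN || err == ECONNABORTED;
}

/*
 * Try to accept on one listener, retrying on the SAME descriptor after
 * a soft error. Returns the new descriptor, or -1 once the retry
 * budget is exhausted or a hard error occurs (errno is left as set by
 * the last attempt), at which point the caller moves on.
 */
static int accept_with_retry_sketch(int listen_fd, accept_fn do_accept,
				    void *data)
{
	int attempt;

	assert(do_accept);
	for (attempt = 0; attempt < MAX_SOFT_RETRIES; attempt++) {
		int fd = do_accept(listen_fd, data);
		if (fd >= 0)
			return fd;
		if (!is_soft_error(errno))
			break;
	}
	return -1;
}

/* Test double: fail with EINTR `*remaining` times, then succeed. */
static int flaky_accept(int listen_fd, void *data)
{
	int *remaining = data;

	(void)listen_fd;
	if (*remaining > 0) {
		(*remaining)--;
		errno = EINTR;
		return -1;
	}
	return 7; /* made-up descriptor number */
}
```

Bounding the loop is what prevents the busy loop mentioned in the message: after the budget is used up, control returns to the outer loop over listeners.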
Jacob Keller [Fri, 27 Jun 2025 22:09:04 +0000 (15:09 -0700)]
send-pack: clean up extra_have oid array
Commit c8009635785e ("fetch-pack, send-pack: clean up shallow oid
array", 2024-09-25) cleaned up the shallow oid array in cmd_send_pack,
but didn't clean up extra_have, which is still leaked at program exit.
I suspect the particular tests in t5539 don't trigger any additions to
the extra_have array, which explains why the tests can pass leak free
despite this gap.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
daemon: remove unnecessary restriction for listener fd
Since df076bdbcc ([PATCH] GIT: Listen on IPv6 as well, if available.,
2005-07-23), any file descriptor assigned to a listening socket was
validated to be within the range to be used in an FDSET later.
6573faff34 (NO_IPV6 support for git daemon, 2005-09-28) moved to
using poll() instead of select(), which doesn't have that restriction,
so remove the original check.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
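As a small illustration of why the range check is unnecessary with poll(), the sketch below waits on a single descriptor without consulting FD_SETSIZE. The helper name `wait_readable_sketch()` is hypothetical and is not taken from daemon.c.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to `timeout_ms` for `fd` to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Unlike select(), no check against FD_SETSIZE is needed because
 * poll() takes the descriptor as a plain integer in an array.
 */
static int wait_readable_sketch(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int ret;

	assert(fd >= 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

With select(), a descriptor numbered at or above FD_SETSIZE cannot be stored in an fd_set, which is the limitation the removed check guarded against.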
Junio C Hamano [Wed, 25 Jun 2025 21:07:36 +0000 (14:07 -0700)]
Merge branch 'ps/maintenance-ref-lock'
"git maintenance" lacked the care "git gc" had to avoid holding
onto the repository lock for too long during packing refs, which
has been remedied.
* ps/maintenance-ref-lock:
builtin/maintenance: fix locking race when handling "gc" task
builtin/gc: avoid global state in `gc_before_repack()`
usage: allow dying without writing an error message
builtin/maintenance: fix locking race with refs and reflogs tasks
builtin/maintenance: split into foreground and background tasks
builtin/maintenance: fix typedef for function pointers
builtin/maintenance: extract function to run tasks
builtin/maintenance: stop modifying global array of tasks
builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
builtin/maintenance: centralize configuration of explicit tasks
builtin/gc: drop redundant local variable
builtin/gc: use designated field initializers for maintenance tasks
Junio C Hamano [Wed, 25 Jun 2025 21:07:35 +0000 (14:07 -0700)]
Merge branch 'jc/you-still-use-whatchanged'
"git whatchanged" that is longer to type than "git log --raw"
which is its modern rough equivalent outlived its usefulness
more than 10 years ago. Plan to deprecate and remove it.
* jc/you-still-use-whatchanged:
whatschanged: list it in BreakingChanges document
whatchanged: remove when built with WITH_BREAKING_CHANGES
whatchanged: require --i-still-use-this
tests: prepare for a world without whatchanged
doc: prepare for a world without whatchanged
you-still-use-that??: help deprecating commands for removal
Maxim Cournoyer [Wed, 25 Jun 2025 14:25:11 +0000 (23:25 +0900)]
contrib: better support symbolic port names in git-credential-netrc
To improve support for symbolic port names in netrc files, this
change does the following:
- Treat symbolic port names as ports, not protocols in git-credential-netrc
- Validate the SMTP server port provided to send-email
- Convert the above symbolic port names to their numerical values.
Before this change, it was not possible to have an SMTP server port set
to "smtps" in a netrc file (e.g. Emacs' ~/.authinfo.gpg), as it would
be registered as a protocol and break the match for a "smtp" protocol
host, as queried for by git-send-email.
Signed-off-by: Maxim Cournoyer <maxim@guixotic.coop> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Maxim Cournoyer [Wed, 25 Jun 2025 14:25:09 +0000 (23:25 +0900)]
contrib: use a more portable shebang for git-credential-netrc
While the installed scripts have their Perl shebang set to PERL_PATH,
it is nevertheless useful to be able to run the uninstalled script for
manual tests while developing. This change makes the shebang more
portable by having the perl command looked from PATH instead of from a
fixed location.
Signed-off-by: Maxim Cournoyer <maxim@guixotic.coop> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19)
we updated the 'git-receive-pack(1)' command to use batched reference
updates. One edge case which was missed during this implementation was
when a user pushes multiple branches such as:
Before using batched updates, the references would be applied
sequentially and hence no conflicts would arise. With batched updates,
while the first update applies, the second fails due to D/F conflict. A
similar issue was present in 'git-fetch(1)' and was fixed by separating
out reference pruning into a separate transaction in the commit 'fetch:
use batched reference updates'. Apply a similar mechanism for
'git-receive-pack(1)' and separate out reference deletions into its own
batch.
This means 'git-receive-pack(1)' will now use up to two transactions,
whereas before using batched updates it would use _at least_ two
transactions. So using batched updates is still the better option.
Add a test to validate this behavior.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
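The separation of deletions into their own batch can be sketched as a simple partitioning step. This is a hedged illustration: `cmd_sketch`, `split_batches_sketch()`, the ref names in the usage note and the choice to number the deletion batch first are assumptions of this sketch, not details confirmed by the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified command; not Git's `struct command`. */
struct cmd_sketch {
	const char *refname;
	int is_delete;
	int batch;      /* filled in below: 0 = deletions, 1 = the rest */
};

/*
 * Assign each command to one of up to two batches so that deletions
 * are kept apart from creations and updates. Returns the number of
 * batches that are actually needed (0, 1 or 2).
 */
static int split_batches_sketch(struct cmd_sketch *cmds, size_t nr)
{
	size_t i;
	int have_delete = 0, have_other = 0;

	assert(cmds || !nr);
	for (i = 0; i < nr; i++) {
		if (cmds[i].is_delete) {
			cmds[i].batch = 0;
			have_delete = 1;
		} else {
			cmds[i].batch = 1;
			have_other = 1;
		}
	}
	return have_delete + have_other;
}
```

For a made-up push that deletes `refs/heads/foo` and creates `refs/heads/foo/bar`, the two commands land in different batches, so the directory/file conflict cannot occur within a single transaction. A push without deletions still needs only one batch, matching the "up to two transactions" wording.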
Karthik Nayak [Fri, 20 Jun 2025 07:15:44 +0000 (09:15 +0200)]
refs/files: skip updates with errors in batched updates
The commit 23fc8e4f61 (refs: implement batch reference update support,
2025-04-08) introduced support for batched reference updates. This
allows users to batch updates together, while allowing some of the
updates to fail.
Under the hood, batched updates use the reference transaction mechanism.
Each update which fails is marked as such. Any failed updates must be
skipped over in the rest of the code, as they wouldn't apply any more.
In two of the loops within 'files_transaction_finish()' of the files
backend, the failed updates aren't skipped over, which can cause a
segfault. Add the missing skips and a test to validate the
same.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
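The kind of skip this commit adds can be sketched as below. The types and names (`update_sketch`, `finish_batch_sketch()`, the `rejected` and `lock` fields) are hypothetical and much simpler than the files backend; in particular, the idea that a rejected update has no valid `lock` is an assumption made to show why touching it would be unsafe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified update record; not Git's `struct ref_update`. */
struct update_sketch {
	const char *refname;
	int rejected;   /* non-zero once the update has failed */
	int applied;    /* set by the finishing loop below */
	void *lock;     /* assumed valid only for non-rejected updates */
};

/*
 * Finish a batch: every loop over the updates must skip rejected
 * entries, since in this sketch their per-update state (`lock`) was
 * never set up and must not be dereferenced.
 */
static size_t finish_batch_sketch(struct update_sketch *updates, size_t nr)
{
	size_t i, applied = 0;

	assert(updates || !nr);
	for (i = 0; i < nr; i++) {
		struct update_sketch *u = &updates[i];

		if (u->rejected)
			continue; /* the kind of skip the commit adds */
		assert(u->lock);
		u->applied = 1;
		applied++;
	}
	return applied;
}
```

The same guard has to appear in every loop that runs after updates may have been rejected; missing it in even one loop is enough to hit invalid state.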
Junio C Hamano [Tue, 24 Jun 2025 16:48:51 +0000 (09:48 -0700)]
Merge branch 'sa/multi-mailmap-fix'
When asking to apply mailmap to both author and committer field
while showing a commit object, the field that appears later was not
correctly parsed and replaced, which has been corrected.
* sa/multi-mailmap-fix:
cat-file: fix mailmap application for different author and committer
Junio C Hamano [Tue, 24 Jun 2025 16:48:47 +0000 (09:48 -0700)]
Merge branch 'ag/send-email-edit-threading-fix'
"git send-email" incremented its internal message counter when a
message was edited, which made logic that treats the first message
specially misbehave, which has been corrected.
* ag/send-email-edit-threading-fix:
send-email: show the new message id assigned by outlook in the logs
send-email: fix bug resulting in broken threads if a message is edited