When a repository is configured to have a compatibility hash algorithm
we keep track of object ID mappings for loose objects via the loose
object map. This map simply maps an object ID of the actual hash to the
object ID of the compatibility hash. This loose object map is an
inherent property of the loose files backend and thus of one specific
object source.
Refactor the interfaces to reflect this by requiring a `struct
odb_source` as input instead of a repository. This prepares for
subsequent commits where we will refactor writing of loose objects to
work on a `struct odb_source`, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: get rid of `the_repository` in `finalize_object_file()`
We implicitly depend on `the_repository` when moving an object file into
place in `finalize_object_file()`. Get rid of this global dependency by
passing in a repository.
Note that one might be pressed to inject an object database instead of a
repository. But the function doesn't really care about the ODB at all.
All it does is to move a file into place while checking whether there is
any collision. As such, the functionality it provides is independent of
the object database and only needs the repository as parameter so that
it can adjust permissions of the file we are about to finalize.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: get rid of `the_repository` in `loose_object_info()`
While `loose_object_info()` already accepts a repository as parameter we
still have one callsite in there where we use `the_repository` to figure
out the hash algorithm. Use the passed-in repository instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: get rid of `the_repository` when freshening objects
We implicitly depend on `the_repository` when freshening either loose or
packed objects. Refactor these functions to instead accept an object
database as input so that we can get rid of the global dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file: get rid of `the_repository` in `has_loose_object()`
We implicitly depend on `the_repository` in `has_loose_object()`.
Refactor the function to accept an `odb_source` as input that should be
checked for such a loose object.
This refactoring changes semantics of the function to not check the
whole object database for such a loose object anymore, but instead we
now only check that single source. Existing callers thus need to loop
through all sources manually now.
While this change may seem illogical at first, whether or not an object
exists in a specific format should be answered by the source using that
format. As such, we can eventually convert this into a generic function
`odb_source_has_object()` that simply checks whether a given object
exists in an object source. And as we will know about the format that
any given source uses it allows us to derive whether the object exists
in a given format.
This change also makes `has_loose_object_nonlocal()` obsolete. The only
caller of this function is adapted so that it skips the primary object
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a couple of users of the `the_hash_algo` macro, which
implicitly depends on `the_repository`. Adapt these callers to not do so
anymore, either by deriving it from already-available context or by
using `the_repository->hash_algo`. The latter variant doesn't yet help
to remove the global dependency, but such users will be adapted in the
following commits to not use `the_repository` anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 9 Jul 2025 23:29:52 +0000 (16:29 -0700)]
Merge branch 'ps/object-store' into ps/object-file-wo-the-repository
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
Junio C Hamano [Tue, 8 Jul 2025 22:49:19 +0000 (15:49 -0700)]
Merge branch 'kn/fetch-push-bulk-ref-update'
"git push" and "git fetch" are taught to update refs in batches to
gain performance.
* kn/fetch-push-bulk-ref-update:
receive-pack: handle reference deletions separately
refs/files: skip updates with errors in batched updates
receive-pack: use batched reference updates
send-pack: fix memory leak around duplicate refs
fetch: use batched reference updates
refs: add function to translate errors to strings
In a recent security release, 05e9cd64ee (config: quote values
containing CR character, 2025-05-19) added calls to `git config get`,
`git config set`, and `git config unset` which are not present on the
maint-2.43 branch.
These subcommands were added in the following commits, released in
git-2.46.0:
Taylor Blau [Tue, 8 Jul 2025 18:47:50 +0000 (14:47 -0400)]
Documentation/RelNotes: use .adoc extension for new security releases
When preparing the latest round of security fixes, we wrote release
notes in v2.43.7, and then successively merged those up through to the
various 'maint' branches.
However, the 2.49 release series is the first to have commit 1f010d6bdf
(doc: use .adoc extension for AsciiDoc files, 2025-01-20). This means
that we should have renamed the new-but-historical release notes from
*.txt to *.adoc during the merge into the 'maint-2.49' branch, but
neglected to do so.
Rename them accordingly to match the convention introduced by 1f010d6bdf. Since the release materials in question here were prepared
before v2.50.0 was tagged, the 'maint' track for that release series is
OK as is.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 7 Jul 2025 21:12:55 +0000 (14:12 -0700)]
Merge branch 'jk/submodule-remote-lookup-cleanup'
Updating submodules from the upstream did not work well when
submodule's HEAD is detached, which has been improved.
* jk/submodule-remote-lookup-cleanup:
submodule: look up remotes by URL first
submodule: move get_default_remote_submodule()
submodule--helper: improve logic for fallback remote name
remote: remove the_repository from some functions
dir: move starts_with_dot(_dot)_slash to dir.h
remote: fix tear down of struct remote
remote: remove branch->merge_name and fix branch_release()
Junio C Hamano [Wed, 2 Jul 2025 19:08:04 +0000 (12:08 -0700)]
Merge branch 'ag/imap-send-resurrection'
"git imap-send" has been broken for a long time, which has been
resurrected and then taught to talk OAuth2.0 etc.
* ag/imap-send-resurrection:
imap-send: fix minor mistakes in the logs
imap-send: display the destination mailbox when sending a message
imap-send: display port alongwith host when git credential is invoked
imap-send: add ability to list the available folders
imap-send: enable specifying the folder using the command line
imap-send: add PLAIN authentication method to OpenSSL
imap-send: add support for OAuth2.0 authentication
imap-send: gracefully fail if CRAM-MD5 authentication is requested without OpenSSL
imap-send: fix memory leak in case auth_cram_md5 fails
imap-send: fix bug causing cfg->folder being set to NULL
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to
match other functions related to the object database and our modern
coding guidelines. Furthermore though, the old name didn't really
describe very well what this function actually does, which is to walk
down any commit and tag objects until an object of the required type has
been found. This is generally referred to as "peeling", so the new name
should be way more descriptive.
No compatibility wrapper is introduced as the function is not used a lot
throughout our codebase.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename `oid_object_info()` to `odb_read_object_info()` as well as their
`_extended()` variant to match other functions related to the object
database and our modern coding guidelines.
Introduce compatibility wrappers so that any in-flight topics will
continue to compile.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: trivial refactorings to get rid of `the_repository`
All of the external functions provided by the object database subsystem
don't depend on `the_repository` anymore, but some internal functions
still do. Refactor those cases by plumbing through the repository that
owns the object database.
This change allows us to get rid of the `USE_THE_REPOSITORY_VARIABLE`
preprocessor define.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling submodule sources
The "--recursive" flag for git-grep(1) allows users to grep for a string
across submodule boundaries. To make this work we add each submodule's
object sources to our own object database so that the objects can be
accessed directly.
The infrastructure for this depends on a global string list of submodule
paths. The caller is expected to call `add_submodule_odb_by_path()` for
each source and the object database will then eventually register all
submodule sources via `do_oid_object_info_extended()` in case it isn't
able to look up a specific object.
This reliance on global state is of course suboptimal with regards to
our libification efforts.
Refactor the logic so that the list of submodule sources is instead
tracked in the object database itself. This allows us to lose the
condition of `r == the_repository` before registering submodule sources
as we only ever add submodule sources to `the_repository` anyway. As
such, behaviour before and after this refactoring should always be the
same.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling the primary source
The functions `set_temporary_primary_odb()` and `restore_primary_odb()`
are responsible for managing a temporary primary source for the
database. Both of these functions implicitly rely on `the_repository`.
Refactor them to instead take an explicit object database parameter as
argument and adjust callers. Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` in `for_each()` functions
There are a couple of iterator-style functions that execute a callback
for each instance of a given set, all of which currently depend on
`the_repository`. Refactor them to instead take an object database as
parameter so that we can get rid of this dependency.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
odb: get rid of `the_repository` when handling alternates
The functions to manage alternates all depend on `the_repository`.
Refactor them to accept an object database as a parameter and adjust all
callers. The functions are renamed accordingly.
Note that right now the situation is still somewhat weird because we end
up using the object store path provided by the object store's repository
anyway. Consequently, we could have instead passed in a pointer to the
repository instead of passing in the pointer to the object store. This
will be addressed in subsequent commits though, where we will start to
use the path owned by the object store itself.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of our dependency on `the_repository` in `find_odb()` by passing
in the object database in which we want to search for the source and
adjusting all callers.
Rename the function to `odb_find_source()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent commits we'll get rid of our use of `the_repository` in
"odb.c" in favor of explicitly passing in a `struct object_database` or
a `struct odb_source`. In some cases though we'll need access to the
repository, for example to read a config value from it, but we don't
have a way to access the repository owning a specific object database.
Introduce parent pointers for `struct object_database` to its owning
repository as well as for `struct odb_source` to its owning object
database, which will allow us to adapt those use cases.
Note that this change requires us to pass through the object database to
`link_alt_odb_entry()` so that we can set up the parent pointers for any
source there. The callchain is adapted to pass through the object
database accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-store: rename `object_directory` to `odb_source`
The `object_directory` structure is used as an access point for a single
object directory like ".git/objects". While the structure isn't yet
fully self-contained, the intent is for it to eventually contain all
information required to access objects in one specific location.
While the name "object directory" is a good fit for now, this will
change over time as we continue with the agenda to make pluggable object
databases a thing. Eventually, objects may not be accessed via any kind
of directory at all anymore, but they could instead be backed by any
kind of durable storage mechanism. While it seems quite far-fetched for
now, it is thinkable that eventually this might even be some form of a
database, for example.
As such, the current name of this structure will become worse over time
as we evolve into the direction of pluggable ODBs. Immediate next steps
will start to carve out proper self-contained object directories, which
requires us to pass in these object directories as parameters. Based on
our modern naming schema this means that those functions should then be
named after their subsystem, which means that we would start to bake the
current name into the codebase more and more.
Let's preempt this by renaming the structure. There have been a couple
alternatives that were discussed:
- `odb_backend` was discarded because it led to the association that
one object database has a single backend, but the model is that one
alternate has one backend. Furthermore, "backend" is more about the
actual backing implementation and less about the high-level concept.
- `odb_alternate` was discarded because it is a bit of a stretch to
also call the main object directory an "alternate".
Instead, pick `odb_source` as the new name. It makes it sufficiently
clear that there can be multiple sources and does not cause confusion
when mixed with the already-existing "alternate" terminology.
In the future, this change allows us to easily introduce for example a
`odb_files_source` and other format-specific implementations.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-store: rename `raw_object_store` to `object_database`
The `raw_object_store` structure is the central entry point for reading
and writing objects in a repository. The main purpose of this structure
is to manage object directories and provide an interface to access and
write objects in those object directories.
Right now, many of the functions associated with the raw object store
implicitly rely on `the_repository` to get access to its `objects`
pointer, which is the `raw_object_store`. As we want to generally get
rid of using `the_repository` across our codebase we will have to
convert this implicit dependency on this global variable into an
explicit parameter.
This conversion can be done by simply passing in an explicit pointer to
a repository and then using its `->objects` pointer. But there is a
second effort underway, which is to make the object subsystem more
selfcontained so that we can eventually have pluggable object backends.
As such, passing in a repository wouldn't make a ton of sense, and the
goal is to convert the object store interfaces such that we always pass
in a reference to the `raw_object_store` instead.
This will expose the `raw_object_store` type to a lot more callers
though, which surfaces that this type is named somewhat awkwardly. The
"raw_" prefix makes readers wonder whether there is a non-raw variant of
the object store, but there isn't. Furthermore, we nowadays want to name
functions in a way that they can be clearly attributed to a specific
subsystem, but calling them e.g. `raw_object_store_has_object()` is just
too unwieldy, even when dropping the "raw_" prefix.
Instead, rename the structure to `object_database`. This term is already
used a lot throughout our codebase, and it cannot easily be mistaken for
"object directories", either. Furthermore, its acronym ODB is already
well-known and works well as part of a function's name, like for example
`odb_has_object()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 1 Jul 2025 21:17:25 +0000 (14:17 -0700)]
send-pack: clean-up even when taking an early exit
Previous commit has plugged one leak in the normal code path, but
there is an early exit that leaves without releasing any resources
acquired in the function.
FreeBSD 13.4 is no longer supported, and 13.5 will be the last
release from that series, so jump instead to 14.3 which should
be supported for another 10 months and will be at that point
the oldest supported release with the interim release of 15.
While at it, move some variables to the environment and make
sure to skip a git grep test that assumes glibc regex.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 30 Jun 2025 21:30:31 +0000 (14:30 -0700)]
Merge branch 'jc/merge-compact-summary'
"git merge/pull" has been taught the "--compact-summary" option to
use the compact-summary format, intead of diffstat, when showing
the summary of the incoming changes.
* jc/merge-compact-summary:
merge/pull: extend merge.stat configuration variable to cover --compact-summary
merge/pull: add the "--compact-summary" option
Junio C Hamano [Mon, 30 Jun 2025 21:30:30 +0000 (14:30 -0700)]
Merge branch 'bc/stash-export-import'
An interchange format for stash entries is defined, and subcommand
of "git stash" to import/export has been added.
* bc/stash-export-import:
builtin/stash: provide a way to import stashes from a ref
builtin/stash: provide a way to export stashes to a ref
builtin/stash: factor out revision parsing into a function
object-name: make get_oid quietly return an error
Junio C Hamano [Mon, 30 Jun 2025 21:30:30 +0000 (14:30 -0700)]
Merge branch 'jc/cocci-avoid-regexp-constraint'
Avoid regexp_constraint and instead use comparison_constraint when
listing functions to exclude from application of coccinelle rules,
as spatch can be built with different regexp engine X-<.
daemon: correctly handle soft accept() errors in service_loop
Since df076bdbcc ([PATCH] GIT: Listen on IPv6 as well, if available.,
2005-07-23), the original error checking was included in an inner loop
unchanged, where its effect was different.
Instead of retrying, after a EINTR during accept() in the listening
socket, it will advance to the next one and try with that instead,
leaving the client waiting for another round.
Make sure to retry with the same listener socket that failed originally.
To avoid an unlikely busy loop, fallback to the old behaviour after a
couple of attempts.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Fri, 27 Jun 2025 22:09:04 +0000 (15:09 -0700)]
send-pack: clean up extra_have oid array
Commit c8009635785e ("fetch-pack, send-pack: clean up shallow oid
array", 2024-09-25) cleaned up the shallow oid array in cmd_send_pack,
but didn't clean up extra_have, which is still leaked at program exit.
I suspect the particular tests in t5539 don't trigger any additions to
the extra_have array, which explains why the tests can pass leak free
despite this gap.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
daemon: remove unnecesary restriction for listener fd
Since df076bdbcc ([PATCH] GIT: Listen on IPv6 as well, if available.,
2005-07-23), any file descriptor assigned to a listening socket was
validated to be within the range to be used in an FDSET later.
6573faff34 (NO_IPV6 support for git daemon, 2005-09-28), moves to
use poll() instead of select(), that doesn't have that restriction,
so remove the original check.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 25 Jun 2025 21:07:36 +0000 (14:07 -0700)]
Merge branch 'ps/maintenance-ref-lock'
"git maintenance" lacked the care "git gc" had to avoid holding
onto the repository lock for too long during packing refs, which
has been remedied.
* ps/maintenance-ref-lock:
builtin/maintenance: fix locking race when handling "gc" task
builtin/gc: avoid global state in `gc_before_repack()`
usage: allow dying without writing an error message
builtin/maintenance: fix locking race with refs and reflogs tasks
builtin/maintenance: split into foreground and background tasks
builtin/maintenance: fix typedef for function pointers
builtin/maintenance: extract function to run tasks
builtin/maintenance: stop modifying global array of tasks
builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
builtin/maintenance: centralize configuration of explicit tasks
builtin/gc: drop redundant local variable
builtin/gc: use designated field initializers for maintenance tasks
Junio C Hamano [Wed, 25 Jun 2025 21:07:35 +0000 (14:07 -0700)]
Merge branch 'jc/you-still-use-whatchanged'
"git whatchanged" that is longer to type than "git log --raw"
which is its modern rough equivalent has outlived its usefulness
more than 10 years ago. Plan to deprecate and remove it.
* jc/you-still-use-whatchanged:
whatschanged: list it in BreakingChanges document
whatchanged: remove when built with WITH_BREAKING_CHANGES
whatchanged: require --i-still-use-this
tests: prepare for a world without whatchanged
doc: prepare for a world without whatchanged
you-still-use-that??: help deprecating commands for removal
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19)
we updated the 'git-receive-pack(1)' command to use batched reference
updates. One edge case which was missed during this implementation was
when a user pushes multiple branches such as:
Before using batched updates, the references would be applied
sequentially and hence no conflicts would arise. With batched updates,
while the first update applies, the second fails due to D/F conflict. A
similar issue was present in 'git-fetch(1)' and was fixed by separating
out reference pruning into a separate transaction in the commit 'fetch:
use batched reference updates'. Apply a similar mechanism for
'git-receive-pack(1)' and separate out reference deletions into its own
batch.
This means 'git-receive-pack(1)' will now use up to two transactions,
whereas before using batched updates it would use _at least_ two
transactions. So using batched updates is still the better option.
Add a test to validate this behavior.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Karthik Nayak [Fri, 20 Jun 2025 07:15:44 +0000 (09:15 +0200)]
refs/files: skip updates with errors in batched updates
The commit 23fc8e4f61 (refs: implement batch reference update support,
2025-04-08) introduced support for batched reference updates. This
allows users to batch updates together, while allowing some of the
updates to fail.
Under the hood, batched updates use the reference transaction mechanism.
Each update which fails is marked as such. Any failed updates must be
skipped over in the rest of the code, as they wouldn't apply any more.
In two of the loops within 'files_transaction_finish()' of the files
backend, the failed updates aren't skipped over. This can cause a
SEGFAULT otherwise. Add the missing skips and a test to validate the
same.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 24 Jun 2025 16:48:51 +0000 (09:48 -0700)]
Merge branch 'sa/multi-mailmap-fix'
When asking to apply mailmap to both author and committer field
while showing a commit object, the field that appears later was not
correctly parsed and replaced, which has been corrected.
* sa/multi-mailmap-fix:
cat-file: fix mailmap application for different author and committer
Junio C Hamano [Tue, 24 Jun 2025 16:48:47 +0000 (09:48 -0700)]
Merge branch 'ag/send-email-edit-threading-fix'
"git send-email" incremented its internal message counter when a
message was edited, which made logic that treats the first message
specially misbehave, which has been corrected.
* ag/send-email-edit-threading-fix:
send-email: show the new message id assigned by outlook in the logs
send-email: fix bug resulting in broken threads if a message is edited
Jeff King [Mon, 23 Jun 2025 10:56:25 +0000 (06:56 -0400)]
test-lib: teach test_seq the -f option
The "seq" tool has a "-f" option to produce printf-style formatted
lines. Let's teach our test_seq helper the same trick. This lets us get
rid of some shell loops in test snippets (which are particularly verbose
in our test suite because we have to "|| return 1" to keep the &&-chain
going).
This converts a few call-sites I found by grepping around the test
suite. A few notes on these:
- In "seq", the format specifier is a "%g" float. Since test_seq only
supports integers, I've kept the more natural "%d" (which is what
these call sites were using already).
- Like "seq", test_seq automatically adds a newline to the specified
format. This is what all callers are doing already except for t0021,
but there we do not care about the exact format. We are just trying
to printf a large number of bytes to a file. It's not worth
complicating other callers or adding an option to avoid the newline
in that caller.
- Most conversions are just replacing a shell loop (which does get rid
of an extra fork, since $() requires a subshell). In t0612 we can
replace an awk invocation, which I think makes the end result more
readable, as there's less quoting.
- In t7422 we can replace one loop, but sadly we have to leave the
loop directly above it. This is because that earlier loop wants to
include the seq value twice in the output, which test_seq does not
support (nor does regular seq). If you run:
test_seq -f "foo-%d %d" 10
the second "%d" will always be the empty string. You might naively
think that test_seq could add some extra arguments, like:
# 3 ought to be enough for anyone...
printf "$fmt\n" "$i "$i" $i"
but that just triggers printf to format multiple lines, one per
extra set of arguments.
So we'd have to actually parse the format string, figure out how
many "%" placeholders are there, and then feed it that many
instances of the sequence number. The complexity isn't worth it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:35 +0000 (16:11 -0700)]
submodule: look up remotes by URL first
The get_default_remote_submodule() function performs a lookup to find
the appropriate remote to use within a submodule. The function first
checks to see if it can find the remote for the current branch. If this
fails, it then checks to see if there is exactly one remote. It will use
this, before finally falling back to "origin" as the default.
If a user happens to rename their default remote from origin, either
manually or by setting something like clone.defaultRemoteName, this
fallback will not work.
In such cases, the submodule logic will try to use a non-existent
remote. This usually manifests as a failure to trigger the submodule
update.
The parent project already knows and stores the submodule URL in either
.gitmodules or its .git/config.
Add a new repo_remote_from_url() helper which will iterate over all the
remotes in a repository and return the first remote which has a matching
URL.
Refactor get_default_remote_submodule to find the submodule and get its
URL. If a valid URL exists, first try to obtain a remote using the new
repo_remote_from_url(). Fall back to the repo_default_remote()
otherwise.
The fallback logic is kept in case for some reason the user has manually
changed the URL within the submodule. Additionally, we still try to use
a remote rather than directly passing the URL in the
fetch_in_submodule() logic. This ensures that an update will properly
update the remote refs within the submodule as expected, rather than
just fetching into FETCH_HEAD.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:33 +0000 (16:11 -0700)]
submodule--helper: improve logic for fallback remote name
The repo_get_default_remote() function in submodule--helper currently
tries to figure out the proper remote name to use for a submodule based
on a few factors.
First, it tries to find the remote for the currently checked out branch.
This works if the submodule is configured to checkout to a branch
instead of a detached HEAD state.
In the detached HEAD state, the code calls back to using "origin", on
the assumption that this is the default remote name. Some users may
change this, such as by setting clone.defaultRemoteName, or by changing
the remote name manually within the submodule repository.
As a first step to improving this situation, refactor to reuse the logic
from remotes_remote_for_branch(). This function uses the remote from the
branch if it has one. If it doesn't then it checks to see if there is
exactly one remote. It uses this remote first before attempting to fall
back to "origin".
To allow using this helper function, introduce a repo_default_remote()
helper to remote.c which takes a repository structure. This helper will
load the remote configuration and get the "HEAD" branch. Then it will
call remotes_remote_for_branch to find the default remote.
Replace calls of repo_get_default_remote() with the calls to this new
function. To maintain consistency with the existing callers, continue
copying the returned string with xstrdup.
This isn't a perfect solution for users who change remote names, but it
should help in cases where the remote name is changed but users haven't
added any additional remotes.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:32 +0000 (16:11 -0700)]
remote: remove the_repository from some functions
The remotes_remote_get_1 (and its caller, remotes_remote_get, have an
implicit dependency on the_repository due to calling
read_branches_file() and read_remotes_file(), both of which use
the_repository. The branch_get() function calls set_merge() which has an
implicit dependency on the_repository as well.
Because of this use of the_repository, the helper functions cannot be
used in code paths which operate on other repositories. A future
refactor of the submodule--helper will want to make use of some of these
functions.
Refactor to break the dependency by passing struct repository *repo
instead of struct remote_state *remote_state in a few places.
The public callers and many other helper functions still depend on
the_repository. A repo-aware function will be exposed in a following
change for git submodule--helper.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:31 +0000 (16:11 -0700)]
dir: move starts_with_dot(_dot)_slash to dir.h
Both submodule--helper.c and submodule-config.c have an implementation
of starts_with_dot_slash and starts_with_dot_dot_slash. The dir.h header
has starts_with_dot(_dot)_slash_native, which sets PATH_MATCH_NATIVE.
Move the helpers to dir.h as static inlines. I thought about renaming
them to postfix with _platform but that felt too long and ugly. On the
other hand it might be slightly confusing with _native.
This simplifies a submodule refactor which wants to use the helpers
earlier in the submodule--helper.c file.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:30 +0000 (16:11 -0700)]
remote: fix tear down of struct remote
The remote_clear() function failed to free the remote->push and
remote->fetch refspec fields.
This should be caught by the leak sanitizer. However, for callers which
use ``the_repository``, the values never go out of scope and the
sanitizer doesn't complain.
A future change is going to add a caller of read_config() for a
submodule repository structure, which would result in the leak sanitizer
complaining.
Fix remote_clear(), updating it to properly call refspec_clear() for
both the push and fetch members.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jacob Keller [Mon, 23 Jun 2025 23:11:29 +0000 (16:11 -0700)]
remote: remove branch->merge_name and fix branch_release()
The branch structure has both branch->merge_name and branch->merge for
tracking the merge information. The former is allocated by add_merge()
and stores the names read from the configuration file. The latter is
allocated by set_merge() which is called by branch_get() when an
external caller requests a branch.
This leads to the confusing situation where branch->merge_nr tracks both
the size of branch->merge (once its allocated) and branch->merge_name.
The branch_release() function incorrectly assumes that branch->merge is
always set when branch->merge_nr is non-zero, and can potentially crash
if read_config() is called without branch_get() being called on every
branch.
In addition, branch_release() fails to free some of the memory
associated with the structure including:
* Failure to free the refspec_item containers in branch->merge[i]
* Failure to free the strings in branch->merge_name[i]
* Failure to free the branch->merge_name parent array.
The set_merge() function sets branch->merge_nr to 0 when there is no
valid remote_name, to avoid external callers seeing a non-zero merge_nr
but a NULL merge array. This results in failure to release most of the
merge data as well.
These issues could be fixed directly, and indeed I initially proposed
such a change at [1] in the past. While this works, there was some
confusion during review because of the inconsistencies.
Instead, its time to clean up the situation properly. Remove
branch->merge_name entirely. Instead, allocate branch->merge earlier
within add_merge() instead of within set_merge(). Instead of having
set_merge() copy from merge_name[i] to merge[i]->src, just have
add_merge() directly initialize merge[i]->src.
Modify the add_merge() to call xstrdup() itself, instead of having
the caller of add_merge() do so. This makes it more obvious which code
owns the memory.
Update all callers which use branch->merge_name[i] to use
branch->merge[i]->src instead.
Add a merge_clear() function which properly releases all of the
merge-related memory, and which sets branch->merge_nr to zero. Use this
both in branch_release() and in set_merge(), fixing the leak when
set_merge() finds no valid remote_name.
Add a set_merge variable to the branch structure, which indicates
whether set_merge() has been called. This replaces the previous use of a
NULL check against the branch->merge array.
With these changes, the merge array is always allocated when merge_nr is
non-zero.
This use of refspec_item to store the names should be safe. External
callers should be using branch_get() to obtain a pointer to the branch,
which will call set_merge(), and the callers internal to remote.c
already handle the partially initialized refpsec_item structure safely.
This end result is cleaner, and avoids duplicating the merge names
twice.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Link: [1] https://lore.kernel.org/git/20250617-jk-submodule-helper-use-url-v2-1-04cbb003177d@gmail.com/ Signed-off-by: Junio C Hamano <gitster@pobox.com>
So we are passing "$i" as an argument to be filled in, but there is no
"%" placeholder in the format string, which is a bit confusing to read.
We could switch both instances of "$i" to "%d" (and pass $i twice). But
that makes the line even longer. Let's just keep interpolating the value
in the string, and drop the confusing extra "$i" argument.
And since we are not using any printf specifiers at all, it becomes
clear that we can swap it out for echo. We do use a "\n" in the middle
of the string, but breaking this into two separate echo statements
actually makes it easier to read.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 18 Jun 2025 17:55:02 +0000 (10:55 -0700)]
cocci: matching (multiple) identifiers
"make coccicheck" seems to work OK at GitHub CI using
$ spatch --version
spatch version 1.1.1 compiled with OCaml version 4.13.1
OCaml scripting support: yes
Python scripting support: yes
Syntax of regular expressions: PCRE
but not with
$ spatch --version
spatch version 1.3 compiled with OCaml version 5.3.0
OCaml scripting support: yes
Python scripting support: yes
Syntax of regular expressions: Str
Judging from https://ocaml.org/manual/5.3/api/Str.html, I suspect
that this probably is caused by the distinction between BRE vs PCRE.
As there is no reasonably clean way to write the multiple choice
matches portably between these two pattern languages, let's stop
using regexp_constraint and use compare_constraint instead when
listing the function names to exclude.
There are other uses of "!~" but they all want to match a single
simple token, that should work fine either with BRE or PCRE.
Jörg Thalheim [Fri, 20 Jun 2025 15:56:14 +0000 (17:56 +0200)]
imap-send: improve error messages with configuration hints
Replace basic error messages with more helpful ones that guide users
on how to resolve configuration issues. When imap.host or imap.folder
are missing, provide the exact git config commands needed to fix the
problem, along with examples of typical values.
Use the advise() API to display hints in a multi-line format with
proper "hint:" prefixes for each line.
Signed-off-by: Jörg Thalheim <joerg@thalheim.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jörg Thalheim [Fri, 20 Jun 2025 15:56:13 +0000 (17:56 +0200)]
imap-send: fix confusing 'store' terminology in error message
The error message 'no imap store specified' is misleading because
it refers to 'store' when the actual missing configuration is
'imap.folder'. Update the message to use the correct terminology
that matches the configuration variable name.
This reduces confusion for users who might otherwise look for
non-existent 'imap.store' configuration when they see this error.
Signed-off-by: Jörg Thalheim <joerg@thalheim.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 20 Jun 2025 15:29:34 +0000 (08:29 -0700)]
Merge branch 'ag/imap-send-resurrection' into jt/imap-send-message-fix
* ag/imap-send-resurrection:
imap-send: fix minor mistakes in the logs
imap-send: display the destination mailbox when sending a message
imap-send: display port alongwith host when git credential is invoked
imap-send: add ability to list the available folders
imap-send: enable specifying the folder using the command line
imap-send: add PLAIN authentication method to OpenSSL
imap-send: add support for OAuth2.0 authentication
imap-send: gracefully fail if CRAM-MD5 authentication is requested without OpenSSL
imap-send: fix memory leak in case auth_cram_md5 fails
imap-send: fix bug causing cfg->folder being set to NULL
Aditya Garg [Fri, 20 Jun 2025 06:40:33 +0000 (12:10 +0530)]
imap-send: fix minor mistakes in the logs
Some minor mistakes have been found in the logs. Most of them include
error messages starting with a capital letter, and ending with a period.
Abbreviations like "IMAP" and "OK" should also be in uppercase. Another
mistake was that the error message showing unknown authentication
mechanism used was displaying the host rather than the mechanism in the
logs. Fix them.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Fri, 20 Jun 2025 06:40:31 +0000 (12:10 +0530)]
imap-send: display port alongwith host when git credential is invoked
When requesting for passsword, git credential helper used to display
only the host name. For example:
Password for 'imaps://gargaditya08%40live.com@outlook.office365.com':
Now, it will display the port along with the host name:
Password for 'imaps://gargaditya08%40live.com@outlook.office365.com:993':
This has been done to make credential helpers more specific for ports.
Also, this behaviour will also mimic git send-email, which displays
the port along with the host name when requesting for a password.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Fri, 20 Jun 2025 06:40:30 +0000 (12:10 +0530)]
imap-send: add ability to list the available folders
Various IMAP servers have different ways to name common folders.
For example, the folder where all deleted messages are stored is often
named "[Gmail]/Trash" on Gmail servers, and "Deleted" on Outlook.
Similarly, the Drafts folder is simply named "Drafts" on Outlook, but
on Gmail it is named "[Gmail]/Drafts".
This commit adds a `--list` command to the `imap-send` tool that lists
the available folders on the IMAP server, allowing users to see
which folders are available and how they are named. A sample output
looks like this when run against a Gmail server:
Fetching the list of available folders...
* LIST (\HasNoChildren) "/" "INBOX"
* LIST (\HasChildren \Noselect) "/" "[Gmail]"
* LIST (\All \HasNoChildren) "/" "[Gmail]/All Mail"
* LIST (\Drafts \HasNoChildren) "/" "[Gmail]/Drafts"
* LIST (\HasNoChildren \Important) "/" "[Gmail]/Important"
* LIST (\HasNoChildren \Sent) "/" "[Gmail]/Sent Mail"
* LIST (\HasNoChildren \Junk) "/" "[Gmail]/Spam"
* LIST (\Flagged \HasNoChildren) "/" "[Gmail]/Starred"
* LIST (\HasNoChildren \Trash) "/" "[Gmail]/Trash"
For OpenSSL, this is achived by running the 'IMAP LIST' command and
parsing the response. This command is specified in RFC6154:
https://datatracker.ietf.org/doc/html/rfc6154#section-5.1
For libcurl, the example code published in the libcurl documentation
is used to implement this functionality:
https://curl.se/libcurl/c/imap-list.html
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Fri, 20 Jun 2025 06:40:28 +0000 (12:10 +0530)]
imap-send: add PLAIN authentication method to OpenSSL
The current implementation for PLAIN in imap-send works just fine
if using curl, but if attempted to use for OpenSSL, it is treated
as an invalid mechanism. The default implementation for OpenSSL is
IMAP LOGIN command rather than AUTH PLAIN. Since AUTH PLAIN is
still used today by many email providers in form of app passwords,
lets add an implementation that can use AUTH PLAIN if specified.
Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>